UM0304
User manual
STR91x DSP library (DSPLIB)
Introduction
This manual presents a library of ARM assembly source code modules for digital signal
processing (DSP): an infinite impulse response (IIR) filter, a finite impulse response (FIR)
filter and a fast Fourier transform (FFT). These modules apply to a range of DSP
applications, including VSELP vocoders. They are written for ARM mode and have been
tested on an ARM9E-based STR91x platform.
The modules have also been tested in the IAR Embedded Workbench environment. However,
STMicroelectronics cannot guarantee that they will be free of defects in all applications.
The algorithm modules are presented "as is", with no warranty.
June 2008
Rev 3
www.st.com
Contents
1 Definitions and related documents
    1.1 Acronyms and terminology
    1.2 ARM and Thumb
    1.3 References
2 IIR ARMA 16-bit filter
    2.1 Description
    2.2 Function
    2.3 Arguments and variables
        2.3.1 Calling the function from C
        2.3.2 Calling the function from assembly
    2.4 Algorithm
    2.5 Assembly code
    2.6 Requirements
    2.7 Implementation
    2.8 Benchmarking
3 Block FIR 16-bit filter
    3.1 Description
    3.2 Function
    3.3 Arguments and variables
        3.3.1 Calling the function from C
        3.3.2 Calling the function from assembly
    3.4 Algorithm
    3.5 Assembly code
    3.6 Requirements
    3.7 Implementation
    3.8 Benchmarking
4 Complex 16-bit radix-4 FFT
    4.1 Description
    4.2 Algorithm
    4.3 Arguments and variables
    4.4 Function
        4.4.1 Calling the FFT function from C
        4.4.2 Calling the FFT function from assembly
    4.5 The FFT function characteristics
    4.6 Performance benchmarking
    4.7 Fixed-point error benchmarking
5 Revision history
1 Definitions and related documents
1.1 Acronyms and terminology
Table 1. Definition of acronyms and terms
Term      Definition
ARM       ARM Core
ARMA      Auto Regressive Moving Average
DSP       Digital Signal Processing
DSPLIB    Digital Signal Processing Library
FIR       Finite Impulse Response
FFT       Fast Fourier Transform
IIR       Infinite Impulse Response
LTI       Linear Time-Invariant
MCU       Microcontroller Unit
STR91x    STR91x family of MCUs from STMicroelectronics
VSELP     Vector-Sum Excited Linear Prediction
1.2 ARM and Thumb
The Thumb instruction set consists of 16-bit instructions that act as a compact subset of
the 32-bit ARM instructions. Every Thumb instruction has an equivalent 32-bit ARM
instruction. However, not all ARM instructions are available in the Thumb subset. For
example, there is no way to access the status or coprocessor registers in Thumb state.
Also, some operations that take a single ARM instruction require a sequence of Thumb
instructions.
A Thumb-compatible processor can operate in either ARM or Thumb state, so a method is
needed to switch the processor from executing instructions in one state to executing in
the other. This is provided by the Branch and Exchange (BX) instruction, versions of which
exist in both the ARM and Thumb instruction sets. Both versions perform a branch by copying
the contents of general register Rn into the program counter, causing a pipeline flush and
a refill from the address specified in Rn. Thus, BX is absolute rather than PC-relative.
In ARM state the format is:
BX{<cond>} Rn
In Thumb state the format is:
BX Rn
All ARM instructions are word-aligned, and all Thumb instructions are half-word aligned.
Therefore, the least significant bit of a branch target address is always zero, which
leaves bit 0 of Rn free to carry other information. The processor uses this bit to
determine whether the instruction jumped to should be executed in Thumb or ARM state:
● If bit 0 is set, execute in Thumb state
● If bit 0 is clear, execute in ARM state
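The state selection performed by BX can be modelled in a few lines of C. This is a minimal sketch for illustration only and is not part of DSPLIB: the names cpu_state_t, bx_result_t and bx_decode are hypothetical, and the model assumes that Rn holds a correctly aligned target address.

```c
#include <stdint.h>

/* Hypothetical types for illustration; not part of DSPLIB. */
typedef enum { STATE_ARM = 0, STATE_THUMB = 1 } cpu_state_t;

typedef struct {
    cpu_state_t state; /* state selected by bit 0 of Rn */
    uint32_t    pc;    /* address the pipeline refills from */
} bx_result_t;

/* Simplified model of "BX Rn": bit 0 of Rn selects the state,
 * and the remaining bits give the absolute branch target.
 * Assumption: when bit 0 is clear, Rn is word-aligned. */
static bx_result_t bx_decode(uint32_t rn)
{
    bx_result_t r;
    r.state = (rn & 1u) ? STATE_THUMB : STATE_ARM;
    r.pc    = rn & ~(uint32_t)1; /* bit 0 is not part of the fetch address */
    return r;
}
```

In this model, bx_decode(0x00008101) reports Thumb state with a fetch address of 0x00008100, while bx_decode(0x00008000) reports ARM state at 0x00008000.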
Thumb was defined for two main reasons:
1. Better code density, as the instructions are half the size of ARM instructions (although
   some ARM instructions require two Thumb instructions for the same effect). To find out
   which gives the best result, compile the application for both ARM and Thumb and
   compare.
2. Better performance from narrow memory, as fetching an instruction from 8-bit or 16-bit
   memory takes fewer bus accesses in Thumb state.
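The narrow-memory argument can be illustrated with some simple arithmetic. The sketch below is a deliberately simplified model, not a measurement: it assumes no cache, prefetch buffer or burst access, and the helper name fetch_accesses is hypothetical.

```c
/* Hypothetical helper: number of bus accesses needed to fetch one
 * instruction of instr_bits over a memory bus of bus_bits.
 * Simplified model: no cache, no prefetch, no burst transfers. */
static unsigned fetch_accesses(unsigned instr_bits, unsigned bus_bits)
{
    if (bus_bits == 0u) {
        return 0u;                      /* invalid bus width */
    }
    if (bus_bits >= instr_bits) {
        return 1u;                      /* whole instruction in one access */
    }
    return (instr_bits + bus_bits - 1u) / bus_bits; /* ceiling division */
}
```

Under these assumptions, a 32-bit ARM instruction needs two accesses on a 16-bit bus and four on an 8-bit bus, whereas a 16-bit Thumb instruction needs one and two respectively.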
1.3 References
1. Sanjit K. Mitra, "Digital Signal Processing - A Computer Based Approach", McGraw
   Hill, Third Edition 2006.
2. R. Deka and J. G. Gardiner, "On the Fundamentals of Digital Signal Processing
   Micros," Journal of Microcomputer Applications, Vol 17 No 1, pp 101-135, U K, January
   1994.
3. E. Oran Brigham, "The Fast Fourier Transform and its Applications", ISBN
   0-13-307547-8, Prentice-Hall International Editions, 1988.
4. C. S. Burrus, "Unscrambling for fast DFT algorithms", IEEE Transactions on Acoustics,
   Speech, and Signal Processing, ASSP-36(7), 1086-1089, July 1988.
5. C. S. Burrus and T. W. Parks, "DFT/FFT and Convolution Algorithms - Theory and
   Implementation", J. Wiley, 1985.