DSP Module 5 2018 Scheme
The Von Neumann architecture has one common memory unit for both
program and data.
The central processing unit (CPU) fetches an instruction from memory
and decodes it to figure out what operation to do, then executes the
instruction.
Hence, fetching and execution cycles will happen in serial fashion.
The instruction has two parts: the opcode and the operand.
The opcode tells the CPU what to do.
The operand tells the CPU what data to operate on.
After an instruction is completed, the cycles will resume for the next
instruction.
Because the processor proceeds in a serial fashion, most units stay in a
wait state, which slows down execution.
Harvard Architecture
The main components of a processor designed with Harvard
Architecture are
Arithmetic Logic Unit
Program Control Unit
Address Generator
Program Memory Unit
Data Memory
Program Memory Address Bus
Data Memory Address Bus
Multiplier Accumulator (MAC)
Shift Unit
The Harvard architecture is preferred for all digital signal processors due
to the requirements of most DSP algorithms, such as filtering,
convolution, and FFT, which need repetitive arithmetic operations,
including multiplications, additions, memory access, and heavy data flow
through the CPU.
MAC is a special hardware unit for enhancing the speed of digital filtering.
Fig. 5 shows a typical MAC unit used in DSP.
It has a pair of input registers, each holding a 16-bit input to the multiplier.
The result of the multiplication is accumulated in a 32-bit accumulator
unit.
The result register holds the double precision data from the accumulator.
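The multiply-accumulate behaviour described above can be sketched in Python. This is only an illustrative model, not the hardware of any particular DSP: the names `to_signed` and `mac` are hypothetical, and wrapping the 32-bit accumulator on overflow is an assumption (real devices may add guard bits or saturate instead).

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac(xs, hs):
    """Multiply pairs of 16-bit inputs and accumulate into a 32-bit accumulator."""
    acc = 0
    for x, h in zip(xs, hs):
        product = to_signed(x, 16) * to_signed(h, 16)  # 16 x 16 bits -> up to 32 bits
        acc = to_signed(acc + product, 32)             # assumed: wrap on overflow
    return acc


print(mac([1000, -2000, 3000], [3, 2, 1]))  # 3000 - 4000 + 3000 = 2000
```

Each loop iteration corresponds to one multiply-accumulate step, which is the operation an FIR filter repeats once per coefficient.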
The circular buffer acts like a first-in/first-out (FIFO) buffer, but every data
sample on the buffer need not be moved.
This will significantly enhance the processing speed.
Fig. 7 gives a simple illustration of 2-bit circular buffer.
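A minimal software sketch of the circular-buffer idea follows. The class name `CircularBuffer` and its methods are hypothetical; the point is only that a new sample overwrites the oldest one and a pointer advances with wraparound, so no other sample is moved.

```python
class CircularBuffer:
    """Fixed-size delay line: a new sample overwrites the oldest one in place."""

    def __init__(self, size):
        self.data = [0.0] * size
        self.head = 0  # index of the most recent sample

    def push(self, sample):
        # Advance the pointer with wraparound instead of shifting every element.
        self.head = (self.head + 1) % len(self.data)
        self.data[self.head] = sample

    def tap(self, delay):
        """Return x(n - delay), where delay = 0 is the newest sample."""
        return self.data[(self.head - delay) % len(self.data)]


buf = CircularBuffer(3)
for sample in [1.0, 2.0, 3.0, 4.0]:
    buf.push(sample)
print([buf.tap(d) for d in range(3)])  # newest first: [4.0, 3.0, 2.0]
```

After the fourth push the oldest sample (1.0) has been overwritten, while the other two stayed where they were in memory.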
$3_{10} = 011_2$
Since the decimal number is –ve, we have to find the 2's complement of
$011_2$
We get 100.
Discard the first two bits. We get 110 which is equal to −2.
The most +ve number is $1 - 2^{-2} = \frac{3}{4}$
Each interval is of size $2^{-2} = \frac{1}{4}$
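To make the range and step size concrete, the short sketch below lists every 3-bit pattern of a Q-2 number (1 sign bit, 2 magnitude bits) with its value. The helper name `q_value` is hypothetical, and two's complement encoding is assumed.

```python
from fractions import Fraction

FRACTION_BITS = 2  # Q-2: 1 sign bit + 2 magnitude bits


def q_value(code, fraction_bits=FRACTION_BITS):
    """Decode a two's complement Q-format bit pattern into an exact fraction."""
    total_bits = fraction_bits + 1
    if code & (1 << fraction_bits):  # sign bit set, so the value is negative
        code -= 1 << total_bits
    return Fraction(code, 1 << fraction_bits)


for code in range(1 << (FRACTION_BITS + 1)):
    print(format(code, "03b"), q_value(code))
```

The largest value printed is 3/4 (pattern 011), the smallest is -1 (pattern 100), and neighbouring values differ by 1/4.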
Ex. 5 : Perform $\frac{2}{4} \times \left(-\frac{3}{4}\right)$ using Q-2 number system.
We know that $\frac{2}{4} \times \left(-\frac{3}{4}\right) = -\frac{3}{8}$
Now, let's see how we can perform this operation in Q-2 number
system.
Let us convert 1.1010 into decimal and check if we get the right
product.
But if we discard the last two bits and retain only 3 bits, we get 1.10
The actual answer was $-\frac{3}{8}$
But we are getting $-\frac{1}{2}$
Error $= \left| -\frac{1}{2} - \left(-\frac{3}{8}\right) \right| = \left| -\frac{1}{8} \right| = \frac{1}{8}$, which is less than $\frac{1}{4}$
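The Q-2 multiplication and its truncation error can be checked with integer arithmetic. This is a sketch under stated assumptions: the helpers `to_q` and `bits` are hypothetical, and the two low bits are dropped with an arithmetic right shift.

```python
from fractions import Fraction


def to_q(value, fraction_bits):
    """Scale an exact fraction to a signed integer with `fraction_bits` fractional bits."""
    scaled = value * (1 << fraction_bits)
    assert scaled.denominator == 1, "value is not exactly representable"
    return int(scaled)


def bits(code, total_bits):
    """Two's complement bit string with the binary point after the sign bit."""
    s = format(code & ((1 << total_bits) - 1), f"0{total_bits}b")
    return s[0] + "." + s[1:]


a = to_q(Fraction(2, 4), 2)   # 0.10 in Q-2
b = to_q(Fraction(-3, 4), 2)  # 1.01 in Q-2
full = a * b                  # the product carries 4 fractional bits
truncated = full >> 2         # drop the last two bits (arithmetic shift)

exact = Fraction(full, 1 << 4)
approx = Fraction(truncated, 1 << 2)
print(bits(full, 5), exact)        # 1.1010 -3/8
print(bits(truncated, 3), approx)  # 1.10 -1/2
print(abs(approx - exact))         # 1/8
```

The error of 1/8 stays below the Q-2 step size of 1/4, as the worked example states.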
Q-15 Format
In Q-15 format, there are 15 magnitude bits and 1 sign bit.
Hence,
$0.560123_{10} \approx 0.100011110110010_2$
2's complement of
$0.100011110110010_2 = 1.011100001001110_2$
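The conversion above can be reproduced with a few lines of Python. The function name `to_q15` is hypothetical, and truncating the magnitude (rather than rounding) before taking the 2's complement is an assumption chosen to match the bit patterns shown.

```python
def to_q15(value):
    """Convert a value in (-1, 1) to a 16-bit two's complement Q-15 code."""
    magnitude = int(abs(value) * (1 << 15))  # assumed: truncate, do not round
    code = magnitude if value >= 0 else -magnitude
    return code & 0xFFFF  # wrap into 16 bits (two's complement)


pos = to_q15(0.560123)
neg = to_q15(-0.560123)
print(format(pos, "016b"))  # 0100011110110010
print(format(neg, "016b"))  # 1011100001001110
```

The first bit of each printed pattern is the sign bit; the remaining 15 bits are the magnitude bits to the right of the binary point.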
Dropping the least significant bits and retaining only 4 bits, we get 0.001
Note : If we convert 0.001 into decimal value, we get $2^{-3}$, which is equal
to 0.125, the product of 0.25 and 0.5.
Floating Point Format
If we assign 12 bits for the mantissa and 4 bits for the exponent, the
format looks as follows.
The most –ve number we can represent in this format is
$1.00000000000_2 \times 2^{0111_2} = -1 \times 2^{7} = -128.0$
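A small decoder makes the 4-bit exponent and 12-bit mantissa layout easier to check. The function name `decode` is hypothetical, and it assumes both fields are in two's complement, with the mantissa holding 1 sign bit and 11 fractional bits.

```python
def decode(exponent_bits, mantissa_bits):
    """Decode a two's complement exponent and a two's complement fractional mantissa."""
    exponent = int(exponent_bits, 2)
    if exponent_bits[0] == "1":
        exponent -= 1 << len(exponent_bits)
    mantissa = int(mantissa_bits, 2)
    if mantissa_bits[0] == "1":
        mantissa -= 1 << len(mantissa_bits)
    # The mantissa has 1 sign bit, so 11 of its 12 bits are fractional.
    return mantissa / (1 << (len(mantissa_bits) - 1)) * 2.0 ** exponent


print(decode("0111", "100000000000"))  # -128.0
```

The largest exponent (0111, i.e. 7) combined with the most negative mantissa (-1) gives the most negative representable value.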
Note that this choice of scaling is not unique. We can also scale the number
by $2^{0}$ or $2^{-1}$
We first scale the number $-20.430527$ to $\frac{-20.430527}{2^{5}} = -0.638454$
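The scaling step can be sketched as an encoder for this single operand. The function name `encode` is hypothetical; truncating the magnitude to 11 fractional bits before taking the 2's complement is an assumption, and the exponent is passed in by hand rather than chosen automatically.

```python
def encode(value, exponent):
    """Encode `value` as a 4-bit exponent and a 12-bit two's complement mantissa."""
    scaled = value / 2.0 ** exponent          # bring the magnitude below 1
    assert -1.0 <= scaled < 1.0, "exponent too small for this value"
    magnitude = int(abs(scaled) * (1 << 11))  # assumed: truncate to 11 fractional bits
    mantissa = magnitude if scaled >= 0 else -magnitude
    return format(exponent & 0xF, "04b"), format(mantissa & 0xFFF, "012b"), scaled


exp_bits, man_bits, scaled = encode(-20.430527, 5)
print(round(scaled, 6))    # -0.638454
print(exp_bits, man_bits)  # 0101 101011100101
```

The printed pattern represents only the operand $-20.430527$ itself, before any addition or multiplication is carried out.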
We know that $M_1 2^{E} + M_2 2^{E} = (M_1 + M_2)\,2^{E}$
By cascading the exponent and mantissa parts, we get the floating point
number as 0101 101011101111
Ex. 15 : Multiply the following using floating point format.
Therefore, let's multiply the mantissa parts and add the exponent parts.
Now, we multiply two positive mantissas and truncate the result to 12 bits to
get
010100011111 ∗ 010100011011 = 001101000100
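The truncated mantissa product can be verified with integer arithmetic. The function name `multiply_mantissas` is hypothetical, and the sketch only handles positive mantissas with 1 sign bit and 11 fractional bits, as in the step above.

```python
def multiply_mantissas(a_bits, b_bits):
    """Multiply two positive 12-bit mantissas and truncate the result to 12 bits."""
    fraction_bits = len(a_bits) - 1            # 11 fractional bits per operand
    product = int(a_bits, 2) * int(b_bits, 2)  # the product has 22 fractional bits
    truncated = product >> fraction_bits       # keep the top 11 fractional bits
    return format(truncated, f"0{len(a_bits)}b")


print(multiply_mantissas("010100011111", "010100011011"))  # 001101000100
```

In decimal terms this is roughly 0.6401 × 0.6382 ≈ 0.4082 after truncation.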
One is the IEEE single precision format, and the other is the IEEE
double precision format.
IEEE Single Precision Format (IEEE 754)
Four data buses and four address buses are accommodated to work with
the data memories and program memories in parallel to speed up the
operation.
The program memory address bus and program memory data bus are
responsible for fetching the program instruction.
Basic Architecture of TMS320C54x Family
The C and D data memory address buses and the C and D data memory data
buses deal with fetching data from the data memory.
The E data memory address bus and the E data memory data bus are
dedicated to moving data into data memory.
The program control unit fetches instructions via the program memory
data bus.
Floating-point arithmetic offers advantages such as largely avoiding
overflow, round-off errors, truncation errors and coefficient
quantization errors.
These processors are based on Harvard Architecture, i.e., separate
memory units are provided for program and data.
There also exist memory buses and data buses for direct memory
access (DMA) for simultaneous I/O and CPU operations.
Features of TMS320C3x Floating-Point Digital Signal Processors
The ALU is capable of performing both integer and floating-point arithmetic.
The CPU register file offers 28 registers, which can be operated on by
the multiplier and ALU.
Eight auxiliary registers can be used for addressing and for integer
arithmetic.
These registers provide internal temporary storage for variables,
instead of external memory storage, so that arithmetic can be performed
between registers.
Three floating-point formats are supported.
A short 16-bit floating-point format has 4 exponent bits, 1 sign bit, and
11 mantissa bits.
A 32-bit single precision format has 8 exponent bits, 1 sign bit, and 23
fraction bits.
A 40-bit extended precision format contains 8 exponent bits, 1 sign bit,
and 31 fraction bits.
Finite Impulse Response (FIR) and Infinite
$S = I_{max} \cdot \sum_{k=0}^{\infty} |h(k)| = I_{max} \cdot \left( |h(0)| + |h(1)| + \cdots \right)$
where $h(k)$ is the impulse response of the filter.
Scaling down the coefficients will make them less than 1, and later the
filtered output will be scaled up by the same amount before it is sent to
the digital-to-analog converter (DAC).
Ex. 18 : Given the FIR filter
$y(n) = 0.9x(n) + 3x(n-1) - 0.9x(n-2)$
with a passband gain of 4. Assuming that the input range occupies only $\frac{1}{4}$ of
full range, develop the DSP implementation equation in Q-15 fixed point
system.
The scaling factor $S = \frac{1}{4} \times \left( |0.9| + |3| + |-0.9| \right) = 1.2$
We select $S = 2$ (a power of 2)
Then the new input and new impulse response respectively will be
$x_s(n) = \frac{x(n)}{2}$ and $h_s(n) = \frac{h(n)}{4} = \{0.225,\ 0.75,\ -0.225\}$
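The scaling procedure of Ex. 18 can be sketched in floating-point Python to confirm that the scaled filter reproduces the original output once it is scaled back up. The variable names (`i_max`, `s_pow2`, `b`), the test input `x`, and the rule for picking the coefficient scale are assumptions for illustration; Q-15 quantization itself is not modelled here.

```python
import math

h = [0.9, 3.0, -0.9]  # y(n) = 0.9x(n) + 3x(n-1) - 0.9x(n-2)
i_max = 0.25          # the input occupies 1/4 of the full range

s = i_max * sum(abs(c) for c in h)     # about 1.2
s_pow2 = 2 ** math.ceil(math.log2(s))  # next power of 2, here 2
# Assumed rule: smallest power of 2 strictly greater than the largest |h(k)|, here 4.
b = 2 ** (math.floor(math.log2(max(abs(c) for c in h))) + 1)
h_s = [c / b for c in h]               # 0.225, 0.75, -0.225


def fir(coeffs, x):
    """Direct-form FIR filter with zero initial conditions."""
    return [sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(x))]


x = [0.25, -0.1, 0.2, 0.0, -0.25]  # hypothetical input within 1/4 of full range
x_s = [v / s_pow2 for v in x]
# Scale the filtered output back up by the total factor S * B = 8.
y_scaled = [s_pow2 * b * v for v in fir(h_s, x_s)]
y_direct = fir(h, x)
print(s_pow2, b, h_s)
print(all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(y_scaled, y_direct)))
```

With these choices every scaled coefficient has magnitude below 1, so each could be stored as a Q-15 value.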
Transfer function $H(z) = \frac{Y(z)}{X(z)} = \frac{2}{1 - 0.5z^{-1}} = \frac{2z}{z - 0.5}$
To find the impulse response, we have to find the inverse Z-transform of
$H(z)$
We know that the inverse Z-transform of $\frac{z}{z-a}$ is $a^n u(n)$
$= 0.25 \times 2 \times \frac{1}{1 - 0.5} = 1$
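The closed-form result can be cross-checked numerically by generating the impulse response of $H(z) = \frac{2}{1 - 0.5z^{-1}}$ from its difference equation $y(n) = 2x(n) + 0.5y(n-1)$ and summing the magnitudes. Reading the leading 0.25 as the maximum input amplitude (`i_max`) is an assumption, and the function name `impulse_response` is hypothetical.

```python
def impulse_response(num_samples):
    """Impulse response of y(n) = 2x(n) + 0.5y(n-1) with zero initial conditions."""
    h, y_prev = [], 0.0
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0  # unit impulse
        y = 2.0 * x + 0.5 * y_prev
        h.append(y)
        y_prev = y
    return h


h = impulse_response(60)
i_max = 0.25  # assumed meaning of the leading 0.25 factor
s = i_max * sum(abs(v) for v in h)
print(h[:4])  # [2.0, 1.0, 0.5, 0.25]
print(s)      # approaches 0.25 * 2 / (1 - 0.5) = 1
```

The first samples follow $2 (0.5)^n$, in agreement with the inverse Z-transform pair $\frac{z}{z-a} \leftrightarrow a^n u(n)$.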