DSP

The document discusses digital signal processors and architectures. It covers several topics including digital signal processing systems, discrete Fourier transforms, number formats for DSP like fixed point and floating point, sources of errors in DSP implementations, and digital signal processing architectures like TMS320C54xx devices. The presentation is intended for an undergraduate course on digital signal processors and covers key concepts and learning outcomes.

Uploaded by

Mallikarjun Aralimarad


Presentation on

DIGITAL SIGNAL PROCESSORS AND ARCHITECTURE

(ECE)
B.TECH VIII -Semester (AUTONOMOUS-R16)

Prepared by,
Ms. C. Devisupraja
Assistant Professor
COURSE OUTCOMES

CO 1 Understand the basics of digital signal processing and transforms.
CO 2 Distinguish between the architectural features of general-purpose
processors and DSP processors.
CO 3 Understand the architectures of TMS320C54xx devices.
CO 4 Discuss various memory and parallel I/O interfaces.
CO 5 Discuss various memory and parallel I/O interfaces.

2
UNIT-1
Introduction: Digital signal-processing system, discrete Fourier Transform (DFT)
and fast Fourier transform (FFT), differences between DSP and other
microprocessor architectures; Number formats: Fixed point, floating point and block
floating point formats, IEEE-754 floating point, dynamic range and precision,
relation between data word size and instruction word size; Sources of error in
DSP implementations: A/D conversion errors, DSP computational errors, D/A
conversion errors, Q-notation.

3
UNIT-1 CLO’S

CLO 1 (AEC507.01) Understand how digital-to-analog (D/A) and analog-to-digital
(A/D) converters operate on a signal and be able to model these operations
mathematically.
CLO 2 Understand the inter-relationship between the DFT and
various transforms.
CLO 3 Understand the IEEE-754 floating-point format and the sources of
errors in DSP implementations.
CLO 4 Understand the fast computation of the DFT and appreciate
FFT processing.

4
DIGITAL SIGNAL PROCESSING SYSTEM
• A DSP system uses a computer or a digital processor to process signals.
• Represent signals by a sequence of numbers
  – Sampling or analog-to-digital conversion
• Perform processing on these numbers with a digital processor
  – Digital signal processing
• Reconstruct the analog signal from the processed numbers

5
DIGITAL SIGNAL PROCESSING SYSTEM

• Analog input – analog output
  – Digital recording of music
• Analog input – digital output
  – Touch-tone phone dialing
• Digital input – analog output
  – Text to speech
• Digital input – digital output
  – Compression of a file on a computer

6
ANALOG, DIGITAL, MIXED SIGNAL PROCESSING

7
DIGITAL SIGNAL PROCESSING

8
9
SAMPLE AND HOLD CIRCUIT
• The main function of the low-pass anti-aliasing filter is to band-limit the input
signal to the folding frequency without distortion.

• Even if the signal is band-limited, there is always wide-band additive noise,
which will be folded back to create aliasing.

• When an analog voltage is connected directly to an ADC, the conversion
process can be adversely affected if the voltage changes during the
conversion time.

• The quality of the conversion process can be improved by using a sample-and-hold
circuit.
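
The folding described above can be sketched numerically. The function name `alias_frequency` and the 8 kHz sampling rate are illustrative assumptions, not part of the original slides; the sketch only shows where a tone above the folding frequency fs/2 appears after sampling.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Return the apparent frequency in [0, fs/2] of a tone at f_hz sampled at fs_hz."""
    f_mod = f_hz % fs_hz                 # sampling cannot distinguish f from f + k*fs
    return min(f_mod, fs_hz - f_mod)     # reflect about the folding frequency fs/2

fs = 8000.0                              # assumed sampling rate, folding frequency 4 kHz
for f in (1000.0, 5000.0, 7000.0, 9000.0):
    print(f, "->", alias_frequency(f, fs))
```

Any component above 4 kHz lands on top of a lower frequency, which is why the filter must remove it before the ADC.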

10
SAMPLE AND HOLD CIRCUIT

11
12
DIGITAL CONVERSION

13
RECONSTRUCTION

14
DISCRETE FOURIER TRANSFORM

• The DFT is used for analyzing discrete-time, finite-duration signals
in the frequency domain.

18
DISCRETE FOURIER TRANSFORM

• The Discrete Fourier Transform (DFT) is one of the most important tools in
digital signal processing; it calculates the spectrum of a finite-duration signal.
• Representing a digital signal in terms of its frequency components in the
frequency domain is important. The transform that maps a time-domain
signal to its frequency-domain components is known as the DFT.
• Number of complex additions to compute an N-point DFT directly: $N(N-1)$
• Number of complex multiplications to compute an N-point DFT directly: $N^2$
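
The operation counts above can be checked with a direct implementation. This is a minimal sketch, assuming the standard definition X[k] = Σ x[m]·exp(−j2πkm/N); the function name `dft_direct` and the counters are illustrative additions.

```python
import cmath

def dft_direct(x):
    """Direct N-point DFT; returns (X, complex_mults, complex_adds)."""
    n = len(x)
    mults = adds = 0
    out = []
    for k in range(n):
        acc = x[0] * cmath.exp(0j)       # first term of the sum (m = 0)
        mults += 1
        for m in range(1, n):
            acc += x[m] * cmath.exp(-2j * cmath.pi * k * m / n)
            mults += 1                   # one complex multiplication per term
            adds += 1                    # one complex addition per extra term
        out.append(acc)
    return out, mults, adds

X, mults, adds = dft_direct([1, 0, 0, 0, 0, 0, 0, 0])   # unit impulse, N = 8
print(mults, adds)                       # N^2 and N(N-1) for N = 8
```

For N = 8 the counters give 64 multiplications and 56 additions, and the spectrum of the unit impulse is flat.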

19
FAST FOURIER TRANSFORM

• The Fast Fourier Transform (FFT) is an efficient algorithm for computing the DFT.
It produces the same result as the direct DFT (up to rounding error) but needs
far fewer operations, which often reduces the computation time
significantly.
• It is not a new transform, only a computational algorithm for fast and efficient
computation of the DFT. The various fast DFT computation techniques are known
collectively as the fast Fourier transform, or FFT.
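
One member of the FFT family is the radix-2 decimation-in-time algorithm, sketched below for power-of-two lengths. The slides do not name a specific FFT variant, so the choice of radix-2 and the function names are assumptions. A radix-2 FFT needs on the order of (N/2)·log2 N complex multiplications instead of N².

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])           # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])            # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor times odd part
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def dft_reference(x):
    """Direct DFT used only to check the FFT result."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
err = max(abs(a - b) for a, b in zip(fft_radix2(x), dft_reference(x)))
print(err)                               # difference is at rounding-error level
```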

20
DFT vs FFT

21
NUMBER FORMATS

• In DSP, signals are represented as discrete sets of numbers from the
input stage, through the intermediate processing stages, to the
output.
• Even DSP structures such as filters require numbers to specify
coefficients for operation.
• Two typical formats for these numbers:
  – fixed-point format
  – floating-point format

22
FIXED-POINT FORMAT

• Simplest scheme
• The number is represented as an integer or fraction using a fixed number of
bits.
• An n-bit fixed-point signed integer $-2^{n-1} \le x \le 2^{n-1} - 1$ is represented as:
$x = -s \cdot 2^{n-1} + b_{n-2} \cdot 2^{n-2} + b_{n-3} \cdot 2^{n-3} + \cdots + b_1 \cdot 2^{1} + b_0 \cdot 2^{0}$

where s represents the sign of the number (s = 0 for positive and s = 1 for
negative)

23
FIXED-POINT INTEGER FORMAT
• The simplest scheme of number representation is the format in which
the number is represented as an integer or fraction using a fixed number of
bits.

• An n-bit fixed-point signed integer specifies the value x given as

$x = -s \cdot 2^{n-1} + b_{n-2} \cdot 2^{n-2} + b_{n-3} \cdot 2^{n-3} + \cdots + b_1 \cdot 2^{1} + b_0 \cdot 2^{0}$   (Eq-1)

where s represents the sign of the number: s = 0 for positive
numbers and s = 1 for negative numbers.
• The range of signed integer values that can be represented with
this format is $-2^{n-1}$ to $+(2^{n-1} - 1)$.

Bit layout: s | b_{n-2} | … | b_2 | b_1 | b_0, with the implied binary point after b_0.

24

FIXED-POINT FRACTIONAL FORMAT

• Similarly, a fraction can be represented using a fixed number of bits
with an implied binary point after the most significant bit. An n-bit
fixed-point signed fraction is represented as

$x = -s \cdot 2^{0} + b_{-1} \cdot 2^{-1} + b_{-2} \cdot 2^{-2} + \cdots + b_{-(n-2)} \cdot 2^{-(n-2)} + b_{-(n-1)} \cdot 2^{-(n-1)}$   (Eq-2)

Bit layout: s | b_{-1} | … | b_{-(n-3)} | b_{-(n-2)} | b_{-(n-1)}, with the implied binary point after s.

• The range of signed fractional values that can be represented with
this format is $-1$ to $+(1 - 2^{-(n-1)})$.

25
EXAMPLE 1

What is the range of numbers that can be represented in a fixed-point format
using 16 bits if the numbers are treated as
a) Signed integers
b) Signed fractions
Sol: a) Using 16 bits, the range of integers that can be represented is
determined by substituting n = 16 in Eq-1 and is given as $-2^{15}$ to $+(2^{15} - 1)$, i.e.
−32,768 to +32,767.
b) The range of fractions that can be represented is determined by substituting
n = 16 in Eq-2 and is given as $-1$ to $+(1 - 2^{-15})$, i.e. −1 to +0.999969482.
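
The two ranges in this example can be reproduced by decoding the extreme 16-bit patterns. This is a minimal sketch; the function name `fixed_point_value` is an illustrative assumption, and the fractional case is computed as the integer value scaled by 2^-(n-1).

```python
def fixed_point_value(word: int, n: int, fractional: bool):
    """Interpret an n-bit two's-complement word as a signed integer or signed fraction."""
    s = (word >> (n - 1)) & 1                 # sign bit
    magnitude = word & ((1 << (n - 1)) - 1)   # bits b_{n-2} ... b_0
    x = -s * (1 << (n - 1)) + magnitude       # signed integer value (Eq-1)
    return x / (1 << (n - 1)) if fractional else x   # fraction = integer * 2^-(n-1)

n = 16
print(fixed_point_value(0x8000, n, False), fixed_point_value(0x7FFF, n, False))
print(fixed_point_value(0x8000, n, True), fixed_point_value(0x7FFF, n, True))
```

The smallest pattern 0x8000 decodes to −32768 or −1.0, and the largest pattern 0x7FFF decodes to 32767 or 0.999969482421875.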

26
DOUBLE -PRECISION FIXED -POINT FORMAT

• To increase the range of numbers that can be represented in fixed-point
format, one obvious approach is to increase the word size.

• If the size is doubled, the range of numbers increases substantially.
Doubling the size while still using the fixed-point format gives what is
known as the double-precision fixed-point format.

• It requires double the storage for the same data and may
need double the number of accesses for the same data-bus width of the
DSP device.

27
IEEE 754 FLOATING –POINT FORMAT

• Floating-point DSPs represent and manipulate rational numbers using a
minimum of 32 bits in a manner similar to scientific notation, where a
number is represented with a mantissa and an exponent (e.g., $A \times 2^{B}$,
where A is the mantissa and B is the exponent), yielding up to
4,294,967,296 possible bit patterns ($2^{32}$).
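
The mantissa/exponent split can be made concrete for the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits). The sketch below uses Python's standard `struct` module; the function names are illustrative assumptions, and only normalized numbers are handled.

```python
import struct

def ieee754_fields(value: float):
    """Split a number, stored as IEEE 754 single precision, into its three bit fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF        # 8-bit biased exponent (bias 127)
    fraction = bits & 0x7FFFFF            # 23-bit fraction field
    return sign, exponent, fraction

def ieee754_value(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normalized value: (-1)^s * (1 + f / 2^23) * 2^(e - 127)."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)

s, e, f = ieee754_fields(-6.5)            # -6.5 = -1.625 * 2^2
print(s, e, hex(f), ieee754_value(s, e, f))
```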

28
IEEE 754 EXAMPLE

29
DYNAMIC RANGE AND PRECISION

• The dynamic range of a signal is the ratio of the maximum value to the
minimum value that the signal can take in the given number
representation scheme.
• The dynamic range of a signal is proportional to the number of bits used to
represent it and increases by about 6 dB for every additional bit used in the
representation.
Dynamic range = $20 \log_{10}(2^{n}) = 6.02\,n$ dB
• Resolution is defined as the minimum value that can be represented
using a number representation format. For instance, if N bits are used to
represent a number from 0 to 1, the smallest value it can take is the
resolution and is given as
Resolution = $1/(2^{N} - 1) \approx 1/2^{N}$ for large N
30
DYNAMIC RANGE AND PRECISION CONT…

• The resolution of a number representation format is normally expressed as the
number of bits used in the representation. At times it is also expressed as a
percentage.

• Precision is related to the speed of a DSP implementation. In
general, techniques that improve the precision of an implementation reduce
its speed.

• A larger word size improves the precision but may slow the
processor, especially if its bus width is limited.

• For example, if the 32-bit product of a 16 × 16 multiplication has to be
preserved without loss of precision, two memory accesses are required to
store and recall this product using a 16-bit bus.

31
DYNAMIC RANGE AND PRECISION CONT…

• When floating-point number representation is used, the exponent
determines the dynamic range. Since the exponent in the floating-point
representation is a power, the dynamic range of a floating-point
number is very large.

• The resolution or precision of a floating-point number is determined
by its mantissa.

• Since the mantissa uses fewer bits than a fixed-point word of the same
total size, the precision of a floating-point representation is lower
than that of a comparable fixed-point representation.

32
EXAMPLE 1

Calculate the dynamic range and precision of each of the following number
representation formats:
a) 24-bit, single-precision fixed-point format
b) 48-bit, double-precision fixed-point format
c) A floating-point format with a 16-bit mantissa and an 8-bit exponent.
Sol: a) Since each bit gives a dynamic range of 6 dB, the total dynamic range
is 24 × 6 = 144 dB. The percentage resolution is $(1/2^{24}) \times 100 = 6 \times 10^{-6}$.
b) Since each bit gives a dynamic range of 6 dB, the total dynamic range
is 48 × 6 = 288 dB. The percentage resolution is $(1/2^{48}) \times 100 = 4 \times 10^{-13}$.
c) For the floating-point representation, the dynamic range is determined
by the number of bits in the exponent. Since there are 8 exponent bits, the
dynamic range is $(2^{8} - 1) \times 6 = 255 \times 6 = 1530$ dB.
33
EXAMPLE 1 CONT…

• The percentage resolution depends on the number of bits in the mantissa.

• Since there are 16 bits in the mantissa, the resolution is
$(1/2^{16}) \times 100 = 1.5 \times 10^{-3}$

Format of representation   Number of bits used                            Dynamic range   Precision
Fixed-point                24 bits                                        144 dB          6 × 10^-6
Double-precision           48 bits                                        288 dB          4 × 10^-13
Floating-point             24 bits (16-bit mantissa and 8-bit exponent)   1530 dB         1.5 × 10^-3
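
The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch that follows the slide's rule of 6 dB per bit; the function names are illustrative assumptions.

```python
def fixed_point_dynamic_range_db(n_bits: int) -> int:
    return 6 * n_bits                      # about 6 dB per bit

def float_dynamic_range_db(exponent_bits: int) -> int:
    return 6 * (2 ** exponent_bits - 1)    # exponent spans 2^e - 1 powers of two

def percent_resolution(bits: int) -> float:
    return 100.0 / 2 ** bits               # (1 / 2^bits) * 100

print(fixed_point_dynamic_range_db(24), percent_resolution(24))   # 24-bit fixed point
print(fixed_point_dynamic_range_db(48), percent_resolution(48))   # 48-bit fixed point
print(float_dynamic_range_db(8), percent_resolution(16))          # 8-bit exp, 16-bit mantissa
```

The floating-point format covers a far wider dynamic range than the 24-bit fixed-point word of the same total size, at the cost of a coarser resolution.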

34
SOURCES OF ERRORS IN DSP IMPLEMENTATIONS
• The error introduced in A/D and D/A conversion by representing analog signals with a limited
number of bits is called the quantization error.
• The quantization error decreases as the number of bits used
to represent signals in the A/D and D/A converters increases.
• The errors in DSP calculations are due to the limited word length used. These
errors depend on how the algorithm is implemented in a given DSP
architecture.
• This error can be reduced by using a larger word length for data and by using
rounding, instead of truncation, in calculations.
• Three types of errors:
1. A/D conversion errors
2. DSP computational errors
3. D/A conversion errors
35
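
The effect of word length and of rounding versus truncation can be seen in a small experiment. This is a sketch under assumptions: signals are signed fractions in [−1, 1), the test signal is a sine of amplitude 0.9, and the function names are illustrative.

```python
import math

def quantize(x: float, bits: int, rounding: bool) -> float:
    """Quantize x in [-1, 1) to a signed fraction with the given number of bits."""
    scale = 1 << (bits - 1)                       # 2^(bits-1) steps per unit
    q = math.floor(x * scale + 0.5) if rounding else math.floor(x * scale)
    q = max(-scale, min(scale - 1, q))            # saturate to the representable range
    return q / scale

def max_error(bits: int, rounding: bool) -> float:
    """Largest quantization error over one period of a 0.9-amplitude sine."""
    samples = [0.9 * math.sin(2 * math.pi * k / 1000) for k in range(1000)]
    return max(abs(s - quantize(s, bits, rounding)) for s in samples)

for bits in (8, 12, 16):
    print(bits, max_error(bits, True), max_error(bits, False))
```

With rounding the error stays within half a quantization step, with truncation within a full step, and both bounds shrink as the word length grows.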
UNIT-2
Multiplier and multiplier accumulator, modified bus structures and
memory access in PDSPs, multiple access memory, multiport
memory, SIMD, VLIW architectures, pipelining, special addressing
modes in PDSPs, on-chip peripherals.

36
UNIT-2 CLO’S

CLO 5 Understand the concept of the multiplier and the multiplier accumulator.
CLO 6 Design SIMD and VLIW architectures.
CLO 7 Understand the modified bus structures and memory
access in PDSPs.
CLO 8 Understand the special addressing modes in PDSPs

37
MAC UNIT

38
MAC UNIT Cont..

39
MAC in Von Neumann Architecture

40
MAC UNIT Cont...

41
MAC UNIT Cont..

42
SHIFTERS

Shifters are used to scale operands or results either down or up. The
following scenarios show why a shifter is needed:
a. When adding N numbers, each n bits long, the sum can
grow up to $n + \log_2 N$ bits long. If the accumulator is only n bits long, an
overflow error will occur.
b. This can be overcome by using a shifter to scale down each operand by
$\log_2 N$ bits.
c. Similarly, when calculating the product of two n-bit numbers, the product can
grow up to 2n bits long.
d. Generally the lower n bits are neglected and the sign bit is shifted to preserve the
sign of the product.

46
SHIFTERS CONT..

e. Finally, when adding two floating-point numbers, one of the operands
has to be shifted appropriately to make the exponents of the two numbers equal.
From the above cases it is clear that a shifter is required in the architecture
of a DSP.
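
Cases a and b can be illustrated with a 16-bit accumulator. This is a minimal sketch; the operand values, the helper `to_signed`, and the choice of N = 8 are assumptions made only for illustration.

```python
import math

def to_signed(value: int, n: int) -> int:
    """Wrap an integer into n-bit two's complement, as an n-bit accumulator would."""
    value &= (1 << n) - 1
    return value - (1 << n) if value & (1 << (n - 1)) else value

n, N = 16, 8
operands = [30000] * N                      # each operand fits in 16 signed bits
shift = math.ceil(math.log2(N))             # log2(N) = 3 bits of growth

wrapped = 0
for v in operands:
    wrapped = to_signed(wrapped + v, n)     # unscaled sum overflows the accumulator

scaled = 0
for v in operands:
    scaled = to_signed(scaled + (v >> shift), n)   # operands scaled down first

print(sum(operands), wrapped, scaled, scaled << shift)
```

The unscaled sum wraps around to a wrong negative value, while the pre-scaled sum fits in 16 bits and, shifted back up, matches the true sum.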

47
BARREL SHIFTERS

• In conventional microprocessors, ordinary shift registers are used for shift
operations. Because a shift register needs one clock cycle for each bit shifted, it is not desirable
for DSP applications, which generally involve many shifts.
• In other words, because speed is the crucial issue in DSP applications, several
shifts have to be accomplished in a single execution cycle. This can be
done using a barrel shifter, which connects the input lines
representing a word to a group of output lines with the required shift
determined by its control inputs.
• For an input of length n, $\log_2 n$ control lines are required, and an
additional control line is required to indicate the direction of the shift.
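
A barrel shifter can be modeled as log2(n) multiplexer stages, where stage i either passes the word through or shifts it by 2^i. The sketch below is a software model under assumptions (16-bit word, logical shifts, illustrative function name), not a description of any particular device.

```python
def barrel_shift(word: int, amount: int, left: bool, n: int = 16) -> int:
    """Shift an n-bit word by `amount` using log2(n) mux stages (logical shift)."""
    mask = (1 << n) - 1
    stages = n.bit_length() - 1              # log2(n) control lines for n a power of two
    for stage in range(stages):
        if (amount >> stage) & 1:            # control line `stage` selects a 2^stage shift
            step = 1 << stage
            word = (word << step) & mask if left else word >> step
    return word

print(hex(barrel_shift(0x00F1, 5, left=True)), hex(barrel_shift(0xF100, 9, left=False)))
```

In hardware all stages are combinational, so any shift amount from 0 to n − 1 completes in one cycle; the loop here only mirrors the stage structure.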

48
BARREL SHIFTERS Cont..

The block diagram of a typical barrel shifter is as shown in the figure.

49
BARREL SHIFTERS CONT..

50
MAC UNIT

51
PROCESSOR ARCHITECTURES

SIMD – Single Instruction Multiple Data

52
Processor Architectures Cont..

MIMD – Multiple Instruction Multiple Data

53
Processor Architectures Cont..

VLIW – Very Long Instruction Word

5
VLIW ARCHITECTURE

• Functional units can be split into submodules, e.g. for images (8 bits).
• Example: TI TMS320C80 (1 RISC processor and 4 × 32-bit DSPs, which can be split
into 8-bit modules)

5
Low Power MMAC Multiplier Multiple Accumulator

 By using anti-aliasing filters. 3

VLIW: GENERAL CONCEPT

57
Basic structure of VLIW Architecture

58
VLIW characteristics

59
VLIW EXAMPLE: TMS320C62

60
VLIW EVALUATION

61
MULTIPLE ACCESS MEMORY

• The number of memory accesses per clock period can also be increased by
using a high-speed memory that permits more than one memory access per clock period.
• This is the concept behind DARAM (dual-access RAM).
• Dual-access RAM blocks can be accessed twice per machine cycle. This
memory is intended primarily to store data values; however, it can also be used
to store program code.
• At reset, the DARAM is mapped into data memory space.

62
MULTIPORTED MEMORY

63
PIPELINING

• Pipelining is one approach to increasing the efficiency of P-DSPs and
advanced microprocessors.
• An instruction cycle, starting with the fetching of an instruction and ending
with its execution, including the time for storing the
results, can be split into a number of microinstructions.
• Pipelining is a technique in which multiple instructions are overlapped
during execution. The pipeline is divided into stages, and these stages are
connected to one another to form a pipe-like structure. Instructions
enter at one end and exit at the other. Pipelining increases the
overall instruction throughput.
• The stages of a pipeline are instruction fetch, instruction decode,
register fetch, execute, and memory access.
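
The throughput gain can be estimated with a simple cycle count. This is a sketch under idealized assumptions: every stage takes one cycle and there are no stalls or hazards. The function names and the figures n = 100 and k = 5 are illustrative.

```python
def cycles_without_pipeline(n_instructions: int, n_stages: int) -> int:
    return n_instructions * n_stages          # each instruction runs start to finish alone

def cycles_with_pipeline(n_instructions: int, n_stages: int) -> int:
    return n_stages + n_instructions - 1      # fill the pipe once, then one result per cycle

n, k = 100, 5
print(cycles_without_pipeline(n, k), cycles_with_pipeline(n, k))
```

For 100 instructions on a five-stage pipeline this gives 500 cycles without overlap against 104 with overlap, so the speed-up approaches the number of stages for long instruction streams.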

64
SPECIAL ADDRESSING MODES IN P-DSPS

1) Short Immediate Addressing

2) Short direct Addressing

3) Memory-mapped Addressing

4) Indirect Addressing

5) Bit Reversed Addressing Mode

6) Circular Addressing

66
SPECIAL ADDRESSING MODES IN P-DSPS

1) Short Immediate Addressing

• Permits the operand to be specified using a short constant that forms
part of a single-word instruction.
• The length of the short constant depends on the instruction type and the P-DSP.
• Short immediate values can be 3, 5, 8, or 9 bits in length.

2) Short Direct Addressing

• Permits the lower-order address bits of the operand of an instruction to be
specified in the single-word instruction.
• In TI TMS320 DSPs, the higher-order 9 bits of the memory address are stored
in the data page pointer and only the lower 7 bits are specified as part
of the instruction.
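
The address formation described for short direct addressing amounts to concatenating the 9-bit data page pointer with the 7-bit offset from the instruction. A minimal sketch, with an illustrative function name:

```python
def direct_address(dp: int, offset7: int) -> int:
    """Form a 16-bit data address from a 9-bit data page pointer and a 7-bit offset."""
    if not 0 <= dp < 512 or not 0 <= offset7 < 128:
        raise ValueError("dp must fit in 9 bits and the offset in 7 bits")
    return (dp << 7) | offset7                # DP supplies bits 15-7, offset bits 6-0

print(hex(direct_address(0, 0x1D)), hex(direct_address(3, 0x7F)), hex(direct_address(511, 0x7F)))
```

Each value of the data page pointer selects one 128-word page, and 512 such pages cover the full 64K-word data space.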

67
SPECIAL ADDRESSING MODES IN P-DSPS

Generation of Data Addresses in Direct Addressing Mode

68
3) Memory-mapped Addressing
• The CPU registers and I/O registers of P-DSPs are also accessible as memory
locations.

• This is achieved by mapping them into either the starting page or the final page of
the memory space.

• For example, in the TMS320C5X, page 0 corresponds to the CPU registers and I/O registers.

• When these registers are accessed using memory-mapped addressing modes,
the higher address bits are not taken from the data page pointer; instead they are
forced to 0 in TI DSPs and to 1 in Motorola DSPs.

69
4) Indirect Addressing
• In indirect addressing, any location in the 64K-word data space can be
accessed using the 16-bit address contained in an auxiliary register.
• The C54x DSP has eight 16-bit auxiliary registers (AR0–AR7).

• Indirect addressing is used mainly when there is a need to step through
sequential locations in memory in fixed-size steps.

• In TI DSPs, the offset register is called the INDX register.

• In Analog Devices DSPs, it is called the modifier register.

• The register contents can also be updated by a constant or by using bit-reversed
addressing mode.

• The update can be a pre-increment/decrement or a post-increment/decrement.

70
BIT REVERSED ADDRESSING MODE

• The bit-reversed pattern corresponding to a particular decimal number is
obtained by writing the natural binary equivalent of the number in
reverse order, so that the MSB of the natural binary number becomes
the LSB of the bit-reversed number and vice versa.
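
The reversal rule can be written directly in code. This is a minimal sketch with an illustrative function name; the 3-bit case corresponds to the index order used when reordering data for an 8-point radix-2 FFT.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of index (MSB becomes LSB and vice versa)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the current LSB into the result
        index >>= 1
    return result

order = [bit_reverse(i, 3) for i in range(8)]
print(order)      # [0, 4, 2, 6, 1, 5, 3, 7]
```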

71
BIT REVERSED ADDRESSING MODE

72
CIRCULAR ADDRESSING

• Memory can be organized as a circular buffer, with the beginning and ending memory
addresses of the buffer defined by the programmer.

• When the address pointer is incremented, the new address is
compared with the ending memory address of the circular buffer.

• If it exceeds the ending address, the pointer is set to the beginning address
of the circular buffer.
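
The pointer update described above can be sketched as follows. The buffer addresses 0x0100 to 0x0103 and the function name are illustrative assumptions; the step size is fixed at one location.

```python
def circular_increment(pointer: int, start: int, end: int) -> int:
    """Advance pointer by one location; wrap to start after passing end."""
    pointer += 1
    if pointer > end:                         # compared with the ending address
        pointer = start                       # wrap to the beginning of the buffer
    return pointer

ptr, visited = 0x0100, []
for _ in range(6):
    visited.append(ptr)
    ptr = circular_increment(ptr, start=0x0100, end=0x0103)
print([hex(a) for a in visited])
```

The pointer visits the four buffer locations and then returns to the first one, with no explicit bounds check needed in the filter loop itself.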

73
ON CHIP PERIPHERALS

• Clock Generator
• Hardware Timer
• Software-Programmable Wait-State Generators
• Parallel I/O Ports
• Host Port Interface (HPI)
• Serial Port
• TDM Serial Port
• Buffered Serial Port
• User Maskable Interrupts

74
UNIT-3
Architecture of TMS320C54XX DSPs, addressing modes,
memory space of TMS320C54XX processors. Program control,
instruction set and programming, on-chip peripherals,
interrupts of TMS320C54XX processors, pipeline operation.

75
UNIT-3 CLO’S

CLO 9 Understand the architecture of TMS320C54XX DSPs.
CLO 10 Understand the addressing modes and memory space
of TMS320C54XX DSPs.
CLO 11 Understand the various interrupts and pipeline
operation of TMS320C54XX processors.
CLO 12 Understand the special addressing modes in PDSPs
CLO 13 Understand the concept of on-chip Peripherals

76
Architecture of TMS320C54XX

• Fixed-point processor

• Advanced Harvard architecture, CISC processor
  – Separate memory bus structures for program and data

• High degree of parallelism
  – Multiply, load/store, add/subtract to/from the accumulator, and new address
    generation can be done simultaneously

• Powerful instruction set; most operations take a single cycle

• Targeted at portable devices (cellular phones, MP3 players, digital cameras)

77
INTRODUCTION
This unit provides an architectural overview of the TMS320C54XX, which comprises:

• CPU
• On Chip Memory
• On Chip Peripherals
• Addressing Modes
• Interrupts
• Program Control
• Internal Memory Bus Organization
• Buses
• Pipelining
78
INTRODUCTION

• The C54XX DSP uses a modified Harvard architecture that maximizes
processing power with eight buses.

• Separate program and data buses allow simultaneous access to program and data,
providing a high degree of parallelism.
• Data can be transferred between program and data memory.

79
EFFICIENT DATA/PROGRAM FLOW

#1: CPU designed for efficient DSP processing

• MAC unit, 2 accumulators, additional adder, barrel shifter

#2: Multiple buses for efficient data and program flow

• Four buses and large on-chip memory that result in sustained
performance near peak

#3: Highly tuned instruction set for powerful DSP computing

• Sophisticated instructions that execute in fewer cycles, with less code
and low power demands

80
TMS320C54X INTERNAL BLOCK DIAGRAM

81
CENTRAL PROCESSING UNIT (CPU)

The ’54xx CPU contains the following:

• 40-bit arithmetic logic unit (ALU)
• Two 40-bit accumulators (A and B)
• Barrel shifter
• 17 × 17-bit multiplier
• 40-bit adder
• Compare, select, and store unit (CSSU)
• Exponent encoder (EXP)
• Data address generation unit (DAGEN)
• Program address generation unit (PAGEN)

82
FUNCTIONAL DIAGRAM OF CPU OF TMS320C54xx

83
ALU

• The ALU performs 2’s-complement arithmetic operations and bit-level
Boolean operations on 16-, 32-, and 40-bit words.

ACCUMULATORS A AND B:

• Accumulators A and B store the output from the ALU or the multiplier/adder
block and provide a second input to the ALU.

• Each accumulator is divided into three parts: guard bits (bits 39–32),
high-order word (bits 31–16), and low-order word (bits 15–0).

• Each accumulator is memory-mapped and partitioned.

• Either accumulator can be configured as the destination register.
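
The three-part partition of a 40-bit accumulator can be illustrated by splitting a value into its fields. This is a minimal sketch; the function name and the example values are assumptions chosen to show the guard bits absorbing an overflow beyond 32 bits.

```python
def split_accumulator(acc: int):
    """Split a 40-bit accumulator value into (guard bits, high word, low word)."""
    acc &= (1 << 40) - 1                      # keep 40 bits (two's complement)
    guard = (acc >> 32) & 0xFF                # bits 39-32
    high = (acc >> 16) & 0xFFFF               # bits 31-16
    low = acc & 0xFFFF                        # bits 15-0
    return guard, high, low

total = 3 * 0x7FFF0000                        # sum of three large 32-bit values
print([hex(part) for part in split_accumulator(total)])
print([hex(part) for part in split_accumulator(-1)])
```

The sum no longer fits in 32 bits, but the carry is held in the guard bits instead of being lost.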

84
BARREL SHIFTER

• The barrel shifter provides the capability to scale data during an operand read
or write.

• No overhead is required to implement the shift needed for scaling
operations.

• The ’54xx barrel shifter can produce a left shift of 0 to 31 bits or a right shift of
0 to 16 bits on the input data.

• The shift count can be specified in the shift-count field of status register ST1 or in the
temporary register T.

• The barrel shifter and the exponent encoder normalize the values in an
accumulator in a single cycle.

• The LSBs of the output are filled with 0s, and the MSBs can be either zero-filled
or sign-extended, depending on the state of the sign-extension mode
bit in status register ST1.
85
FUNCTIONAL DIAGRAM OF BARREL SHIFTER OF
TMS320C54xx

86
MULTIPLIER/ADDER UNIT

• The multiplier/adder unit of TMS320C54xx devices performs a 17 × 17-bit 2’s-complement
multiplication with a 40-bit addition, effectively in a single
instruction cycle.

• In addition to the multiplier and adder, the unit contains control logic for
integer and fractional computations and a 16-bit temporary storage register, T.

• The compare, select, and store unit (CSSU) is a hardware unit specifically
incorporated to accelerate the add/compare/select operation.

• The exponent encoder unit supports the EXP instruction, which stores in the
T register the number of leading redundant bits of the accumulator content.

• This information is useful when shifting the accumulator content for the
purpose of scaling.
87
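
The multiply/accumulate operation can be modeled in software. The sketch below assumes 16-bit Q15 operands and a 40-bit accumulator; it is a behavioral model for illustration only, not the device's instruction semantics, and the names `mac_q15` and `q15` are hypothetical.

```python
def mac_q15(coeffs, samples):
    """Multiply-accumulate Q15 operands into a 40-bit two's-complement accumulator."""
    acc = 0
    for c, x in zip(coeffs, samples):
        acc += c * x                          # 16 x 16 -> 32-bit product (Q30), one MAC per tap
        acc = ((acc + (1 << 39)) % (1 << 40)) - (1 << 39)   # keep 40 bits, sign included
    return acc

def q15(value: float) -> int:
    """Convert a fraction in [-1, 1) to a Q15 integer."""
    return int(round(value * 32768))

coeffs = [q15(0.25), q15(0.5), q15(0.25)]
samples = [q15(0.5), q15(0.5), q15(-0.25)]
acc = mac_q15(coeffs, samples)
print(acc, acc / 2 ** 30)                     # accumulator value and its Q30 interpretation
```

The result 0.3125 equals 0.25·0.5 + 0.5·0.5 − 0.25·0.25, which is the kind of sum of products an FIR filter computes one tap per cycle.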
FUNCTIONAL DIAGRAM OF MULTIPLIER/ADDER OF
TMS320C54xx

88
BUSES IN C54XX

• The C54XX architecture is built around 8 major 16 bit buses.

• The Program Bus carries the instruction code & immediate operands
from program memory.
• Three data buses (CB,DB,EB) interconnect to various elements such as CPU,
Data address generation logic ,on chip Peripherals & data memory.
• The CB & DB carry the data operands that are read from memory.
• The EB carries the data to be written to memory.
• Four address buses (PAB, CAB, DAB, and EAB) carry the addresses
needed for instruction execution.

89
BUSES USAGE

90
BUSES

• All CPU registers, peripheral registers, and I/O ports
occupy data memory space.

91
BUSES

92
BUSES

• The C54x DSP can generate up to two data-memory addresses per cycle using
the two auxiliary register arithmetic units (ARAU0 and ARAU1).
• The PB can carry data operands stored in program space to the multiplier and
adder for multiply/accumulate operations or to a destination in data space
for data move instructions.

• The C54x DSP also has an on-chip bidirectional bus for accessing on-chip
peripherals. This bus is connected to DB and EB through the bus exchanger in
the CPU interface

93
INTERNAL MEMORY ORGANIZATION

• Minimum address range of 192K words:
  – 64K words for program space
  – 64K words for data space
  – 64K words for I/O space

• ROM, DARAM, SARAM, two-way shared RAM

• On-chip memory security option

• MMR: 26 CPU registers, peripheral registers, and a scratch-pad RAM block located on
data page 0 (DP0)

• The C54XX DSP memory is organized into three individually selectable
spaces: program, data, and I/O space.
94
INTERNAL MEMORY ORGANIZATION
• The C54x devices can contain random access memory (RAM)
and read-only memory (ROM).

• Among the devices, the following types of RAM are represented: dual-access
RAM (DARAM), single-access RAM (SARAM), and two-way shared RAM.

• The DARAM or SARAM can be shared within subsystems of a multiple-CPU

core device.

• We can configure the DARAM and SARAM as data memory
or as program/data memory.

• The C54x DSP also has 26 CPU registers plus peripheral registers that are
mapped in data-memory space.

95
INTERNAL MEMORY AND MEMORY-MAPPED
REGISTERS
• The amount and types of memory of a processor have a direct bearing on
the efficiency and performance obtainable in implementations with the
processor.

• The ’54xx memory is organized into three individually selectable spaces:
program, data, and I/O.

• All ’54xx devices contain both RAM and ROM. RAM can be either dual-access
type (DARAM) or single-access type (SARAM).

• The ’54xx processors have a number of CPU registers to support operand
addressing and computations.

• The processor mode status (PMST) register is used to configure the
processor. It is a memory-mapped register located at address 1Dh on page 0
of the RAM.
96
INTERNAL MEMORY-MAPPED REGISTERS OF
TMS320C54XX PROCESSORS.

97
PERIPHERAL REGISTERS FOR THE TMS320C54XX
PROCESSORS

98
DATA ADDRESSING MODES OF TMS320C54X
PROCESSORS:

• Data addressing modes provide various ways to access operands to execute
instructions and place results in memory or registers.
• The 54XX devices offer seven basic addressing modes:
1. Immediate addressing
2. Absolute addressing
3. Accumulator addressing
4. Direct addressing
5. Indirect addressing
6. Memory-mapped addressing
7. Stack addressing

99
BLOCK DIAGRAM OF THE DIRECT ADDRESSING
MODE FOR TMS320C54XX PROCESSORS

100
BLOCK DIAGRAM OF THE INDIRECT ADDRESSING
MODE FOR TMS320C54XX PROCESSORS.

101
INDIRECT ADDRESSING OPTIONS WITH A SINGLE
DATA –MEMORY OPERAND

102
CIRCULAR ADDRESSING

• Used in convolution, correlation, and FIR filters.

• A circular buffer is a sliding window containing the most recent data. A circular buffer
of size R must start on an N-bit boundary, where $2^{N} > R$.

• The circular buffer size register (BK) specifies the size of the circular buffer.

• Effective base address (EFB): obtained by zeroing the N LSBs of a user-selected
auxiliary register (ARx).

• End-of-buffer address (EOB): obtained by replacing the N LSBs of ARx with the N
LSBs of BK.
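
The EFB and EOB rules above can be worked through for a concrete case. This is a sketch that follows the slide's description only; the function name, the 16-bit address width, and the example values ARx = 0x1225 and BK = 20 are assumptions.

```python
def circular_bounds(arx: int, bk: int):
    """Return (N, EFB, EOB) for a circular buffer of size bk addressed through arx."""
    n = bk.bit_length()                     # smallest N with 2^N > bk
    low_mask = (1 << n) - 1
    efb = arx & ~low_mask & 0xFFFF          # zero the N LSBs of ARx
    eob = efb | (bk & low_mask)             # replace the N LSBs of ARx with those of BK
    return n, efb, eob

n_bits, efb, eob = circular_bounds(arx=0x1225, bk=20)
print(n_bits, hex(efb), hex(eob))
```

For a 20-word buffer, N = 5 because 2^5 = 32 is the smallest power of two greater than 20, so the buffer must start on a 32-word boundary.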

103
BLOCK DIAGRAM OF THE CIRCULAR ADDRESSING
MODE FOR TMS320C54XX PROCESSORS

104
MEMORY-MAPPED REGISTER ADDRESSING

• Used to modify the memory-mapped registers without affecting the current
data page pointer (DP) or stack pointer (SP)

• Overhead for writing to a register is minimal

• Works for both direct and indirect addressing

• Scratch-pad RAM located on data page 0 can also be modified

105
16 BIT MEMORY MAPPED REGISTER ADDRESS
GENERATION

106
STACK ADDRESSING

• Used to automatically store the program counter during interrupts
and subroutines.
• Can be used to store additional items of context or to pass data
values.
• Uses a 16-bit memory-mapped register, the stack pointer (SP).
• Example: PSHD X2
Figure: values of the stack and SP before and after the operation

107
TMS320C54XX INSTRUCTIONS AND
PROGRAMMING

• Load and Store Operations

• Arithmetic Operations
• Logical Operations
• Program-Control Operations
• Multiply Instruction
• Multiply and Accumulate Instruction
• Multiply and Subtract Instruction
• Multiply ,Accumulate , and Delay Instruction
• Repeat Instruction

108
MEMORY SPACE OF TMS320C54XX PROCESSORS

• A total of 128K words, extendable up to 8192K words.

• Total memory includes RAM, ROM, EPROM, EEPROM, or memory-mapped
peripherals.

• Data memory: stores the data required to run programs and holds external
memory-mapped registers.

109
PROGRAM MEMORY

• Stores program instructions and tables used in the execution of programs.

• Organized into 128 pages, each of 64K words.

110
FUNCTION OF DIFFERENT PIN PMST REGISTER

111
MEMORY MAP FOR THE TMS320C5416 PROCESSOR

112
PROGRAM CONTROL

• The program control unit contains the program counter (PC), the PC-related hardware, the hardware
stack, repeat counters, and status registers.

• The PC addresses memory in several ways, namely:

• Branch: The PC is loaded with the immediate value following the
branch instruction.

• Subroutine call: The PC is loaded with the immediate value following the call
instruction.

• Interrupt: The PC is loaded with the address of the appropriate interrupt
vector.

• Instructions such as BACC, CALA, etc.: The PC is loaded with the contents of
the accumulator low word.
113
INTERRUPTS

• Often, while the CPU is in the midst of executing a program, a peripheral
device may require service from the CPU.

• In such a situation the main program may be interrupted by a signal generated by
the peripheral device.
• As a result, the processor suspends the main program in order to execute
another program, called the interrupt service routine (ISR), to service the peripheral.

• On completion of the ISR, the processor returns to the main program and continues
from where it left off.
• An interrupt may be generated by an internal or an external device.

114
INTERRUPTS

• An interrupt may also be generated by software.

• Not all interrupts are serviced when they occur. Only those interrupts
that are called non-maskable are always serviced when they occur.
• Other interrupts, called maskable interrupts, are serviced only if
they are enabled.

• There is also a priority scheme to determine which interrupt gets serviced first if
more than one interrupt occurs simultaneously.

115
PIPELINE OPERATION of TMS320C54XX

• The C54xx DSP has a six-level-deep instruction pipeline.

• The six stages of the pipeline are independent of each other,
which allows overlapping execution of instructions.

• During any given cycle, from one to six different instructions can be active,
each at a different stage of completion.

116
PIPELINE OPERATION OF TMS320C54XX

• The six levels and functions of the pipeline structure are:

• Program address bus (PAB) is loaded

Program Prefetch with the address of the next
instruction to be fetched.

An instruction word is fetched from the

Program fetch program bus (PB) and loaded into the
instruction register (IR). This completes
an instruction fetch sequence that
consists of this and the previous cycle.

117
PIPELINE OPERATION OF TMS320C54XX
• Decode: The contents of the instruction register (IR) are decoded to determine the type of memory access operation and the control sequence at the data-address generation unit (DAGEN) and the CPU.

• Access: DAGEN outputs the read operand's address on the data address bus, DAB. If a second operand is required, the other data address bus, CAB, is also loaded with an appropriate address. Auxiliary registers in indirect addressing mode and the stack pointer (SP) are also updated.
118
PIPELINE OPERATION OF TMS320C54XX
• Read: The read data operand(s), if any, are read from the data buses, DB and CB. This completes the two-stage operand read sequence. At the same time, the two-stage operand write sequence begins. The data address of the write operand, if any, is loaded into the data write address bus (EAB).

• Execute: The operand write sequence is completed by writing the data using the data write bus (EB). The instruction is executed in this phase.
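The overlap described above can be sketched in a few lines of Python. This is only an idealized model: it assumes one instruction enters the pipeline per cycle and that no stalls or bus conflicts occur, and the names `STAGES`, `pipeline_schedule` and `total_cycles` are made up for this illustration.

```python
# Idealized six-stage pipeline model (assumption: no stalls, one new instruction per cycle).
STAGES = ["prefetch", "fetch", "decode", "access", "read", "execute"]


def pipeline_schedule(num_instructions):
    """Return {cycle: {stage: instruction index}} for a stall-free pipeline."""
    schedule = {}
    for instr in range(num_instructions):
        for depth, stage in enumerate(STAGES):
            cycle = instr + depth  # instruction i occupies stage d in cycle i + d
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule


def total_cycles(num_instructions):
    """Cycles to finish all instructions: pipeline fill time plus one per instruction."""
    if num_instructions == 0:
        return 0
    return num_instructions + len(STAGES) - 1


if __name__ == "__main__":
    sched = pipeline_schedule(8)
    for cycle in sorted(sched):
        row = ", ".join(f"{s}=I{sched[cycle][s]}" for s in STAGES if s in sched[cycle])
        print(f"cycle {cycle}: {row}")
```

In cycle 0 only the prefetch stage is busy; from cycle 5 onward all six stages hold a different instruction, which matches the "one to six instructions active" statement.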

119
PIPELINING STAGES

120

121
ON CHIP PERIPHERALS

• General-purpose I/O pins: XF and BIO

• Timer

• Host port interface (HPI)

• Synchronous serial port

• Buffered serial port (BSP)

• Multichannel buffered serial port (McBSP)

• Time-division multiplexed (TDM) serial port

• Software-programmable wait-state generator

• Programmable bank-switching module

122
GENERAL-PURPOSE I/O

• The C54xx DSP offers general-purpose I/O through two dedicated pins that
are software controlled. The two dedicated pins are the branch control input
pin (BIO) and the external flag output pin (XF).
• BIO can be used to monitor the status of peripheral devices.

• XF can be used to signal external devices. The XF pin is controlled using

software.

• It is driven high by setting the XF bit (in ST1) and is driven low by clearing the
XF bit. The set status register bit (SSBX) and reset status register bit (RSBX)
instructions can be used to set and clear XF, respectively.

123
SOFTWARE PROGRAMMABLE WAIT STATE GENERATOR

• The software-programmable wait-state generator extends external bus cycles by up to seven machine cycles to interface with slower off-chip memory and devices.

• The software wait-state generator is incorporated without any external hardware.

• For off-chip memory accesses, from zero to seven wait states can be specified within the software wait-state register.

124
HOST PORT INTERFACE

• The host port interface is an 8 bit parallel port that provides an interface with
host processor.

• Information is exchanged between the C54xx and the host processor through C54xx on-chip memory that is accessible to both the C54xx and the host processor.

125
HARDWARE TIMER

• The on-chip timer is a software-programmable timer that consists of three

registers and can be used to periodically generate interrupts.

• The timer resolution is the CPU clock rate of the processor.

• The high dynamic range of the timer is achieved with a 16-bit counter with a
4-bit prescaler.

• Timer Registers:-

The on-chip timer consists of three memory-mapped registers (TIM, PRD,

and TCR).

126
TIMER REGISTERS

• Timer register (TIM): The 16-bit memory-mapped timer register (TIM) is loaded with the period register (PRD) value and decremented.

• Timer period register (PRD): The 16-bit memory-mapped timer period register (PRD) is used to reload the timer register (TIM).

• Timer control register (TCR): The 16-bit memory-mapped timer control register (TCR) contains the control and status bits of the timer.
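The reload-and-decrement behaviour of TIM and PRD, together with the 4-bit prescaler, can be sketched as a small Python model. It assumes the prescaler reload value is a 4-bit field (called `tddr` here, after the TDDR field in TI's timer description; the name does not appear in these slides) and that an interrupt is raised each time TIM reaches zero and is reloaded. Under those assumptions the interrupt rate is the CPU clock divided by (TDDR + 1)(PRD + 1). The function names are illustrative only.

```python
def timer_interrupt_rate(cpu_clock_hz, prd, tddr=0):
    """Interrupt rate in Hz, assuming rate = clock / ((TDDR + 1) * (PRD + 1))."""
    if not 0 <= prd <= 0xFFFF:
        raise ValueError("PRD is a 16-bit value")
    if not 0 <= tddr <= 0xF:
        raise ValueError("the prescaler is a 4-bit value")
    return cpu_clock_hz / ((tddr + 1) * (prd + 1))


def simulate_timer(prd, tddr, cycles):
    """Count interrupts raised in `cycles` CPU clocks by a down-counter model."""
    tim, psc, interrupts = prd, tddr, 0
    for _ in range(cycles):
        if psc == 0:
            psc = tddr          # prescaler reloads from its divide-down value
            if tim == 0:
                tim = prd       # TIM reloads from PRD and an interrupt is raised
                interrupts += 1
            else:
                tim -= 1
        else:
            psc -= 1
    return interrupts


if __name__ == "__main__":
    # Hypothetical 100 MHz clock, chosen only for the example.
    print(timer_interrupt_rate(100_000_000, prd=9999, tddr=9), "interrupts per second")
    print(simulate_timer(prd=9, tddr=3, cycles=400), "interrupts in 400 cycles")
```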
127
UNIT-4
Memory space organization, external bus interfacing signals, memory
interface, parallel I/O interface, programmed I/O, interrupts and I/O, direct
memory access (DMA).

128
UNIT-4 CLO’S

CLO 14 Understand the significance of memory space organization.
CLO 15 Analyze external bus interfacing signals.
CLO 16 Explain about parallel I/O interface, programmed I/O.
CLO 17 Understand the significance of Interrupts and Direct Memory Access.

129
INTRODUCTION
• A typical DSP system has a DSP with external memory, input devices and output devices.

• The manufacturers of memory and I/O devices are usually not the manufacturers of the DSP, and a wide variety of memory and I/O devices is available, so the signals generated by the DSP may not directly suit the memory and I/O devices to be connected to it.

• Thus, there is a need for interfacing devices, whose purpose is to use the DSP signals to generate the appropriate signals for setting up communication with the memory.

130
INTRODUCTION

131
MEMORY SPACE ORGANIZATION

• The memory space in the TMS320C54xx has 192K words of 16 bits each. Memory is divided into program memory, data memory and I/O space, each of 64K words. The actual amount and type of memory depend on the particular DSP device of the family. If the memory available on a DSP is not sufficient for an application, the DSP can be interfaced to external memory.

• On-chip memory is faster than external memory and has no interfacing requirements. Because it is on-chip, power consumption is lower and size is smaller. The DSP performs better with on-chip memory because of better data flow within the pipeline.

132
MEMORY SPACE ORGANIZATION

• The purpose of on-chip memory is to hold the program (code / instructions), constant data such as filter coefficients and filter order, and trigonometric tables or kernels of transforms employed in an algorithm.

133
EFFICIENT DATA/PROGRAM FLOW

• External memory is off-chip and slower. External interfacing is required to establish communication between the memory and the DSP. External memory can provide a large memory space; its purpose is to store variable data and to serve as scratch-pad memory. Program memory can be ROM, Dual Access RAM (DARAM), Single Access RAM (SARAM), or a combination of these.
• The program memory can be extended externally to 8192K words, that is, 128 pages of 64K words each. The figure shows the arrangement of memory and DSP in the case of Single Access RAM (SARAM) and Dual Access RAM (DARAM). One set of address bus and data bus is available in the case of SARAM, and two sets of address bus and data bus are available in the case of DARAM. With DARAM the DSP can thus access two memory locations simultaneously.
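The "128 pages of 64K words" arithmetic can be checked with a short sketch. It assumes a flat extended program address whose upper 7 bits select the page and whose lower 16 bits select the word within the page; the helper name `split_program_address` is hypothetical.

```python
PAGE_WORDS = 64 * 1024   # 64K words per page
NUM_PAGES = 128          # 128 pages give 8192K words in total


def split_program_address(addr):
    """Split an extended program address into (page, offset within page)."""
    if not 0 <= addr < PAGE_WORDS * NUM_PAGES:
        raise ValueError("address outside the 8192K-word extended program space")
    return addr >> 16, addr & 0xFFFF


if __name__ == "__main__":
    print(NUM_PAGES * PAGE_WORDS // 1024, "K words")
    print(split_program_address(0x7FFFFF))  # last word of the last page
```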

134
EFFICIENT DATA/PROGRAM FLOW

135
DATA MEMORY

• Three bits in the memory-mapped register PMST are available for on-chip memory mapping.

• The first bit is the microprocessor / microcomputer mode bit (MP/MC). If this bit is 0, the on-chip ROM is enabled and addressable, and if this bit is 1, the on-chip ROM is not available. The bit can be manipulated by software, and it is set to the value on the MP/MC pin at system reset.

• The second bit is OVLY (RAM overlay). It enables on-chip DARAM data memory blocks to be mapped into program space. If this bit is 0, on-chip RAM is addressable in data space but not in program space, and if it is 1, on-chip RAM is mapped into both program and data space.

136
DATA MEMORY

The third bit is DROM. It enables on-chip DARAM 4-7 to be mapped into
data space. If this bit is 0, on-chip DARAM 4-7 is not mapped into data
space and if this bit is 1, on-chip DARAM 4-7 is mapped into Data Space.
On-chip data memory is partitioned into several regions as shown in table
7.1. Data memory can be on chip / off-chip.

137
EXTERNAL BUS INTERFACING SIGNALS
• In the DSP there are 16 external bus interfacing signals. A signal is characterized as single bit (a single line) or multiple bits (multiple lines, i.e. a bus). It can be synchronous or asynchronous with the clock.

• A signal can be active low or active high, and it can be an output or an input. The line or lines carrying it can be unidirectional or bidirectional. The characteristics of a signal depend on the purpose it serves.

• Among the external bus interfacing signals, the address bus and the data bus are multi-line buses. The address bus is unidirectional and carries the address of the location referenced. The data bus is bidirectional and carries data to or from the DSP. When the data lines are not in use, they are tri-stated.

138
EXTERNAL BUS INTERFACING SIGNALS

• The Read/Write signal is low when the DSP is writing and high when the DSP is reading. The strobe interfacing signals, Memory Strobe and I/O Strobe, are both active low. They remain low during the entire read and write operations of memory and I/O, respectively.

• External bus interfacing signals 1 to 8 are all unidirectional except the data bus, which is bidirectional. The address lines are outgoing signals, and all other control signals are also outgoing signals.

• The Data Ready signal is used when a slow device is to be interfaced. Hold Request and Hold Acknowledge are used in conjunction with a DMA controller.

139
EXTERNAL BUS INTERFACING SIGNALS

• There are two interrupt-related signals: Interrupt Request and Interrupt Acknowledge. Both are active low. Interrupt Request is typically used for data exchange, for example with an ADC or another processor.

• The TMS320C5416 has 14 hardware interrupts, used for user interrupts, McBSP, DMA and the timer. The External Flag is an active-high, asynchronous, outgoing control signal. It initiates an action or informs the peripheral device about the completion of a transaction.

• Branch Control Input is an active-low, asynchronous, incoming control signal. A low on this signal makes the DSP respond or attend to the peripheral device. It informs the DSP about the completion of a transaction.

140
EXTERNAL BUS INTERFACING SIGNALS

141
MEMORY INTERFACE

• Typical signals in a memory device are as follows. The address bus carries the address of the referred memory location. The data bus carries data to or from the referred memory location. The Chip Select signal selects one or more memory ICs among the many memory ICs in the system. Write Enable enables writing of the data available on the data bus to a memory location. The Output Enable signal enables data from a memory location onto the data bus.

• The address bus is unidirectional and carries the address into the memory IC; the data bus is bidirectional. The Chip Select, Write Enable and Output Enable control signals are active high or active low, and they carry signals into the memory ICs. The task of the memory interface is to use the DSP signals to generate the appropriate signals for setting up communication with the memory.
142
MEMORY INTERFACE

143
TIMING SEQUENCE FOR EXTERNAL MEMORY
ACCESS

• The timing sequence of memory access is shown. There are two read operations, both referring to program memory: the Read signal is high and Program Memory Select is low. There is one write operation referring to external data memory: Data Memory Select is low and the Write signal is low. The reads and the write are to a memory device, and hence Memory Strobe is low. Internal program memory reads take one clock cycle and external data memory accesses require two clock cycles.

144
TIMING SEQUENCE FOR EXTERNAL MEMORY ACCESS

145
PARALLEL I/O INTERFACE

• I/O devices are interfaced to the DSP using unconditional I/O mode, programmed I/O mode or interrupt I/O mode. Unconditional I/O does not require any handshaking signals. The DSP assumes the readiness of the I/O device and transfers the data at its own speed. Programmed I/O requires handshaking signals.

• The DSP waits for the I/O readiness signal, which is one of the handshaking signals. After the completion of the transaction, the DSP conveys the same to the I/O device through another handshaking signal. Interrupt I/O also requires handshaking signals.

• The DSP is interrupted by the I/O device, indicating the readiness of the device. The DSP acknowledges the interrupt and attends to it. Thus, the DSP need not wait for the I/O device to respond. It can engage itself in execution as long as there is no interrupt.
146
PROGRAMMED I /O INTERFACE

•The timing diagram in the case of programmed I/O is shown in figure. I/O
strobe and I/O space select are issued by the DSP. Two clock cycles each are
required for I/O read and I/O write operations.

147
PROGRAMMED I /O FLOWCHART

148
INTERRUPT I/O

• This mode of interfacing I/O devices also requires handshaking signals. The DSP is interrupted by the I/O device whenever the device is ready. The DSP acknowledges the interrupt and, after testing certain conditions, attends to the interrupt. The DSP need not wait for the I/O device to respond. It can engage itself in execution.

• There are a variety of interrupts. One classification is maskable and non-maskable.

• If an interrupt is maskable, the DSP can ignore it when it is masked. Another classification is vectored and non-vectored. If an interrupt is vectored, the Interrupt Service Routine (ISR) is at a specific location. In a software interrupt, the interrupt instruction is written in the program.

149
INTERRUPT I/O

• In a hardware interrupt, a hardware pin on the DSP IC receives an interrupt from the external device. A hardware interrupt is also referred to as an external interrupt, and a software interrupt is referred to as an internal interrupt. An internal interrupt may also be caused by the execution of certain instructions. In the TMS320C54xx there are a total of 30 interrupts.

• These are the Reset, Non-maskable, Timer and HPI interrupts (one each), 14 software interrupts, 4 external user interrupts, 6 McBSP-related interrupts and 2 DMA-related interrupts. The Host Port Interface (HPI) is an 8-bit parallel port. It is possible to interface to a host processor using the HPI. Information is exchanged through the on-chip memory of the DSP, which is also accessible to the host processor.

150
INTERRUPT I/O

• The registers used in managing interrupts are the Interrupt Flag Register (IFR) and the Interrupt Mask Register (IMR). The IFR records pending external and internal interrupts. A one in any bit position implies a pending interrupt.

• Once an interrupt is received, the corresponding IFR bit is set. The IMR is used to mask or unmask an interrupt. A one implies that the corresponding interrupt is unmasked. Both registers are memory-mapped registers.

• One flag, the global enable bit (INTM) in the ST1 register, is used to enable or disable all maskable interrupts globally. If INTM is zero, all unmasked interrupts are enabled. If it is one, all maskable interrupts are disabled.
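The interplay of IFR, IMR and INTM can be summarized as a bitwise condition. The sketch below is a simplified model, not device code: it treats IFR and IMR as 16-bit integers, ignores non-maskable interrupts and priority resolution, and the function names are made up for the illustration.

```python
def serviceable_interrupts(ifr, imr, intm):
    """Bit mask of pending maskable interrupts that the CPU may service.

    ifr  : pending-interrupt flags (1 = pending)
    imr  : interrupt mask register (1 = unmasked)
    intm : global bit from ST1 (0 = unmasked interrupts enabled, 1 = all maskable disabled)
    """
    if intm:
        return 0
    return ifr & imr & 0xFFFF


def pending_bits(mask):
    """List the bit positions that are set in a 16-bit mask."""
    return [bit for bit in range(16) if (mask >> bit) & 1]


if __name__ == "__main__":
    ifr, imr = 0b1010, 0b0110   # bits 1 and 3 pending, bits 1 and 2 unmasked
    print(pending_bits(serviceable_interrupts(ifr, imr, intm=0)))
    print(pending_bits(serviceable_interrupts(ifr, imr, intm=1)))
```

With INTM cleared only bit 1 qualifies, because bit 3 is pending but masked; with INTM set nothing is serviced.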

151
INTERRUPT I/O FLOWCHART

152
DIRECT MEMORY ACCESS (DMA) OPERATION

• In any application, there is data transfer between the DSP and memory and also between the DSP and I/O devices, as shown in fig. 7.10. However, there may be a need to transfer a large amount of data between two memory regions or between memory and I/O. The DSP can be involved in such a transfer, as shown in fig. 7.11.

• Since the amount of data is large, the transfer will engage the DSP for a long time. The DSP thus will not be utilized for the purpose it is meant for, i.e., data manipulation. The intervention of the DSP has to be avoided for two reasons: to utilize the DSP for useful signal processing tasks and to increase the speed of transfer by direct data transfer between memories or between memory and I/O. The direct data transfer is referred to as direct memory access (DMA). The arrangement expected is shown in fig. 7.12. A DMA controller handles the data transfer instead of the DSP.
153
DIRECT MEMORY ACCESS (DMA) OPERATION

154
REGISTER SUBADDRESS TECHNIQUE

•DMSDI is used when an automatic increment of the sub address is required

after each access. Thus it can be used to configure the entire set of registers.
DMSDN is used when single DMA register access is required. The following
examples bring out clearly the method of accessing the DMA registers and
transfer of data in DMA mode.
155
UNIT-5
The Q-notation, convolution, correlation, FIR filters, IIR filters, interpolation
filters, decimation filters, an FFT algorithm for DFT filters computation of the
signal spectrum.

156
UNIT-5 CLO’S

CLO 18 Understand the basic concepts of convolution and correlation.
CLO 19 Compare the characteristics of IIR and FIR filters.
CLO 20 Analyze the concepts of interpolation and decimation filters.

157
Q-NOTATION

• DSP algorithm implementations deal with signals and coefficients. To use a fixed-point DSP device efficiently, one must consider representing filter coefficients and signal samples using fixed-point 2's complement representation.
Ex: N = 16, range: $-2^{N-1}$ to $+2^{N-1}-1$ (-32768 to 32767).
• Typically, filter coefficients are fractional numbers. To represent such numbers,
the Q-notation has been developed.
• The Q-notation specifies the number of fractional bits.

158
Q-NOTATION

• A commonly used notation for DSP implementations is Q15. In the Q15 representation, the least significant 15 bits represent the fractional part of a number.
• In a processor where 16 bits are used to represent numbers, the Q15 notation uses the MSB to represent the sign of the number and the rest of the bits represent the value of the number.
• In general, the value of a 16-bit Q15 number N with bits $b_{15} b_{14} \dots b_0$ is represented as:
$N = -b_{15} + b_{14}2^{-1} + b_{13}2^{-2} + \dots + b_{0}2^{-15}$
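A short Python sketch may make the Q15 weighting concrete. It assumes the usual two's complement interpretation, in which the sign bit has weight -1 and bit k has weight 2^(k-15), and it saturates values that fall outside the representable range. The helper names are illustrative only.

```python
def float_to_q15(x):
    """Quantize x to a 16-bit Q15 integer, saturating outside [-1, 1)."""
    n = int(round(x * 32768))
    return max(-32768, min(32767, n))


def q15_to_float(n):
    """Interpret a signed 16-bit integer as a Q15 fraction."""
    return n / 32768


def q15_from_bits(word):
    """Evaluate a raw 16-bit pattern bit by bit: -b15 + sum of b_k * 2**(k - 15)."""
    word &= 0xFFFF
    value = -float((word >> 15) & 1)          # sign bit has weight -1
    for k in range(15):
        value += ((word >> k) & 1) * 2.0 ** (k - 15)
    return value


if __name__ == "__main__":
    for x in (0.5, -0.25, 0.999, 1.0, -1.0):
        q = float_to_q15(x)
        print(x, q, hex(q & 0xFFFF), q15_to_float(q))
```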

159
FIR FILTERS
• A finite impulse response (FIR) filter of length N (N coefficients) can be described by the difference equation
$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k)$

• In digital signal processing, an FIR filter is a filter whose impulse response is of finite duration, and as a result it settles to zero in finite time.
• Filters are signal conditioners: each filter passes the frequency components in its passband and attenuates the others. A high-pass filter, for example, allows AC components and blocks DC components.
• The window method is the most commonly used FIR design method because of its efficiency and simplicity.
• The other method, the frequency sampling method, is also very simple to use, but there is a small attenuation in the stop band.
160
LOGICAL STRUCTURE OF FIR FILTER

• An FIR filter can be used to implement almost any type of digital frequency response. Usually these filters are built from multipliers, adders and a series of delays that together create the output of the filter.
• The following figure shows the basic FIR filter diagram of length N. The delays operate on the input samples. The values h_k are the coefficients used for multiplication, so that the output at any time is the summation of all the delayed samples multiplied by the appropriate coefficients.

161
FIR FILTERS

• The implementation requires a signal delay for each sample. The next output, y(n+1), is given as
$y(n+1) = h(N-1)x(n-(N-2)) + h(N-2)x(n-(N-3)) + \dots + h(1)x(n) + h(0)x(n+1)$
The figure shows the memory organization for the implementation of the filter.
• The filter coefficients and the signal samples are stored in two circular buffers, each of a size equal to the filter length. AR2 is used to point to the samples and AR3 to the coefficients.
• In order to start with the last product, the pointer register AR2 must be initialized to access the signal sample x(n-(N-1)), and the pointer register AR3 to access the filter coefficient h(N-1). As each product is computed and added to the previous result, the pointers advance circularly. At the end of the computation, the signal sample pointer is at the oldest sample, which is replaced with the newest sample to proceed with the next output computation.
162
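The circular-buffer scheme can be modelled in Python as a sketch. This is not C54xx assembly: list indices stand in for the AR2 and AR3 pointers, modulo arithmetic stands in for circular addressing, and the class name `CircularFIR` is made up for the illustration.

```python
class CircularFIR:
    """FIR filter whose sample history lives in a circular buffer."""

    def __init__(self, h):
        self.h = list(h)                  # coefficients h(0) .. h(N-1)
        self.x = [0.0] * len(self.h)      # circular sample buffer, initially zero
        self.newest = 0                   # index of the most recent sample

    def step(self, sample):
        n = len(self.h)
        # The slot after the newest sample holds the oldest one; overwrite it.
        self.newest = (self.newest + 1) % n
        self.x[self.newest] = sample
        # Start with the last product h(N-1) * x(n-(N-1)) and end with h(0) * x(n).
        idx = (self.newest + 1) % n       # plays the role of the sample pointer
        acc = 0.0
        for k in range(n - 1, -1, -1):    # plays the role of the coefficient pointer
            acc += self.h[k] * self.x[idx]
            idx = (idx + 1) % n           # circular pointer advance
        return acc


if __name__ == "__main__":
    fir = CircularFIR([1.0, 2.0, 3.0])
    print([fir.step(s) for s in (1.0, 0.0, 0.0, 0.0)])   # impulse response
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the pointer arithmetic pairs each coefficient with the right delayed sample.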
FIR FILTERS

163
IIR FILTERS
• An infinite impulse response (IIR) filter is represented by a transfer function,
which is a ratio of two polynomials in z.
• To implement such a filter, the difference equation representing the transfer
function can be derived and implemented using multiply and add operations.
• To show such an implementation, we consider a second order transfer
function given by

164
IIR FILTERS
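As an illustration of implementing a second-order IIR filter with multiply and add operations, the sketch below assumes the transfer function H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2). The sign convention of the denominator is an assumption made for this example; the slides' own equation may use the opposite sign for the feedback coefficients. The function name `biquad` is illustrative.

```python
def biquad(x, b, a):
    """Second-order IIR filter in direct form I.

    Assumed convention: y(n) = b0*x(n) + b1*x(n-1) + b2*x(n-2) - a1*y(n-1) - a2*y(n-2)
    """
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0          # delayed inputs and outputs, initially zero
    out = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn              # shift the input delay line
        y2, y1 = y1, yn              # shift the output delay line
        out.append(yn)
    return out


if __name__ == "__main__":
    # One-pole example: y(n) = x(n) + 0.5*y(n-1); the impulse response halves each step.
    print(biquad([1.0, 0.0, 0.0, 0.0], b=(1.0, 0.0, 0.0), a=(-0.5, 0.0)))
```

Unlike the FIR case, the impulse response never becomes exactly zero, because each output is fed back into later outputs.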

165
INTERPOLATION FILTERS

• An interpolation filter is used to increase the sampling rate. The interpolation

process involves inserting samples between the incoming samples to create
additional samples to increase the sampling rate for the output.
• One way to implement an interpolation filter is to first insert zeros between
samples of the original sample sequence. The zero-inserted sequence is then
passed through an appropriate low pass digital FIR filter to generate the
interpolated sequence.

166
INTERPOLATION FILTERS

The kind of interpolation carried out in the examples is called linear

interpolation because the convolving sequence h(n) is derived based on
linear interpolation of samples.
Further, in this case, the h(n) selected is just a second-order filter and
therefore uses just two adjacent samples to interpolate a sample.
A higher-order filter can be used to base interpolation on more input
samples.
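Linear interpolation by zero insertion can be sketched as follows. The example assumes an interpolation factor of 2 and the convolving sequence h(n) = [0.5, 1, 0.5], a second-order filter; both are chosen for illustration and need not match the examples referred to above. The function names are hypothetical.

```python
def zero_insert(x, L):
    """Insert L-1 zeros after every sample of x."""
    out = []
    for s in x:
        out.append(s)
        out.extend([0.0] * (L - 1))
    return out


def fir(x, h):
    """Causal FIR filter: y(n) = sum over k of h(k) * x(n-k), with zero initial state."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
        for n in range(len(x))
    ]


def linear_interpolate_by_2(x):
    """Zero-insert by 2, then filter with h = [0.5, 1, 0.5]."""
    return fir(zero_insert(x, 2), [0.5, 1.0, 0.5])


if __name__ == "__main__":
    print(linear_interpolate_by_2([2.0, 4.0, 6.0]))
```

The original samples reappear with a one-sample delay, and each new sample is the average of its two neighbours. The first output is a start-up value interpolated against the zero initial state.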

167
INTERPOLATION FILTERS
• A closer approximation to ideal interpolation uses a longer filter. The figure shows how an interpolating filter using a 15-tap FIR filter and an interpolation factor of 5 can be implemented. In this example, each incoming sample is followed by four zeros to increase the number of samples by a factor of 5.
• The interpolated samples are computed using a program similar to the one used for an FIR filter implementation.
• One drawback of the implementation strategy depicted in the figure is that there are many multiplies in which one of the multiplying elements is zero. Such multiplies need not be included in the computation if the computation is rearranged to take advantage of this fact.

168
INTERPOLATION FILTERS

• One such scheme, based on generating what are called poly-phase sub-filters,
is available for reducing the computation. For a case where the number of filter
coefficients N is a multiple of the interpolating factor L, the scheme
implements the interpolation filter using the equation.
• The figure shows a scheme that uses poly-phase sub-filters to implement the interpolating filter using the 15-tap FIR filter and an interpolation factor of 5. In this implementation, the 15 filter taps are arranged as shown and divided into five 3-tap sub-filters.
• The input samples x(n), x(n-1) and x(n-2) are used five times to generate the five output samples. This implementation requires 15 multiplies as opposed to 75 in the direct implementation of the figure.
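The saving from poly-phase sub-filters can be checked with a small sketch. It assumes sub-filter p takes the taps h(p), h(p+L), h(p+2L), ... and produces output sample nL + p from x(n), x(n-1), ...; the coefficient values and function names are arbitrary placeholders for the illustration.

```python
def interpolate_direct(x, h, L):
    """Zero-insert by L, then run the full filter; every tap costs one multiply."""
    u = []
    for s in x:
        u.append(s)
        u.extend([0.0] * (L - 1))
    y, multiplies = [], 0
    for n in range(len(u)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * u[n - k]
            multiplies += 1               # counted even when the operand is zero
        y.append(acc)
    return y, multiplies


def interpolate_polyphase(x, h, L):
    """Same output using L sub-filters of len(h) // L taps each."""
    taps = len(h) // L
    sub = [[h[m * L + p] for m in range(taps)] for p in range(L)]
    y, multiplies = [], 0
    for n in range(len(x)):
        for p in range(L):                # each input sample yields L outputs
            acc = 0.0
            for m in range(taps):
                if n - m >= 0:
                    acc += sub[p][m] * x[n - m]
                multiplies += 1
            y.append(acc)
    return y, multiplies


if __name__ == "__main__":
    h = [float(i + 1) for i in range(15)]     # placeholder 15-tap coefficients
    x = [1.0, 2.0, -1.0, 0.5]
    yd, md = interpolate_direct(x, h, 5)
    yp, mp = interpolate_polyphase(x, h, 5)
    print(yd == yp, md // len(x), mp // len(x))
```

Both routines produce the same sequence, while the multiply count per input sample drops from 75 to 15.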

169
INTERPOLATION FILTERS

170
INTERPOLATION FILTERS

171
DECIMATION FILTERS

• A decimation filter is used to decrease the sampling rate. The decrease in sampling rate can be achieved by simply dropping samples. For instance, if every other sample of a sampled sequence is dropped, the sampling rate of the resulting sequence will be half that of the original sequence.
• The problem with dropping samples is that the new sequence may violate the sampling theorem, which requires that the sampling frequency be greater than two times the highest frequency content of the signal.
• To circumvent the problem of violating the sampling theorem, the signal to
be decimated is first filtered using a low pass filter.

172
DECIMATION FILTERS
•The cutoff frequency of the filter is chosen so that it is less than half the final
sampling frequency.
• The filtered signal can be decimated by dropping samples. In fact, the samples
that are to be dropped need not be computed at all.
•Thus, the implementation of a decimator is just a FIR filter implementation in
which some of the outputs are not calculated.
• Figure shows a block diagram of a decimation filter. Digital decimation can be
implemented as depicted in Figure for an example of a decimation filter with
decimation factor of 3. It uses a low pass FIR filter with 5 taps. The computation
is similar to that of a FIR filter. However, after computing each output sample,
the signal array is delayed by three sample intervals by bringing the next three
samples into the circular buffer to replace the three oldest samples.
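The idea that a decimator is an FIR filter in which only some outputs are calculated can be sketched as follows. The example assumes a decimation factor of 3 and a 5-tap moving-average low-pass filter; the coefficients and function names are placeholders chosen for the illustration, not a designed filter.

```python
def fir_full(x, h):
    """Reference: compute every FIR output y(n) = sum over k of h(k) * x(n-k)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


def decimate(x, h, M):
    """Compute only every M-th FIR output; the skipped outputs are never evaluated."""
    y = []
    for n in range(0, len(x), M):         # advance by M input samples per output
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    h = [0.2] * 5                         # placeholder 5-tap low-pass filter
    x = [float(n) for n in range(12)]
    print(decimate(x, h, 3))
    print(decimate(x, h, 3) == fir_full(x, h)[::3])
```

The decimated result equals every third sample of the fully filtered sequence, but only a third of the multiply-accumulate work is done.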
173
DECIMATION FILTERS

Decimation process

174
DECIMATION FILTER

Implementation of decimation filter

175
INTRODUCTION TO FFT

• In a typical signal processing system, shown in fig 6.1, the signal is processed using the DSP in the DFT domain. After processing, the IDFT is taken to get the signal back in its original domain. Although a certain amount of time is required for the forward and inverse transforms, signal processing is carried out in the DFT domain because of the advantages of transformed-domain manipulation.

• The transformed domain manipulations are sometimes simpler. They are also
more useful and powerful than time domain manipulation. For example,
convolution in time domain requires one of the signals to be folded, shifted
and multiplied by another signal, cumulatively. Instead, when the signals to
be convolved are transformed to DFT domain, the two DFT are multiplied and
inverse transform is taken. Thus, it simplifies the process of convolution.
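The convolution example can be tried numerically. The sketch below uses a direct O(N^2) DFT for clarity rather than an FFT. Note that multiplying two N-point DFTs corresponds to circular convolution, so the sequences are zero-padded here to make the circular result equal the ordinary (linear) one. All function names are illustrative.

```python
import cmath


def dft(x):
    """Direct N-point DFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X):
    """Direct N-point inverse DFT."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def circular_convolve(x, h):
    """Time-domain circular convolution of two equal-length sequences."""
    N = len(x)
    return [sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)]


def convolve_via_dft(x, h):
    """Multiply the two DFTs and take the inverse transform."""
    X, H = dft(x), dft(h)
    return [v.real for v in idft([a * b for a, b in zip(X, H)])]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 0.0]              # zero-padded to length 4
    h = [1.0, 1.0, 0.0, 0.0]
    print(circular_convolve(x, h))
    print([round(v, 6) for v in convolve_via_dft(x, h)])
```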

176
INTRODUCTION

177
AN FFT ALGORITHM FOR DFT COMPUTATION

• As the DFT / IDFT are part of the signal processing system, there is a need for fast computation of the DFT / IDFT. Algorithms are available for this purpose. They are referred to as Fast Fourier Transform (FFT) algorithms.
• There are two FFT algorithms: Decimation-In-Time FFT (DITFFT) and Decimation-In-Frequency FFT (DIFFFT). The computational complexity of both algorithms is of the order of N log2(N), compared with N^2 for direct computation of the DFT. From the hardware / software implementation viewpoint, the algorithms have a similar structure throughout the computation. In-place computation is possible, reducing the number of memory locations required. The features of the FFT are tabulated in table 6.2.

178
AN FFT ALGORITHM FOR DFT COMPUTATION

179
EXAMPLE OF COMPUTATION OF 2 POINT DFT

•Consider an example of computation of 2 point DFT. The signal flow

graph of 2 point DITFFT Computation is shown in fig 6.2. The input /
output relations is as in eq (6.3).

180
BUTTERFLY STRUCTURE

•The Butterfly structure in general for DITFFT algorithm is shown in

fig. 6.3. The signal flow graph for N=8 point DITFFT is shown in fig.
4. The relation between input and output of any Butterfly structure
is shown in eq (6.4) and eq(6.5).

181
BUTTERFLY STRUCTURE
• Separating the real and imaginary parts, the four equations to be realized in
implementation of DITFFT Butterfly structure are as in eq(6.6).
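The four real equations of a butterfly can be sketched in Python. The sketch assumes the common DIT form A' = A + W·B and B' = A − W·B, with the twiddle factor W = WR + jWI; the slides' own eq (6.4) to (6.6) are not reproduced here and may use a different sign convention for WI. The function name `butterfly` is illustrative.

```python
def butterfly(ar, ai, br, bi, wr, wi):
    """DIT butterfly on real and imaginary parts.

    Assumed form: A' = A + W*B, B' = A - W*B, with W = wr + j*wi.
    Returns (A'_real, A'_imag, B'_real, B'_imag).
    """
    tr = br * wr - bi * wi      # real part of W*B
    ti = br * wi + bi * wr      # imaginary part of W*B
    return ar + tr, ai + ti, ar - tr, ai - ti


if __name__ == "__main__":
    # 2-point DFT of x = [1, 1] uses W = 1: X(0) = x(0) + x(1), X(1) = x(0) - x(1).
    print(butterfly(1, 0, 1, 0, 1, 0))
    # Cross-check against complex arithmetic with W = -j.
    a, b, w = 1 + 2j, 3 - 1j, -1j
    print(butterfly(1, 2, 3, -1, 0, -1), a + w * b, a - w * b)
```

One butterfly costs four real multiplies and six real additions in this form, which is why the product terms `tr` and `ti` are computed once and reused.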

182
SIGNAL FLOW GRAPH OF 8 POINT DITFFT COMPUTATION

183
SPECTRUM OF x(n)

• Observe that with N = 2^M, the number of stages in the signal flow graph is M, the number of complex multiplications is (N/2)log2(N) and the number of complex additions is N log2(N). The number of butterfly structures per stage is N/2. They are identical, and hence in-place computation is possible. Reusability of the hardware designed for implementing the butterfly structure is also possible.
184
SPECTRUM

185
PROBLEM
• Problem : What minimum size FFT must be used to compute a DFT of 40
points? What must be done to samples before the chosen FFT is applied?
What is the frequency resolution achieved?

• Solution: The minimum size FFT for a 40 point sequence is a 64 point FFT. The sequence is extended to 64 points by appending 24 zeros. The process improves the frequency resolution (bin spacing) from $f_s/40$ to $f_s/64$, where $f_s$ is the sampling frequency.
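The steps of this solution can be sketched as follows, assuming a radix-2 FFT, so that the size must be a power of two. The sampling frequency used in the example is a hypothetical value, and the function names are illustrative.

```python
def min_fft_size(n):
    """Smallest power of two that is greater than or equal to n."""
    size = 1
    while size < n:
        size *= 2
    return size


def zero_pad(x, size):
    """Append zeros so that the sequence has `size` samples."""
    return list(x) + [0.0] * (size - len(x))


def bin_spacing(fs, size):
    """Spacing between adjacent DFT bins for sampling frequency fs."""
    return fs / size


if __name__ == "__main__":
    x = [1.0] * 40
    size = min_fft_size(len(x))
    padded = zero_pad(x, size)
    print(size, len(padded) - len(x))       # FFT size and number of zeros appended
    print(bin_spacing(8000, size))          # assumes fs = 8000 Hz for illustration
```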

186
OVERFLOW AND SCALING

• In any processing system, the number of bits per data word is fixed, and it is limited by the DSP processor used. The limited number of bits leads to overflow, which results in erroneous answers. In Q15 notation, the range of numbers that can be represented is -1 to just under +1. If the value of a number exceeds these limits, there will be underflow / overflow. Data is scaled down to avoid overflow.

• However, it is an additional multiplication operation. Scaling operation is

simplified by selecting scaling factor of 2^-n. And scaling can be achieved by
right shifting data by n bits. Scaling factor is defined as the reciprocal of
maximum possible number in the operation. Multiply all the numbers at the
beginning of the operation by scaling factor so that the maximum number to
be processed is not more than 1. In the case of DITFFT computation
187
OVERFLOW AND SCALING - EXAMPLE

• To find the maximum possible value of the LHS term, differentiate and equate to zero.

• Thus the scaling factor is 1/2.414 = 0.414. So that scaling can be implemented by shifting, the nearest smaller power of two, 2^-2 = 0.25, is used: the data is shifted by 2 positions to the right. The symbolic representation of the butterfly structure is shown in fig. 6.8. The complete signal flow graph with scaling factor is shown in fig. 6.9.
188
OVERFLOW AND SCALING - EXAMPLE

189
BIT-REVERSED INDEX GENERATION

• As noted in table 6.2, the DITFFT algorithm requires its input in bit-reversed order. The input sequence can be arranged in bit-reversed order by a reverse carry add operation: add half of the DFT size (N/2) to the present bit-reversed index to get the next bit-reversed index, propagating the carry in reverse, i.e. from left to right, while adding the bits. The original index and the bit-reversed index for N = 8 are listed in table 6.3.
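The reverse carry add operation can be sketched in Python. The sketch assumes N is a power of two and models the carry moving from the most significant bit toward the least significant bit, with any carry out of the LSB discarded. The function names are illustrative.

```python
def reverse_carry_add(a, b, bits):
    """Add a and b, propagating the carry from the MSB toward the LSB."""
    result, carry = 0, 0
    for pos in range(bits - 1, -1, -1):      # start at the leftmost bit
        total = ((a >> pos) & 1) + ((b >> pos) & 1) + carry
        result |= (total & 1) << pos
        carry = total >> 1                   # carry moves one position to the right
    return result                            # carry out of the LSB is discarded


def bit_reversed_indices(N):
    """Generate the bit-reversed index sequence by repeatedly adding N/2."""
    bits = N.bit_length() - 1
    index, out = 0, []
    for _ in range(N):
        out.append(index)
        index = reverse_carry_add(index, N // 2, bits)
    return out


if __name__ == "__main__":
    print(bit_reversed_indices(8))
```

For N = 8 the generated sequence is 0, 4, 2, 6, 1, 5, 3, 7, which is the natural order 0 to 7 with each 3-bit index reversed.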

190

Ed Periodic Table
No ratings yet
Ed Periodic Table
1 page
Paper 199-Morse Code Translator Using The Arduino Platform
No ratings yet
Paper 199-Morse Code Translator Using The Arduino Platform
6 pages
Complexity Theory Between Place and Space
No ratings yet
Complexity Theory Between Place and Space
19 pages
Floating-Point To Fixed-Point Conversion For Audio
No ratings yet
Floating-Point To Fixed-Point Conversion For Audio
10 pages
Synthesis of Area Optimized 64 Bit Double Precision Floating Point Multiplier Using VHDL
No ratings yet
Synthesis of Area Optimized 64 Bit Double Precision Floating Point Multiplier Using VHDL
4 pages
Design and Implementation of Fast Floating Point Multiplier Unit
No ratings yet
Design and Implementation of Fast Floating Point Multiplier Unit
5 pages
10 MIPS Floating Point Arithmetic
No ratings yet
10 MIPS Floating Point Arithmetic
28 pages
IEEE 754 Floating Point Notes
No ratings yet
IEEE 754 Floating Point Notes
4 pages
Fix Point Implementation of Elementry Functions
No ratings yet
Fix Point Implementation of Elementry Functions
134 pages
Design of Double Precision IEEE-754 Floating-Point Units
100% (15)
Design of Double Precision IEEE-754 Floating-Point Units
73 pages
Cython Tutorial: Release 0.28.2
No ratings yet
Cython Tutorial: Release 0.28.2
81 pages
M68 HC 05
No ratings yet
M68 HC 05
332 pages
Asm
No ratings yet
Asm
156 pages
Factor K
No ratings yet
Factor K
10 pages
Introduction To Verilog HDL
No ratings yet
Introduction To Verilog HDL
38 pages
Power Amplifier Linearization Using Singular Value Decomposition Algorithm
No ratings yet
Power Amplifier Linearization Using Singular Value Decomposition Algorithm
4 pages
Micro Interfacing
No ratings yet
Micro Interfacing
15 pages
Instruction Manual: Digital Multimeter
No ratings yet
Instruction Manual: Digital Multimeter
269 pages
AS3842
No ratings yet
AS3842
10 pages
I2c Serial Protocol
100% (2)
I2c Serial Protocol
9 pages
I2c Slave
No ratings yet
I2c Slave
4 pages
Loop Gain Measurement
No ratings yet
Loop Gain Measurement
7 pages
Exploring The Best Indicators in TA-Lib - Technical Analysis of Stocks Using Python - Part 1 - by Himanshu Sharma - MLearning - Ai - Medium
No ratings yet
Exploring The Best Indicators in TA-Lib - Technical Analysis of Stocks Using Python - Part 1 - by Himanshu Sharma - MLearning - Ai - Medium
14 pages
An920 Rev2
No ratings yet
An920 Rev2
38 pages
Using The Mid-Range Enhanced Core PIC16 Devices' MSSP Module For Slave I C Communication
100% (2)
Using The Mid-Range Enhanced Core PIC16 Devices' MSSP Module For Slave I C Communication
14 pages
Basic Tutorials - Batteries For Solar Energy Systems
No ratings yet
Basic Tutorials - Batteries For Solar Energy Systems
3 pages
Ani C Bus Analyser To Let You Satisfy Your Curiosity: The Secrets of I C
No ratings yet
Ani C Bus Analyser To Let You Satisfy Your Curiosity: The Secrets of I C
7 pages
Introduction-to-JSON
No ratings yet
Introduction-to-JSON
1 page
07. Javascript Object Notation
No ratings yet
07. Javascript Object Notation
17 pages
Jap6 72 280-310 3BB-1
No ratings yet
Jap6 72 280-310 3BB-1
2 pages
Your First Code Using Mojo Programming Language
No ratings yet
Your First Code Using Mojo Programming Language
2 pages
Clem Engine Paper Presentation
No ratings yet
Clem Engine Paper Presentation
6 pages
Xyce Reference Guide
No ratings yet
Xyce Reference Guide
634 pages
Diodes Inc - Library - Components - List PDF
No ratings yet
Diodes Inc - Library - Components - List PDF
5 pages
Stream Gate
No ratings yet
Stream Gate
644 pages
Military HFGW Applications
100% (1)
Military HFGW Applications
45 pages
Desert Biome
No ratings yet
Desert Biome
5 pages
I2C
100% (3)
I2C
19 pages
DC-DC Converters Feedback and Control
100% (1)
DC-DC Converters Feedback and Control
220 pages
Servo Fundamentals
No ratings yet
Servo Fundamentals
12 pages
Implementation of PID Controllers On Motorola DSP PDF
No ratings yet
Implementation of PID Controllers On Motorola DSP PDF
84 pages