Introduction to Digital Signal Processors (DSPs)
Introduction
A digital signal processor is an integrated circuit that takes real-world signals, such as audio, video, and temperature, that have been digitized and then mathematically
manipulates them.
It is a specialized microprocessor with an architecture optimized for the operational
needs of digital signal processing.
Digital signal processors have the following characteristics:
1. Real-time digital signal processing capabilities.
2. High throughput.
3. Predictable, repeatable behaviour.
4. Re-programmability by software.
5. Cost effectiveness.
Evolution of Digital Signal Processors
DSPs appeared on the market in the early 1980s. DSPs are used in many applications, such as
communications and control, graphics, and speech and image processing. They are also found in talking
toys, robots, music synthesizers, spectrum analyzers, adaptive systems, and so on.
1. DSP Algorithms mold DSP Architectures: For nearly every feature found in DSPs, there are DSP
algorithms whose computation is eased by the inclusion of that feature.
2. Fast Multipliers: The multiply-accumulate (MAC) operation is the main component of filtering algorithms.
3. Multiple Execution Units: DSPs include several independent execution units; for example, in addition
to the MAC unit, they contain an arithmetic logic unit (ALU), an address generation unit, and a shifter.
(a) Registers: Registers hold intermediate and final results of multiply-accumulate and other arithmetic
operations.
(b) Multiplier: A single-cycle multiplier is present in all DSPs.
(c) ALU: The DSP's arithmetic logic unit implements basic arithmetic and logical operations in a single
instruction cycle.
(d) Shifters: A shifter is often found immediately following the multiplier and ALU. Some shifters shift by
only one bit to the left or right; such shifters can perform multibit shifts one bit at a time, but
this is time consuming.
A barrel shifter, in contrast, shifts by any number of bits in a single instruction cycle. The barrel shifter is especially useful in
the implementation of floating-point add and subtract operations.
Evolution of Digital Signal Processors contd...
4. Efficient Memory Accesses: For good DSP performance, fast and efficient data access from
memory is required, and the hardware implementations for this are:
(a) High-bandwidth memory architectures: To execute a MAC in every clock cycle, DSPs must be able
to fetch the MAC instruction, a data sample, and a filter coefficient from memory in a single
cycle. Hence, DSPs require high memory bandwidth.
(b) Specialized addressing modes: Addressing modes refer to the means by which the locations of
operands are specified. Most DSPs include one or more special address-generation units that are
dedicated to calculating operand addresses.
(c) Direct memory access: DMA is a technique whereby data can be transferred to or from the
processor’s memory without the involvement of the processor itself.
5. Data Format: Digital signal processing can be separated into two categories: fixed point and
floating point. These designations refer to the format used to store and manipulate numeric
representations of data.
Evolution of Digital Signal Processors contd...
6. Zero-Overhead Looping: DSP algorithms frequently involve the repetitive execution of a small
number of instructions, e.g., FIR filtering, which is performed by repeatedly executing the same
instruction or sequence of instructions.
Zero-overhead looping allows the programmer to implement such a loop without spending any clock
cycles on incrementing or decrementing the loop counter, testing whether the loop is finished, or
branching back to the top of the loop.
This can result in considerable savings (see the sketch after this list).
7. Streamlined I/O: Most DSPs provide a good selection of on-chip peripherals and peripheral
interfaces.
8. Specialized Instruction Set: The instruction set of a DSP processor is designed
with two goals in mind: (i) to make maximum use of the processor's hardware and so
increase its efficiency, and (ii) to minimize the amount of memory space required to store DSP
programs, since DSP applications are often quite cost-sensitive and the cost of memory contributes
substantially to overall system cost.
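To illustrate the zero-overhead looping point above (item 6), here is a minimal C sketch, with hypothetical names (scale, x, y, c, n), of a simple vector-scaling loop. On a general-purpose processor each iteration also spends cycles updating the counter, testing it, and branching back; a DSP's zero-overhead loop hardware performs that bookkeeping in dedicated logic, so each iteration costs only the useful arithmetic and memory accesses.

    /* Vector scaling: y[i] = c * x[i].
       On a conventional CPU each pass additionally costs the counter
       update (i++), the test (i < n), and the branch back to the top.
       Zero-overhead loop hardware handles the repeat count, test, and
       branch itself, so only the multiply and the loads/stores remain. */
    void scale(const float *x, float *y, float c, int n)
    {
        for (int i = 0; i < n; i++) {
            y[i] = c * x[i];
        }
    }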
Digital Signal Processor Architecture
Von Neumann Architecture: Traditional microprocessors use the Von Neumann architecture,
named after the Hungarian-American mathematician John von Neumann (1903-1957). It has the
following features.
• Von Neumann architecture is shown in Fig. 1. It contains a single memory and a single bus for
transferring data into and out of the central processing unit. Both program instructions and data
are stored in the single memory. In the simplest case, the processor can make one access
(either a read or a write) to memory during each instruction cycle.
• The Von Neumann design is satisfactory when all of the required tasks are executed serially.
• For very fast processing, other architectures are needed, at the price of increased complexity.
Fig. 1. Von Neumann architecture: a single memory and a single bus shared by program instructions and data.
Digital Signal Processor Architecture contd...
Harvard Architecture: Harvard Architecture is named after the work done at Harvard University
in the 1940s under the leadership of Howard Aiken (1900-1973). It has the following features.
• Harvard Architecture is shown in Fig. 2. It contains two independent memories; one memory
holds program instructions and the other holds data.
• The processor is connected to two independent memories via two independent sets of buses.
Since the buses operate independently, program instructions and data can be fetched at the
same time, which improves the speed over the single-bus design.
• Two memory accesses can be made during one instruction cycle; thus, execution can be faster.
Fig. 2. Harvard architecture: separate program and data memories with independent buses.
Digital Signal Processor Architecture contd...
Super Harvard Architecture (SHARC): This term was coined by Analog Devices.
• As shown in Fig. 3, an instruction cache is added in the Harvard architecture to improve the
throughput. The instruction cache is used to store instructions that will be reused, such as the
instructions inside a repeated loop. This arrangement leaves both buses (program and data)
free for fetching operands.
• This extension (Harvard architecture plus cache) is called Super Harvard Architecture (SHARC).
Fig. 3. Super Harvard Architecture: Harvard architecture with an added instruction cache and data address generators.
Digital Signal Processor Hardware Units
1. Multiplier and Accumulator (MAC) Unit: A typical MAC is shown in Fig. 4. The majority
of DSP applications require array multiplication, e.g., convolution and correlation. Array
multiplication can be performed using a single multiplier and adder.
Fig. 4. A typical multiplier and accumulator (MAC) unit.
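To make the MAC operation concrete, the following C sketch shows an FIR inner product, the kind of kernel the MAC unit in Fig. 4 accelerates; the names (fir, coeff, sample, NTAPS) are illustrative and not taken from any particular DSP library.

    #define NTAPS 16    /* illustrative filter length */

    /* FIR filter output for one output sample:
       acc += coeff[k] * sample[k] is exactly the multiply-accumulate
       that a DSP's MAC unit performs, one tap per cycle. */
    float fir(const float coeff[NTAPS], const float sample[NTAPS])
    {
        float acc = 0.0f;                 /* accumulator */
        for (int k = 0; k < NTAPS; k++) {
            acc += coeff[k] * sample[k];  /* one MAC per coefficient */
        }
        return acc;
    }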
Digital Signal Processor Hardware Units contd...
2. Barrel Shifter: The shift registers of a conventional microprocessor require one clock
cycle for each shift. In DSP applications, several shifts are required in a single
execution cycle. A barrel shifter shifts data by several bits in one clock cycle. A barrel
shifter connects the input lines representing a word to a group of output lines, with the
required shift determined by its control inputs.
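The effect of a barrel shifter can be sketched in C as follows; the function name and 32-bit word size are assumptions for illustration. In C the shift is a single expression, and the barrel shifter evaluates it in one clock cycle for any shift amount, whereas a one-bit shifter would need one cycle per bit shifted.

    #include <stdint.h>

    /* Logical left shift of a 32-bit word by an arbitrary amount.
       A barrel shifter produces this result in a single clock cycle
       for any amount 0..31; a one-bit shifter would need 'amount'
       separate cycles. */
    uint32_t barrel_shift_left(uint32_t word, unsigned amount)
    {
        return (amount < 32) ? (word << amount) : 0;
    }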
Digital Signal Processor Hardware Units contd...
3. Address Generators:
• In DSPs, two data address generators (DAGs) are used, as shown at the
top of Fig. 3. The PM DAG is used for the program memory and the DM
DAG for the data memory.
• These two data address generators control the addresses sent to the
program and data memories.
• They specify the address where the data is to be read from or written to.
• In conventional microprocessors this task is handled by the program
sequencer.
• The DAGs also generate bit-reversed addresses (used to carry out the Fast
Fourier transform efficiently) and circular-buffer addresses (used for filter delay lines), as sketched below.
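A minimal C sketch of the two address patterns mentioned above; the buffer length, bit width, and function names are assumptions for illustration. In software each address update costs extra instructions, whereas a DAG produces the next address in hardware alongside the data access.

    #define BUF_LEN 8   /* illustrative circular buffer length */

    /* Circular (modulo) addressing: advance an index and wrap it at
       the end of the buffer, as used for filter delay lines. */
    unsigned next_circular(unsigned index)
    {
        return (index + 1) % BUF_LEN;
    }

    /* Bit-reversed addressing for an N-point FFT (N a power of two):
       reverse the low 'bits' bits of the index. */
    unsigned bit_reverse(unsigned index, unsigned bits)
    {
        unsigned rev = 0;
        for (unsigned b = 0; b < bits; b++) {
            rev = (rev << 1) | ((index >> b) & 1u);
        }
        return rev;
    }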
Fixed-Point Digital Signal Processor
1. Fixed-Point Digital Signal Processor: Most high-volume, embedded
applications use fixed-point DSPs because the priority is low cost. A fixed-point
processor has the following advantages.
• Fixed-point DSPs are used in a greater number of high-volume applications
than floating-point DSPs, and are therefore typically less expensive than
floating-point DSPs due to the scale of manufacturing. System-on-a-chip (SoC)
variables, including on-board memory, integrated application-specific
peripherals, and connectivity options, can also affect the cost and functionality
of both fixed-point and floating-point processors.
• They are smaller in size.
• They consume less power.
Floating-Point Digital Signal Processor
Floating-point processors are costlier than fixed-point processors. They are easier to program because the programmer
does not have to be concerned about dynamic range and precision. Their cost increases
because of the more complex circuitry. A floating-point processor has the following advantages.
• Floating-point DSPs have a much larger dynamic range.
• Floating-point processing yields much greater precision than fixed-point processing.
• It is generally easier to develop algorithms for floating-point DSPs, as fixed-point
algorithms require greater manipulation to compensate for quantization noise.
• They have a higher signal-to-noise ratio (about 30 million to one, compared with ten
thousand to one for a fixed-point DSP).
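The dynamic-range and quantization points above can be seen in a small C comparison of a fixed-point multiply with its floating-point equivalent. The Q15 format (a signed 16-bit fraction) is a common fixed-point choice but is used here only as an illustrative assumption; the fixed-point version needs explicit rescaling and saturation, while the floating-point version does not.

    #include <stdint.h>

    /* Q15: 16-bit signed fraction in [-1, 1). The product of two Q15
       values is a Q30 value, so shift right by 15 to return to Q15
       and saturate to avoid overflow (e.g., -1 * -1 exceeds Q15). */
    int16_t q15_mul(int16_t a, int16_t b)
    {
        int32_t product = (int32_t)a * (int32_t)b;  /* Q30 result */
        int32_t q15 = product >> 15;                /* rescale to Q15 */
        if (q15 > INT16_MAX) q15 = INT16_MAX;       /* saturate high */
        if (q15 < INT16_MIN) q15 = INT16_MIN;       /* saturate low  */
        return (int16_t)q15;
    }

    /* The floating-point version needs no scaling or saturation. */
    float float_mul(float a, float b)
    {
        return a * b;
    }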
Pipelining
• Pipelining overlaps the fetch, decode, and execute phases of successive instructions, so that
while one instruction is being executed, the next is being decoded and a third is being fetched.
(Pipeline diagram: successive instructions overlapping across the pipeline stages.)
• Most DSPs are pipelined; the depth (number of stages) of the pipeline may vary
from one processor to another. In general, a deeper pipeline allows the
processor to execute faster but makes the processor harder to program. Pipeline
conflicts arise when different instructions need the same resource within the
same cycle.
Memory Access schemes in DSPs
1. Multiple Access Memory:
• The multiple access memory allows more than one access in a single clock cycle.
• Dual-access RAM (DARAM) allows two memory accesses in a single clock cycle.
The DARAM is connected to the DSP processor by two independent address buses and two
independent data buses. This gives four memory accesses in a single clock period.
• The Harvard architecture allows multiple-access memories to be interfaced to DSP
processors.
2. Multiport memory:
• The multiport memory has the facility of interfacing multiple address and data buses.
• With the help of dual port memory, the program and data can be stored in a single
memory chip and they can be accessed simultaneously.
• Multiport memories require more pins and a larger chip area, which makes
them more expensive and larger in size.
Very Long Instruction Word (VLIW) Architecture
• The VLIW architecture provides many execution units, each of which executes its own instruction.
• VLIW architecture executes multiple instructions in parallel. To execute multiple instructions in
parallel, VLIW processors must have sufficient decoders, buses, registers, and memory
bandwidth.
• VLIW processors use wide buses to access data memory and keep the multiple execution units
fed with data.
• VLIW processors consume high energy.
• VLIW processors have mainly targeted applications which have very demanding computational
requirements but are not very sensitive to cost or energy efficiency.
• The VLIW architecture includes a multiported register file, which is used for fetching the
operands and storing the results.
• The functional units access the multiported register file through a read/write crossbar.
• The program control unit issues the instructions that execute as independent parallel
operations.
Addressing Modes
The addressing mode specifies the rule by which the locations of operands are determined for instructions. The
addressing modes tell us how the address part of the instruction is used to compute the effective address.
The effective address is defined to be the memory address obtained from the computation dictated by the
given addressing mode. DSPs have the following addressing modes:
1. Implied Addressing: Implied addressing means that the operand addresses are implied by the
instructions; there is no choice of operand locations.
2. Immediate Addressing: With immediate addressing, the operand itself (as opposed to the location
where the operand is stored) is encoded in the instruction word or in a separate word that follows the
instruction word.
3. Memory-Direct Addressing: In this mode, the effective address is equal to the address part of the
instruction. The operand resides in memory and its address is given by the address field of the instruction.
4. Register-Direct Addressing: With register-direct addressing, the data being addressed reside in a
register. The programmer specifies the register as part of the instruction.
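As an informal illustration only (how operands are actually encoded depends on the target processor and compiler), the four modes above can be loosely related to ordinary C statements; the variable names below are hypothetical.

    int table[16];            /* an array residing in data memory */

    void example(void)
    {
        int r = 0;            /* a value typically held in a register    */
        r = r + 5;            /* 5 is an immediate operand, encoded in
                                 the instruction itself (immediate)       */
        r = table[3];         /* operand fetched from a fixed memory
                                 address (memory-direct)                  */
        r = r + r;            /* both operands already in registers
                                 (register-direct)                        */
        (void)r;              /* implied addressing has no operand field
                                 at all, e.g., an instruction that always
                                 acts on the accumulator                  */
    }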
Addressing Modes contd...