Introduction to Digital Signal Processors (DSPs)_student

Uploaded by

harshucares
Introduction to Digital Signal Processors (DSPs)
Introduction
A digital signal processor is an integrated circuit that takes real-world signals
such as audio, video, and temperature that have been digitized and then mathematically
manipulates them.
It is a specialized microprocessor with an architecture optimized for the operational
needs of digital signal processing.
Digital signal processors have the following characteristics:
1. Real-time digital signal processing capabilities.
2. High throughput.
3. Predictable, repeatable behaviour.
4. Re-programmability by software.
5. Cost effective.
Evolution of Digital Signal Processors
DSPs appeared on the market in the early 1980s. DSPs are being used in several applications such as
communications and controls, graphics, and speech and image processing. They are also used in talking
toys, robots, music synthesizers, spectrum analyzers, adaptive systems and so on.
1. DSP Algorithms mold DSP Architectures: For nearly every feature found in DSPs, there are DSP
algorithms whose computation is eased by inclusion of this feature.
2. Fast Multipliers: Multiplication and accumulation (MAC) is the main component of filter algorithms.
DSPs perform a MAC operation in a single instruction cycle.
The MAC is useful in computing vector products, such as those in convolution, correlation, and Fourier
transforms.
All modern DSPs include at least one dedicated multiply-accumulate (MAC) unit.
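As a rough illustration of why the MAC matters, the sketch below computes a vector product and one FIR filter output as a chain of multiply-accumulate steps. The function names `mac_dot` and `fir_output` are hypothetical, chosen for this example. Python only models the arithmetic here; it says nothing about cycle counts on a real DSP.

```python
def mac_dot(samples, coeffs):
    """Vector product computed as repeated multiply-accumulate steps."""
    acc = 0  # plays the role of the accumulator register
    for x, h in zip(samples, coeffs):
        acc += x * h  # one MAC per pair of operands
    return acc


def fir_output(x, h, n):
    """FIR output y[n] = sum over k of h[k] * x[n - k].

    Samples before the start of x are treated as zero (an assumption).
    """
    acc = 0
    for k, coeff in enumerate(h):
        if 0 <= n - k < len(x):
            acc += coeff * x[n - k]  # one MAC per filter tap
    return acc


print(mac_dot([1, 2, 3], [4, 5, 6]))
print(fir_output([1, 2, 3, 4], [1, 1], 2))
```

Each pass through either loop corresponds to the single-cycle MAC operation described above.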
Evolution of Digital Signal Processors contd...

3. Multiple Execution Units: DSPs include several independent execution units. For example, in addition
to the MAC unit, they contain an arithmetic logic unit (ALU), an address generation unit and a shifter.
(a) Registers: Registers hold intermediate and final results of multiply-accumulate and other arithmetic
operations.
(b) Multiplier: A single-cycle multiplier is present in all DSPs.
(c) ALU: The DSP's arithmetic logic unit implements basic arithmetic and logical operations in a single
instruction cycle.
(d) Shifters: A shifter is often found immediately following the multiplier and ALU. Some shifters shift by
only one bit to the left or to the right. Such shifters can perform multibit shifts one bit at a time, but
this can be time consuming.
A barrel shifter shifts by any number of bits in a single instruction cycle. A barrel shifter is especially useful in
the implementation of floating-point add and subtract operations.
Evolution of Digital Signal Processors contd...
4. Efficient Memory Accesses: Good DSP performance requires fast and efficient data access from
memory. The hardware features that provide this are:
(a) High-bandwidth memory architectures: To execute a MAC in every clock cycle, DSPs must have
the ability to fetch the MAC instruction, a data sample, and a filter coefficient from memory in a single
cycle. Hence, DSPs require high memory bandwidth.
(b) Specialized addressing modes: Addressing modes refer to the means by which the locations of
operands are specified. Most DSPs include one or more special address generation units that are
dedicated to calculating addresses.
(c) Direct memory access: DMA is a technique whereby data can be transferred to or from the
processor’s memory without the involvement of the processor itself.
5. Data Format: Digital signal processing can be separated into two categories: fixed point and
floating point. These designations refer to the format used to store and manipulate numeric
representations of data.
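To make the fixed-point format concrete, here is a minimal sketch of the Q15 convention, one common 16-bit fixed-point format in which an integer q stands for the value q / 32768. Q15 is an assumption of this example, since the text does not name a format, and the helper names are hypothetical.

```python
Q15_SCALE = 1 << 15          # 32768
Q15_MAX = Q15_SCALE - 1      # 32767
Q15_MIN = -Q15_SCALE         # -32768


def float_to_q15(value):
    """Convert a real number to a 16-bit Q15 integer, saturating at the limits."""
    scaled = round(value * Q15_SCALE)
    return max(Q15_MIN, min(Q15_MAX, scaled))


def q15_to_float(q):
    """Convert a Q15 integer back to a real number."""
    return q / Q15_SCALE


def q15_mul(a, b):
    """Multiply two Q15 numbers.

    The raw product is in Q30 format, so it is shifted right by 15 bits
    and saturated to stay inside the 16-bit range.
    """
    product = (a * b) >> 15
    return max(Q15_MIN, min(Q15_MAX, product))


half = float_to_q15(0.5)
print(half, q15_to_float(q15_mul(half, half)))
```

A floating-point DSP would store 0.5 directly as a sign, exponent and mantissa, so the explicit scaling and saturation steps above would not be needed.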
Evolution of Digital Signal Processors contd...
6. Zero-Overhead Looping: DSP algorithms frequently involve the repetitive execution of a small
number of instructions, e.g., FIR filtering. They are performed by repeatedly executing the same
instruction or sequence of instructions.
Zero-overhead looping allows the programmer to implement a loop without expending any clock cycles
on incrementing or decrementing the loop counter, checking whether the loop is finished, or branching
back to the top of the loop.
For loops with short bodies, this can result in considerable savings.
7. Streamlined I/O: Most DSPs provide a good selection of on-chip peripherals and peripheral
interfaces.
8. Specialized Instruction Set: The instruction set of a DSP processor is designed
with two goals in mind: (i) to make maximum use of the processor’s hardware, which
increases its efficiency, and (ii) to minimize the amount of memory space required to store DSP
programs, since DSP applications are often quite cost-sensitive and the cost of memory contributes
substantially to overall system cost.
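Returning to zero-overhead looping (item 6), the saving can be estimated with simple arithmetic. The sketch below is a toy cycle model. The figure of 3 overhead cycles per iteration for a software loop is an assumed value for illustration, not a property of any particular processor.

```python
def loop_cycles(iterations, body_cycles, overhead_cycles):
    """Total cycles for a loop: every iteration pays for the body plus loop overhead."""
    return iterations * (body_cycles + overhead_cycles)


TAPS = 64  # hypothetical FIR length, one single-cycle MAC per tap

# Assumed: counter update, test and branch cost 3 cycles per iteration in software.
software_loop = loop_cycles(TAPS, body_cycles=1, overhead_cycles=3)

# Zero-overhead looping: the hardware handles counting and branching for free.
hardware_loop = loop_cycles(TAPS, body_cycles=1, overhead_cycles=0)

print(software_loop, hardware_loop)
```

Under these assumptions the loop overhead dominates the run time, which is why the saving is largest when the loop body is only one or two instructions long.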
Digital Signal Processor Architecture
Von Neumann Architecture: Traditional microprocessors use the Von Neumann architecture,
named after the Hungarian-American mathematician John von Neumann (1903-1957). It has the
following features.
• Von Neumann architecture is shown in Fig. 1. It contains a single memory and a single bus for
transferring data into and out of the central processing unit. Both program instructions and data
are stored in the single memory. In the simplest case, the processor can make one access
(either a read or a write) to memory during each instruction cycle.
• The Von Neumann design is satisfactory when one wants to execute all of the required tasks
in serial.
• One has to pay the price of increased complexity when other architectures are needed for very
fast processing.

Fig. 1. Von Neumann architecture.
Digital Signal Processor Architecture contd...

Harvard Architecture: The Harvard architecture is named after the work done at Harvard University
in the 1940s under the leadership of Howard Aiken (1900-1973). It has the following features.
• Harvard architecture is shown in Fig. 2. It contains two independent memories; one memory
holds program instructions and the other holds data.
• The processor is connected to the two memories via two independent sets of buses.
Since the buses operate independently, program instructions and data can be fetched at the
same time, which improves the speed over the single-bus design.
• Two memory accesses can be made during one instruction cycle, so execution can be faster.

Fig. 2. Harvard architecture.
Digital Signal Processor Architecture contd...
Super Harvard Architecture (SHARC): This term was coined by Analog Devices.
• As shown in Fig. 3, an instruction cache is added to the Harvard architecture to improve the
throughput. The instruction cache is used to store instructions that will be reused, such as the
instructions inside a repeated loop. This arrangement leaves both buses (program and data)
free for fetching operands.
• This extension (Harvard architecture plus cache) is called Super Harvard Architecture (SHARC).

Fig. 3. Super Harvard architecture.
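The three architectures can be compared with an idealized count of memory accesses per MAC. The sketch assumes, as stated earlier, that one MAC needs three fetches (the instruction, a data sample and a coefficient), and that nothing else competes for the buses. The function name `cycles_per_mac` is hypothetical.

```python
import math


def cycles_per_mac(accesses_per_cycle, instruction_cached=False):
    """Idealized cycles needed to fetch everything for one MAC.

    A MAC needs an instruction, a data sample and a coefficient. If the
    instruction comes from a cache, only the two operands use the buses.
    """
    accesses_needed = 2 if instruction_cached else 3
    return math.ceil(accesses_needed / accesses_per_cycle)


von_neumann = cycles_per_mac(accesses_per_cycle=1)
harvard = cycles_per_mac(accesses_per_cycle=2)
super_harvard = cycles_per_mac(accesses_per_cycle=2, instruction_cached=True)

print(von_neumann, harvard, super_harvard)
```

In this simplified model only the cached case sustains one MAC per cycle, which matches the motivation given above for adding the instruction cache.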
Digital Signal Processor Hardware Units
1. Multiplier and Accumulator (MAC) Unit: A typical MAC is shown in Fig. 4. The majority
of DSP applications require array multiplication, e.g., convolution and correlation. Array
multiplication can be performed by using a single multiplier and adder.

Fig. 4. Multiplier and accumulator (MAC) unit.
Digital Signal Processor Hardware Units contd...
2. Barrel Shifter: The shift registers of a conventional microprocessor require one clock
cycle for each one-bit shift. In DSP applications, shifts by several bits are often required in a single
execution cycle. A barrel shifter shifts data by several bits in one clock cycle. A barrel
shifter connects the input lines representing a word to a group of output lines, with the
required shift determined by its control inputs.
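One common way to build a barrel shifter is as a cascade of multiplexer stages that shift by 1, 2, 4 and 8 bits, each stage enabled by one bit of the shift amount. The sketch below models that structure for a 16-bit word and contrasts it with shifting one bit at a time. The 16-bit width and the function names are assumptions of this example. In hardware the stages are combinational, so the Python loop models the data path, not clock cycles.

```python
WORD_BITS = 16
MASK = (1 << WORD_BITS) - 1


def barrel_shift_left(word, amount):
    """Left shift by 0..15 bits using four fixed stages (1, 2, 4, 8).

    Bit k of `amount` is the control input that enables the stage shifting by 2**k.
    """
    result = word & MASK
    for stage in range(4):
        if (amount >> stage) & 1:
            result = (result << (1 << stage)) & MASK
    return result


def serial_shift_left(word, amount):
    """Left shift one bit at a time; returns the result and the number of steps used."""
    result = word & MASK
    steps = 0
    for _ in range(amount):
        result = (result << 1) & MASK
        steps += 1
    return result, steps


print(hex(barrel_shift_left(0x00FF, 12)))
print(serial_shift_left(0x0001, 5))
```

The serial version needs as many steps as the shift amount, while the staged version always passes through the same four stages regardless of the amount.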
Digital Signal Processor Hardware Units contd...
3. Address Generators:
• In DSPs, two Data Address Generators (DAGs) are used, as shown at the
top of Fig. 3. The PM DAG is used for the program memory and the DM
DAG is used for the data memory.
• These two data address generators control the addresses sent to the
program and data memories.
• They specify the address where the data is to be read from or written to.
• In conventional microprocessors this task is handled by the program
sequencer.
• The DAGs can generate bit-reversed addresses into the circular
buffers, which allows the Fast Fourier transform to be carried out efficiently.
Fixed-Point Digital Signal Processor
1. Fixed-Point Digital Signal Processor: Most high-volume, embedded
applications use fixed-point DSPs because the priority is low cost. A fixed-point
processor has the following advantages.
• Fixed-point DSPs are used in a greater number of high-volume applications
than floating-point DSPs, and are therefore typically less expensive than
floating-point DSPs due to the scale of manufacturing. System-on-a-chip (SoC)
variables, including on-board memory, integrated application-specific
peripherals, and connectivity options, can also affect the cost and functionality
of both fixed-point and floating-point processors.
• They are smaller in size.
• They consume less power.
Floating-Point Digital Signal Processor

1. Floating-Point Digital Signal Processor: Floating-point processors are generally
costlier than fixed-point processors because of their more complex circuitry. They are easier to
program because the programmer has to be far less concerned about dynamic range and precision.
A floating-point processor has the following advantages.
• Floating-point DSPs have a much larger dynamic range.
• Floating-point processing yields much greater precision than fixed-point processing.
• It is generally easier to develop algorithms for floating-point DSPs, as fixed-point
algorithms require greater manipulation to compensate for quantization noise.
• They have a high signal-to-noise ratio (roughly 30 million to one, compared with ten thousand to
one for a fixed-point DSP).
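The difference in dynamic range can be seen by storing a small value in each format and reading it back. The sketch assumes 16-bit Q15 fixed point and 32-bit IEEE 754 floating point, which are common choices but are not specified in the text. The helper names are hypothetical.

```python
import struct


def through_q15(value):
    """Quantize to 16-bit Q15 fixed point (steps of 1/32768) and convert back."""
    q = max(-32768, min(32767, round(value * 32768)))
    return q / 32768


def through_float32(value):
    """Round to 32-bit IEEE 754 floating point and convert back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


small = 1e-5  # smaller than half a Q15 step, so fixed point cannot represent it

print(through_q15(small))      # the value is lost entirely
print(through_float32(small))  # the value survives with a tiny relative error
```

In fixed point the programmer would have to rescale such a signal before processing it, which is the extra manipulation mentioned in the list above.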
Pipelining

❑ Pipelining is a technique used to increase the performance of a processor by
breaking an instruction into different phases of operation and executing several phases from
different instructions in parallel.
❑ This approach decreases the overall time required to complete a set of operations.
❑ The four phases of operation are as follows.
• Fetch (F) an instruction from the program memory.
• Decode (D) the instruction, that is, determine what the instruction is supposed to do.
• Read (R) a data operand from or write a data operand to memory.
• Execute (E) the instruction.
Pipelining contd...
• Pipelining in DSPs allows different functional units to work simultaneously in the
same clock cycle by overlapping different phases from different instructions.
• Figure 6 shows the execution of four instructions and the pipeline associated
with these instructions.
[Figure 6: Instructions 1 to 4, each passing through the F, D, R and E phases, with the phases of successive instructions overlapped.]
Once the pipeline is full, it completes one instruction per cycle

• Most DSPs are pipelined, but the depth (number of stages) of the pipeline may vary
from one processor to another. In general, a deeper pipeline allows the
processor to execute faster but makes the processor harder to program. A
pipeline conflict arises when different instructions need the same resource within the
same cycle.
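The overlap of the four phases can be drawn as a timing chart. The sketch below builds one for an ideal pipeline, assuming one cycle per phase and no conflicts or stalls. The function names are hypothetical.

```python
PHASES = ["F", "D", "R", "E"]  # fetch, decode, read, execute


def pipeline_chart(num_instructions, phases=PHASES):
    """Return one row per instruction; each row lists the phase active in every cycle."""
    width = num_instructions + len(phases) - 1
    rows = []
    for i in range(num_instructions):
        cells = ["-"] * width
        for offset, name in enumerate(phases):
            cells[i + offset] = name  # instruction i starts one cycle after instruction i-1
        rows.append(cells)
    return rows


def pipelined_cycles(num_instructions, depth=len(PHASES)):
    """Cycles to finish all instructions in an ideal pipeline of the given depth."""
    return num_instructions + depth - 1


for number, row in enumerate(pipeline_chart(4), start=1):
    print(f"INSTRUCTION {number}: {' '.join(row)}")
print(pipelined_cycles(4), "cycles pipelined,", 4 * len(PHASES), "cycles without overlap")
```

From the fourth cycle onward one instruction reaches the E phase in every cycle, which is the steady state described above.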
Memory Access schemes in DSPs
1. Multiple Access Memory:
• Multiple access memory allows more than one access in a single clock cycle.
• Dual-access RAM (DARAM) allows two memory accesses in a single clock cycle.
The DARAM is connected to the DSP processor through two independent address buses and two
independent data buses. This gives four memory accesses in a single clock period.
• The Harvard architecture allows multiple access memories to be interfaced to DSP
processors.
2. Multiport memory:
• The multiport memory has the facility of interfacing multiple address and data buses.
• With the help of dual port memory, the program and data can be stored in a single
memory chip and they can be accessed simultaneously.
• Multiport memories have more pins and a larger chip area, which makes
them more expensive and physically larger.
Very Long Instruction Word (VLIW) Architecture

• The VLIW architecture provides many execution units, each of which executes its own instruction.
• A VLIW processor executes multiple instructions in parallel. To do so, it must have
sufficient decoders, buses, registers, and memory
bandwidth.
• VLIW processors use wide buses to access data memory and keep the multiple execution units
fed with data.
• VLIW processors have high energy consumption.
• VLIW processors have mainly targeted applications which have very demanding computational
requirements but are not very sensitive to cost or energy efficiency.
• The VLIW architecture includes a multiported register file. This file is used for fetching the
operands and storing the results.
• The functional units access the multiported register file through a read/write
crossbar.
• The program control unit provides the control instructions that execute independent parallel
operations.
Addressing Modes
An addressing mode is a rule that specifies how the locations of an instruction's operands are determined. The
addressing mode tells us how the address part of the instruction is used to compute the effective address.
The effective address is defined to be the memory address obtained from the computation dictated by the
given addressing mode. DSPs have the following addressing modes:
1. Implied Addressing: Implied addressing means that the operand addresses are implied by the
instructions; there is no choice of operand locations.
2. Immediate Addressing: With immediate addressing, the operand itself (as opposed to the location
where the operand is stored) is encoded in the instruction word or in a separate word that follows the
instruction word.
3. Memory-Direct Addressing: In this mode, the effective address is equal to the address part of the
instruction. The operand resides in memory and its address is given by the address field of the instruction.
4. Register-Direct Addressing: With register-direct addressing, the data being addressed reside in a
register. The programmer specifies the register as part of the instruction.
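As a rough sketch of how the operand is located in the immediate, memory-direct and register-direct modes, the toy model below interprets an instruction's operand field according to its mode. The memory contents, register names and the `fetch_operand` function are all hypothetical, invented for illustration.

```python
# Hypothetical machine state, invented for illustration only.
data_memory = {0x10: 7, 0x11: 9}   # address -> stored value
registers = {"R0": 42, "R1": 5}    # register name -> contents


def fetch_operand(mode, field):
    """Return the operand selected by an addressing mode and the instruction's operand field."""
    if mode == "immediate":
        return field                # the operand itself is encoded in the instruction
    if mode == "memory_direct":
        return data_memory[field]   # the field is the effective address in memory
    if mode == "register_direct":
        return registers[field]     # the field names the register holding the operand
    raise ValueError(f"unsupported addressing mode: {mode}")


print(fetch_operand("immediate", 0x10))
print(fetch_operand("memory_direct", 0x10))
print(fetch_operand("register_direct", "R0"))
```

The same field value 0x10 yields different operands in the first two calls, because only the addressing mode decides whether it is the data or the address of the data.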
Addressing Modes contd...

5. Register-Indirect Addressing: With register-indirect addressing, the data being addressed
reside in memory, and the address of the memory location containing the data is held in a
register.
6. Bit-Reversed Addressing: An addressing mode in which the order of the bits used to form a
memory address is reversed. Bit-reversed addressing eliminates the need for a separate bit-
reversing procedure in an FFT implementation. This simplifies reading the output from radix-2
FFT algorithms, which produce their results in a scrambled order.
7. Circular Addressing: Circular addressing is the most useful and sophisticated addressing
mode. In this mode, specified buffers in memory are accessed sequentially with a pointer that
automatically wraps around to the beginning of the buffer when the last location is accessed.
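Bit-reversed and circular addressing can both be sketched in a few lines. In the example below, `bit_reversed_order` lists the addresses a bit-reversed mode would generate for a power-of-two buffer, and `CircularBuffer` models a pointer that wraps around automatically. On a DSP the address generation hardware does this work; the modulo operation and the class and function names here are assumptions of this software model.

```python
def bit_reverse(index, num_bits):
    """Reverse the lowest `num_bits` bits of `index`."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Addresses 0..n-1 in bit-reversed order; n is assumed to be a power of two."""
    num_bits = n.bit_length() - 1
    return [bit_reverse(i, num_bits) for i in range(n)]


class CircularBuffer:
    """Fixed-size buffer whose write pointer wraps to the start after the last location."""

    def __init__(self, size):
        self.data = [0] * size
        self.pointer = 0  # plays the role of the address register

    def write(self, sample):
        self.data[self.pointer] = sample
        self.pointer = (self.pointer + 1) % len(self.data)  # automatic wrap-around


print(bit_reversed_order(8))

buffer = CircularBuffer(3)
for sample in [1, 2, 3, 4]:
    buffer.write(sample)
print(buffer.data, buffer.pointer)
```

The fourth write lands back on the first location, so the buffer always holds the most recent samples, which is the usual arrangement for the delay line of an FIR filter.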
The TMS320 Family
The combination of the TMS320’s high degree of parallelism and its specialized DSP
instruction set provides the speed and flexibility to produce a CMOS microprocessor family
capable of executing more than 50 million floating-point operations per second (MFLOPS).
❑ The TMS320 family includes several generations of programmable processors with
several devices in each generation.
❑ The TMS320 family consists of fixed-point, floating-point and multiprocessor DSPs.
❑ In the TMS320 family, the multiprocessor DSPs are the TMS320 C8X; the floating-point DSPs are the
TMS320 C3X, C4X and C6X (the C67X devices; the C62X and C64X devices are fixed-point); the fixed-point
DSPs are the TMS320 C1X, C2X, C2XX, C5X and C54X.
❑ The characteristics that are common for TMS320 family of DSPs are as follows.
• High speed performance.
• Flexibility in internal operations.
• Flexibility in instruction set.
• Cost effectiveness.
• Innovative parallel architecture.
Interfacing
1. External Memory Interfacing: The digital signal processor provides limited
on-chip memory for program and data storage. Large program code and
data are stored in external memory. The external memory interfaces with the
digital signal processor through the DMA controller and the program-memory
and data-memory controllers.
2. Serial-Port Interfacing: In DSP applications, the processor handles
multiple sources of data from other devices. The serial port transfers one bit at
a time. It is used for sending and receiving data between the processor and
the ADC, DAC, and CODEC. With the help of the DMA controller, the
digital signal processor can receive and transmit serial data in real time
between memory and I/O ports without interrupting the processor.
Interfacing contd...
3. Parallel-Port Interfacing: The digital signal processor can receive and
transmit multiple data bits at a time with the help of a parallel port. The
parallel port can transfer more data at a faster rate than the serial port.
A parallel port requires more pins for transferring multiple bits, as well as
handshake lines for synchronization.
4. Host-Port Interfacing: A host port is a bidirectional port that is used to
interface the host processor with the digital signal processor. The host
processor may be a general-purpose microprocessor, a micro-controller, or
another digital signal processor. Host-Port Interfacing is used to
communicate with other processors that have different bus standards.
