3-Embedded Software Development-10-08-2023

The document discusses finite word length effects in digital signal processing. It covers several topics: 1) Digital signal processing classifications based on sampling rate, digitization, and number representation schemes like fixed point and floating point. 2) Errors that can occur due to finite word lengths like arithmetic errors, overflow, and saturation. 3) Implications of finite word lengths on DSP computations and the need to select appropriate processor speeds and data types. 4) Formats for representing numbers like unsigned/signed fixed point, floating point, and the IEEE 754 standard. Examples of number conversions are provided.

Uploaded by

Dhruv Gupta
Finite Word Length Effects

K Selvakumar

Assistant Professor
Department of Instrumentation and Control Systems Engineering
PSG College of Technology

September 12, 2015 DSP Workshop 1


Overview

• Digital Signal Processing (Application classification based on sampling rate)


• Digitization
• Fixed point and floating point (IEEE 754) number representation
• Finite word length effects
• Arithmetic errors, overflow and saturation
• Implications on computation



Motivation



Why Digital Processing?

• Immunity to noise
→ Analog signals may take any value within a particular range, so
noise can easily alter their magnitude
→ Digital signals take binary values; changing a 1 to a 0 (or vice versa)
requires a noise voltage of large magnitude
• Reprogrammable capability
→ In the analog case, rewiring or resoldering is required
• No performance drift in the field



Analog to Digital Conversion

• An ADC takes an analog voltage as its input and produces a digital number
representing that voltage as its output
• The output of the ADC is a stream of sampled, fixed word length values
• The number of samples per second is determined by the ADC's clock
• To calculate the input voltage from the output code D:

$$V_{in} = D \times \frac{V_{FullRange}}{2^{N} - 1} \qquad (1)$$

where N is the number of bits used in the ADC output
• The resolution of these samples is limited by the output data word width of the
ADC (Example: 8-bit ADC, 10-bit ADC)
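The conversion from output code to input voltage can be sketched in a few lines of Python. This is a minimal illustration that assumes the input voltage is the output code multiplied by the step size V_FullRange / (2^N - 1); the function name and the 8-bit, 5 V example values are hypothetical.

```python
def adc_code_to_voltage(code: int, v_full_range: float, n_bits: int) -> float:
    """Convert an unsigned ADC output code to volts (assumed: code * step size)."""
    max_code = (1 << n_bits) - 1          # 2^N - 1, the largest output code
    if not 0 <= code <= max_code:
        raise ValueError("code out of range for this ADC")
    return code * v_full_range / max_code


# Hypothetical 8-bit ADC with a 0 to 5 V input range
lsb = adc_code_to_voltage(1, 5.0, 8)      # voltage step per code (resolution)
full = adc_code_to_voltage(255, 5.0, 8)   # largest code maps to full range
print(f"1 LSB = {lsb * 1000:.2f} mV, full scale = {full:.1f} V")
```

The step size for one code is also the resolution of the converter, so a wider output word directly gives a finer voltage step.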



• Q1: A continuous video voltage signal is to be converted into a discrete signal. The
range of the signal after amplification is 0 to 5 V. The ADC has an 8-bit capacity.
Determine the number of quantization levels, the resolution, and the quantization error.
Indicate how the voltage signal is represented in binary form.
• Q2: What if the range of the signal is -5 V to +5 V and a 4-bit ADC is used?



ADC: Unipolar and Bipolar output Coding
• Unsigned binary encoding is used to represent unipolar output
• 2's complement binary encoding is typically used to represent bipolar output

Unipolar input (V)    12-bit ADC output
>= Vref               4095
Vref/2                2048
0                     0

Bipolar input (V)     12-bit ADC output
>= +Vref              2047 (0x7FF)
+Vref/2               1024 (0x400)
0                     0
-Vref/2               -1024 (0xFC00)
<= -Vref              -2048 (0xF800)

(Hex values for the negative codes are shown sign-extended to 16 bits.)
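As a rough sketch of how software might read the bipolar codes in the table, the Python below interprets the low 12 bits of a stored word as a two's complement value. The function names, the 2.5 V reference, and the assumption that code -2048 corresponds to -Vref are illustrative choices, not taken from a specific ADC data sheet.

```python
def decode_bipolar_12bit(raw: int) -> int:
    """Interpret the low 12 bits of raw as a two's complement value."""
    raw &= 0xFFF                                  # keep only the 12 ADC bits
    return raw - 0x1000 if raw & 0x800 else raw   # bit 11 is the sign bit


def code_to_volts(code: int, v_ref: float) -> float:
    """Map a signed 12-bit code to volts, assuming -2048 corresponds to -Vref."""
    return code * v_ref / 2048


# Stored words from the table; negative codes are sign-extended to 16 bits
for raw in (0x7FF, 0x400, 0x000, 0xFC00, 0xF800):
    code = decode_bipolar_12bit(raw)
    print(f"0x{raw:04X} -> {code:6d} -> {code_to_volts(code, 2.5):+.4f} V")
```

Masking to 12 bits first means the same routine works whether the word was stored sign-extended (0xFC00) or not (0xC00).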

Sampling Frequency

• How often must the analog signal be measured to represent it accurately?
• The sample rate is the number of samples taken per second
• Sampling period: T = 1/fs, where fs is the sampling frequency
• Q3: If the sampling frequency of a speech signal is 8 kHz, what is the sampling interval?



Influence in Processor Selection

• At fs = 8 kHz, 125 µs are available to perform all the necessary processing
before the next sample arrives
• Samples arrive on a continuous basis and the processing cannot fall behind
• This is a common constraint for any real-time system
• It helps to determine the processor speed needed to keep up with the sampling rate
Number of instructions per sample = T / (instruction cycle time)
• Q4: If a 100 MHz processor executes one instruction per cycle, how many
instructions can be executed in 125 µs?
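The per-sample instruction budget is simple arithmetic, sketched below in Python. The function name and the default of one instruction per cycle are assumptions made for illustration; the 8 kHz and 100 MHz figures are the ones used on the slides.

```python
def instructions_per_sample(sample_rate_hz: float, clock_hz: float,
                            instructions_per_cycle: float = 1.0) -> int:
    """Instructions that fit into one sampling period T = 1 / fs."""
    # T / cycle_time = (1 / fs) / (1 / f_clk) = f_clk / fs
    return int(clock_hz * instructions_per_cycle // sample_rate_hz)


budget = instructions_per_sample(sample_rate_hz=8_000, clock_hz=100e6)
print(f"Instruction budget per sample: {budget}")
```

Everything the application does for one sample (filtering, I/O, bookkeeping) has to fit inside this budget, otherwise samples are lost.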



How to choose sampling frequency for an analog signal?

• If the signal is not sampled often enough, the samples will not be
representative of the true signal, which causes aliasing
• Oversampling leads to a high cost in memory, computation, and power consumption
• The Nyquist criterion sets a lower bound for the sampling rate (at least twice
the highest frequency in the signal)
• An anti-aliasing analog filter helps to achieve this!
Cut-off frequency = Nyquist frequency = fs/2
• The sampling frequency helps to select the ADC's clock
• Q5: What is the sampling rate of a music signal?



Digital representation of a signal

• How to store x(n) in the memory?



Data types

• int: The integer data type is used to represent positive and negative whole
numbers, here in 16 bits (the size of int is platform dependent)
• char: Generally used to store ASCII values in 8 bits
• float: Used to store positive and negative real numbers in 32 bits
• double: Stores real numbers using 64 bits
• Q6: How much memory is required to store 10 seconds of ECG data using the float
and double data types?



Fixed point number Systems

• A number system refers to the format used to store and manipulate the numeric
representation of data
• Fixed point numbers are represented with a fixed number of digits after the
decimal point. Examples: 123.45, 1234.56, 12345.67
• In fixed point representation the gap between adjacent numbers is constant
• In base 2, the binary point is the equivalent of the decimal point and it separates
the integer and fraction parts
• Common 16-bit fixed point ranges are
{-1 to 1; 0 to 1; 0 to 65,535; -32,768 to 32,767}
• Q7: How is signed -1 stored in memory?
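One way to see how the same 16 bits can cover either -32,768 to 32,767 or -1 to 1 is to compare the integer view with a fractional view. The sketch below assumes the common Q15 convention (1 sign bit, 15 fraction bits) for the fractional range; the slides do not name a specific format, and the function names are hypothetical.

```python
def float_to_q15(x: float) -> int:
    """Encode x in [-1, 1) as a 16-bit two's complement Q15 bit pattern."""
    scaled = int(round(x * 32768))              # 2^15 steps per unit
    scaled = max(-32768, min(32767, scaled))    # saturate to the 16-bit range
    return scaled & 0xFFFF                      # raw bit pattern as stored


def q15_to_float(word: int) -> float:
    """Decode a 16-bit Q15 bit pattern back to a real value."""
    signed = word - 0x10000 if word & 0x8000 else word
    return signed / 32768


print(hex(float_to_q15(-1.0)))   # -1 in the fractional (Q15) view
print(hex(-1 & 0xFFFF))          # -1 in the 16-bit integer view
```

The hardware stores the same kind of two's complement word in both cases; only the position of the binary point assumed by the programmer differs.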



Floating point number Systems

• In floating point representation, the placement of the decimal point can float.
Examples: 1.23467, 123456.7, 0.00001234567, 1234567000000000
• The gaps between adjacent numbers are not uniformly spaced
• It provides a much larger dynamic range and greater precision
• In single precision, the gap between two consecutive numbers is approximately
1 part in 10 million of the value
• Fixed point microprocessors (e.g. the original 8086, which has no built-in
floating point unit) are designed to understand only integers
• Q8: How are real numbers stored in memory?



IEEE-754 Floating point format

• Numbers represented in scientific notation are normalized so that there is only a
single non-zero digit to the left of the decimal point. Example: $5.321 \times 10^{6}$
• Floating point representation uses a similar technique, but in base 2
• The format for 32-bit numbers is called single precision and the format for
64-bit numbers is called double precision
• There are 4294967296 patterns for any 32-bit format and 18446744073709551616
patterns for the 64-bit format.



32-bit floating point format

• Bits 0 through 22 form the mantissa, bits 23 through 30 form the exponent, and
bit 31 is the sign bit
• The 24-bit mantissa (23 stored bits plus the implied leading 1) is used for
precision, while the exponent extends the dynamic range
• Using these bits, the floating point number N is formed by

$$N = (-1)^{S} \times 1.M \times 2^{e-127} \qquad (2)$$

• The exponent is biased by 127 to get the range $2^{-127}$ to $2^{128}$
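The bit layout and equation (2) can be checked with Python's standard `struct` module, which packs a value into IEEE 754 single precision. This is a minimal sketch: the helper names are hypothetical, and the reconstruction formula only covers normalized numbers (stored exponent 1 to 254).

```python
import struct


def float32_fields(x: float):
    """Split x, rounded to single precision, into (sign, exponent, mantissa) fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                  # bit 31
    exponent = (bits >> 23) & 0xFF     # bits 23 to 30, still biased by 127
    mantissa = bits & 0x7FFFFF         # bits 0 to 22, implied leading 1 not stored
    return sign, exponent, mantissa


def fields_to_value(sign: int, exponent: int, mantissa: int) -> float:
    """Equation (2): N = (-1)^S * 1.M * 2^(e - 127), normalized numbers only."""
    return (-1) ** sign * (1 + mantissa / 2**23) * 2.0 ** (exponent - 127)


s, e, m = float32_fields(-5.625)
print(s, f"{e:08b}", f"{m:023b}")
print(fields_to_value(s, e, m))
```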



Example 1

• Consider the real number 329.390625 for single precision floating point
representation
• Obtain the equivalent binary value and then normalize it as given below:
→ 101001001.011001
→ $1.01001001011001 \times 2^{8}$
• The sign is positive, so s = 0
• The exponent is 135 (8 + 127), so e = 10000111
• The mantissa field m is 01001001011001 (the leading 1 is not included since it is
implied)
• The 32-bit representation is
0 10000111 010 0100 1011 0010 0000 0000
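The hand conversion above can be mirrored step by step in Python and compared with what the standard `struct` module produces. The function below is a sketch for non-zero values that are exactly representable and normalized, which is true for 329.390625; it truncates instead of rounding, and its name is hypothetical.

```python
import struct


def encode_float32_by_hand(x: float) -> str:
    """Return 'sign exponent mantissa' bit strings following the manual steps."""
    if x == 0:
        raise ValueError("zero has no normalized form")
    sign = 0 if x > 0 else 1
    x = abs(x)
    exponent = 0
    while x >= 2.0:                      # normalize to 1.M x 2^exponent
        x /= 2.0
        exponent += 1
    while x < 1.0:
        x *= 2.0
        exponent -= 1
    mantissa = int((x - 1.0) * 2**23)    # drop the implied 1, keep 23 fraction bits
    return f"{sign} {exponent + 127:08b} {mantissa:023b}"


def encode_float32_with_struct(x: float) -> str:
    """Same fields, taken from the bytes produced by struct.pack."""
    s = f"{struct.unpack('>I', struct.pack('>f', x))[0]:032b}"
    return f"{s[0]} {s[1:9]} {s[9:]}"


print(encode_float32_by_hand(329.390625))
print(encode_float32_with_struct(329.390625))
```

Both lines print the same pattern as the slide, which corresponds to the hexadecimal word 0x43A4B200.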



Example 2

• Consider the following 32-bit single precision pattern
1 1011 0110 011 0000 0000 0000 0000 0000
• The equivalent decimal value is
→ $(-1)^{1} \times 2^{10110110_2 - 01111111_2} \times 1.011_2$
→ $-1.375 \times 2^{55}$
→ −49539595901075456.0
→ $-4.9539595901076 \times 10^{16}$
• Q9: Convert the 32-bit single precision value 0xC0B40000 into a decimal value.
(Ans: -5.625)



Spacing in single precision representation
00110111010011110000001000000000 → 0.00001233862713
00110111010011110000001000000001 → 0.00001233862804

spacing = 0.00000000000091

01000100000111110000001000000010 → 636.0313110
01000100000111110000001000000011 → 636.0313720

spacing = 0.0000610

01001101010011110000001000000000 → 217063424.0
01001101010011110000001000000001 → 217063440.0

spacing = 16.0

• Q10: Can we represent the decimal number 217063426.0 in float?
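The spacings listed above can be reproduced by reinterpreting consecutive 32-bit patterns as single precision values. The sketch below uses only the standard `struct` module; the helper names are hypothetical.

```python
import struct


def bits_to_float32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def to_float32(x: float) -> float:
    """Round x to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def spacing_above(bits: int) -> float:
    """Gap between the value of a pattern and the value of the next pattern."""
    return bits_to_float32(bits + 1) - bits_to_float32(bits)


for pattern in (0x374F0200, 0x441F0202, 0x4D4F0200):
    print(f"{bits_to_float32(pattern):.14g}  spacing = {spacing_above(pattern):.3g}")

# A value that falls between two patterns is rounded to the nearest one
print(to_float32(217063426.0))
```

The three spacings are exact powers of two (2^-40, 2^-14 and 2^4), which is why the gap grows with the magnitude of the number.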



Range of float

• Exponents 00000000 and 11111111 are reserved
• Smallest (normalized) value:
Exponent: e → 00000001 → 1 − 127 → −126
Mantissa: 00000000000000000000000 → 1.0
Value: $\pm 1.0 \times 2^{-126} \approx 1.2 \times 10^{-38}$
• Largest value:
Exponent: e → 11111110 → 254 − 127 → +127
Mantissa: 11111111111111111111111 → ≈ 2.0
Value: $\pm 2.0 \times 2^{+127} \approx 3.4 \times 10^{+38}$
• Approximately 6 decimal digits of precision



Range of double

• In double precision format, 11 bits are used for the exponent and 52 bits are
used for the mantissa
• Smallest (normalized) value: $\pm 1.0 \times 2^{-1022} \approx 2.2 \times 10^{-308}$
• Largest value: $\pm 2.0 \times 2^{+1023} \approx 1.8 \times 10^{+308}$
• Approximately 16 decimal digits of precision
• Mainly used to maintain accuracy over many iterative calculations and when
manipulating very large and very small values
• In general a real number may have infinite information content, but that cannot
be stored and processed in a computer



Arithmetic computation error

• In general, multiplication, addition, and shift operations are performed on a
sequence of n-bit values
• In some cases, the result would require more than n bits
→ Overflow occurs when a result is larger than the highest positive value or
smaller than the lowest negative value that fits in the processor's word length
• Overflow is indicated in the carry and overflow flags of the microprocessor's
status register
• It is up to the programmer to know which flag to check after the math is done


Overflow Illustration for 8-bit addition

• The OF flag indicates that the signed result exceeds +127 or is less than -128, CF
indicates that the unsigned result exceeds 255, and SF indicates that the result is
negative (its most significant bit is set).
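The behaviour of the three flags can be imitated in software. The sketch below models an 8-bit adder in Python and derives CF, OF and SF the way a typical status register does; it is an illustration of the flag definitions, not a model of any particular processor, and the function name is hypothetical.

```python
def add8_with_flags(a: int, b: int):
    """Add two 8-bit patterns; return (result, CF, OF, SF)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF                             # only 8 bits are kept
    cf = total > 0xFF                                 # unsigned carry out of bit 7
    of = ((a ^ result) & (b ^ result) & 0x80) != 0    # both operands differ in sign from the result
    sf = (result & 0x80) != 0                         # most significant bit of the result
    return result, cf, of, sf


for a, b in ((0x7F, 0x01), (0xFF, 0x01), (0x80, 0x80), (0x10, 0x20)):
    result, cf, of, sf = add8_with_flags(a, b)
    print(f"0x{a:02X} + 0x{b:02X} = 0x{result:02X}  CF={cf} OF={of} SF={sf}")
```

The first case (127 + 1) sets OF but not CF, while the second (255 + 1 unsigned, or -1 + 1 signed) sets CF but not OF, which is why the programmer must know which interpretation applies.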

Saturation

• Overflow results in a 'wrap-around' phenomenon:
0x7fffffff (r1) + 1 (r2) = 0x80000000 (r0). This leads to incorrect results and
compromises the program's reliability
• Algorithm designers have to be careful not to exceed the maximum representable
value in an n-bit integer
• To minimize this effect, the result has to be saturated at the maximum value
• Some processor architectures support automatic saturation in hardware
• On ARM processors, the following instruction saturates the result as shown:
QADD r0, r1, r2 → r0 = 0x7fffffff
• Q11: What can be done if the processor does not have hardware saturation support?
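Without hardware support, saturation is usually done in software by checking the result against the limits of the word length. The Python sketch below contrasts a wrapping 32-bit addition with a saturating one; the function names are hypothetical. Python integers do not overflow, so the wrap-around is emulated by masking. In C the same idea is commonly implemented with a wider intermediate type or with sign checks before the addition.

```python
INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000


def wrap_add32(a: int, b: int) -> int:
    """Emulate a 32-bit two's complement addition that wraps around on overflow."""
    s = (a + b) & 0xFFFFFFFF
    return s - 0x100000000 if s & 0x80000000 else s


def saturating_add32(a: int, b: int) -> int:
    """Clamp the exact sum to the 32-bit signed range instead of wrapping."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


print(hex(wrap_add32(INT32_MAX, 1) & 0xFFFFFFFF))   # wrapped result
print(hex(saturating_add32(INT32_MAX, 1)))          # saturated result
```

Saturation still gives a wrong answer, but the error is bounded and has the correct sign, which is usually far less damaging in signal processing than a jump from the largest positive to the most negative value.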



Finite word length effects

• Analog to digital conversion noise


• Quantization error of arithmetic computations from truncation and rounding
→ When a result has to be stored in the native data word length of the processor,
an error is introduced
• Storage of real numbers
• Q12: In the given array, find the index of the maximum number using the float and
double data types.
[217063424, 217063425, 217063426]
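The effect of storing this array in single precision can be tried out in Python, whose built-in float is a 64-bit double. The sketch below rounds each element to single precision with the standard `struct` module before searching for the maximum; the helper names are hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round x to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def index_of_max(seq) -> int:
    """Index of the first strictly largest element."""
    best = 0
    for i in range(1, len(seq)):
        if seq[i] > seq[best]:
            best = i
    return best


values = [217063424, 217063425, 217063426]
as_double = [float(v) for v in values]
as_float = [to_float32(float(v)) for v in values]

print("double:", index_of_max(as_double), as_double)
print("float: ", index_of_max(as_float), as_float)
```

In double precision the three values stay distinct and the last index is returned. In single precision the spacing near this magnitude is 16, so all three elements round to 217063424.0 and the search returns the first index.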



Illustration 1

• The round-off error from each of the arithmetic operations causes the result value
to drift away from the desired value
• If the error is predominantly of the same sign, the value of the variable drifts
away much more rapidly

    c = 1;                      % ideally c stays exactly 1
    for i = 1:10000
        a = rand(1);            % two random values in (0, 1)
        b = rand(1);
        c = c + a; c = c + b;   % add both values ...
        c = c - a; c = c - b;   % ... then subtract them again
    end

• Q13: Plot the variable c for each iteration and observe the results.
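A rough Python counterpart of this experiment is sketched below. It assumes single precision arithmetic, emulated by rounding every intermediate result to 32 bits with the standard `struct` module, and it records c after each iteration so the history can be plotted. The function names and the fixed random seed are illustrative choices.

```python
import random
import struct


def f32(x: float) -> float:
    """Round x to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def drift(iterations: int = 10000, seed: int = 0):
    """Add and then subtract two random numbers; return c after every iteration."""
    rng = random.Random(seed)
    c = f32(1.0)
    history = []
    for _ in range(iterations):
        a = f32(rng.random())
        b = f32(rng.random())
        c = f32(c + a)          # every operation rounds to single precision
        c = f32(c + b)
        c = f32(c - a)
        c = f32(c - b)
        history.append(c)
    return history


history = drift()
worst = max(abs(v - 1.0) for v in history)
print(f"final c = {history[-1]:.9f}, worst deviation from 1 = {worst:.3e}")
```

Mathematically c never changes, so any deviation from 1 in the recorded history is accumulated round-off error.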



Illustration 2

    a = [10000.0, 1.0, 10000.0];
    b = [10000.0, 1.0, -10000.0];
    sum = 0;                        % running dot product of a and b
    for i = 1:3
        sum = sum + (a(i) * b(i))   % no semicolon: print sum at each iteration
    end

• Q14: Analyze the sum variable at each iteration for the float and double data types.
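The running sum can be compared for both data types in Python, where the built-in float is a double and single precision is emulated by rounding each intermediate result to 32 bits with the standard `struct` module. The helper names below are hypothetical.

```python
import struct


def f32(x: float) -> float:
    """Round x to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def running_dot(a, b, rnd):
    """Accumulate a[i] * b[i], rounding with rnd after every operation."""
    total = rnd(0.0)
    partial = []
    for x, y in zip(a, b):
        total = rnd(total + rnd(x * y))
        partial.append(total)
    return partial


a = [10000.0, 1.0, 10000.0]
b = [10000.0, 1.0, -10000.0]

print("double:", running_dot(a, b, float))
print("float: ", running_dot(a, b, f32))
```

The exact answer is 1. In double precision the intermediate value 100000001 is stored exactly, so the final sum is 1.0. In single precision the spacing near 10^8 is 8, the added 1 is lost in the second step, and the final sum is 0.0.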



Implications on computation

• Integers are less useful in many applications


• In fixed point processors, real numbers are manipulated via floating point
emulation, i.e. in software
• Floating point emulation is costly in computation, memory, and power
consumption
• Fixed point DSP: The processor architecture is based on representing and operating
on numbers in integer format. Example: TMS320C6416 32-bit processor
• Floating point DSP: The processor architecture is based on representing and
operating on numbers in floating point format. Example: TMS320C6713 32-bit
processor



Processor Selection

• Q15: A typical fixed-point C54x Digital Signal Processor can execute a single
instruction in 10 ns. A user-written L-tap FIR filter can execute in ≈ 38 + L
instruction cycles per input sample.
1. What is the maximum bandwidth of the signal that can be filtered with an FIR
filter of order L = 255?
2. If a speech signal is sampled at 8 kHz, what is the highest FIR filter order that
may be used in real-time stream processing?
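The budget calculation behind this question can be sketched in Python. The constants come from the problem statement (10 ns per instruction, 38 + L cycles per sample); the function names are hypothetical, and taking the bandwidth as half the maximum sampling rate is the usual Nyquist assumption.

```python
CYCLE_TIME_NS = 10     # one instruction every 10 ns, from the problem statement
OVERHEAD_CYCLES = 38   # fixed cost per input sample, from the problem statement


def max_sample_rate_hz(taps: int) -> float:
    """Highest sampling rate at which 38 + L cycles still fit into one period."""
    cycles = OVERHEAD_CYCLES + taps
    return 1e9 / (cycles * CYCLE_TIME_NS)


def max_taps(sample_rate_hz: float) -> int:
    """Largest L whose 38 + L cycles fit into one sampling period."""
    cycles_available = int(1e9 // (sample_rate_hz * CYCLE_TIME_NS))
    return cycles_available - OVERHEAD_CYCLES


fs_max = max_sample_rate_hz(255)
print(f"L = 255: fs_max = {fs_max / 1e3:.1f} kHz, bandwidth = {fs_max / 2e3:.1f} kHz")
print(f"fs = 8 kHz: highest L = {max_taps(8_000)}")
```

This ignores any other work the processor has to do per sample, so in practice the usable filter length is lower.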



Concluding remarks

• Although 64-bit machines are available today, finite word length effects have to be
seriously considered for resource-constrained embedded systems
• In the small microprocessors used in embedded systems, manual fixed point
arithmetic is used
• Awareness of finite word length effects is very useful in ADC, processor, and data
type selection
• It is a first step towards any real-time signal processing system design



Thank you
