Unit 3

Uploaded by

Mohammed Imran I 6010

UNIT III EMBEDDED PROGRAMMING

Components for embedded programs - Models of programs - Assembly, linking and loading - Compilation techniques - Program level performance analysis - Software performance optimization - Program level energy and power analysis and optimization - Analysis and optimization of program size - Program validation and testing.

3.1. INTRODUCTION

The creation of embedded programs is at the heart of embedded system design. The microprocessor is an important element of the embedded computing system, but it cannot do its job without memories and I/O devices. We need to understand how to interconnect microprocessors and devices using the CPU bus. Embedded code must not only provide rich functionality; it must also run at a required rate to meet system deadlines, fit into the allowed amount of memory, and meet power consumption requirements. A complete design requires analysis of each of these aspects, which the following topics in this unit cover.

3.2. BASIC COMPUTING PLATFORMS

The platform provides the environment in which we develop our embedded application. It has both hardware and software components. Both are important, because one without the other is generally not very useful.

3.2.1. PLATFORM HARDWARE COMPONENTS

The CPU and memory are the most important hardware components of a computer system, but a practical computer needs additional components. The figure below shows the hardware architecture of a typical computing platform.

Fig. Hardware architecture of a typical computing platform

A typical computing platform includes several major hardware components:

CPU - provides basic computational facilities
RAM - used for program and data storage
ROM - holds the boot program and some permanent data
DMA controller - provides direct memory access capabilities
Timers - used by the operating system for a variety of purposes
High-speed bus - connected to the CPU bus through a bridge; allows fast devices to communicate efficiently with the rest of the system
Low-speed bus - provides an inexpensive way to connect simpler devices and may be necessary for backward compatibility as well

Buses

The bus provides a common connection between all the components in the computer: the CPU, memories, and I/O devices. The bus transmits address, data and control information so that one device on the bus can read or write another device. Some devices can transfer data over the bus without executing instructions on the CPU.

Single Chip Platforms

We can put all the above components on a single chip. A single chip platform makes the design of systems much easier.

Microcontroller - a single chip that includes a CPU, memory and I/O devices

3.2.2. PLATFORM SOFTWARE COMPONENTS

Hardware and software are inseparable; each needs the other to perform its function. Much of the software in an embedded system will come from outside sources, and some software components may come from third parties. Hardware vendors generally provide a basic set of software platform components for their hardware. These components range across many layers of abstraction, as shown in the figure below.

Hardware Abstraction Layer (HAL) - provides a basic level of abstraction from the hardware
Device drivers - use the HAL to simplify their structure
Power management module - must have low-level access to the hardware
Operating system and file system - provide the basic abstractions required to build complex applications

Fig. 2.2 Software layer diagram for an embedded system

3.3. COMPONENTS FOR EMBEDDED PROGRAMS

There are many kinds of embedded programs. The three most important structures or components are:

1. State machine
2. Circular buffer
3. Queue

Among the three components, state machines are well suited to reactive systems such as user interfaces. Circular buffers and queues are useful in digital signal processing.
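As a first illustration of the state machine listed above, the sketch below codes a small reactive system as a switch on the current state. The push-button lamp, the state names and the function `lamp_step` are hypothetical, chosen only for illustration; they do not come from these notes.

```c
#include <stdbool.h>

/* Hypothetical reactive system: a push-button that toggles a lamp.
   The states and the input are illustrative assumptions. */
typedef enum { LAMP_OFF, LAMP_ON } lamp_state_t;

/* One reaction step: the next state depends only on the
   current state and the input received. */
lamp_state_t lamp_step(lamp_state_t state, bool button_pressed)
{
    switch (state) {
    case LAMP_OFF:
        return button_pressed ? LAMP_ON : LAMP_OFF;
    case LAMP_ON:
        return button_pressed ? LAMP_OFF : LAMP_ON;
    }
    return LAMP_OFF; /* not reached for valid states */
}
```

A caller would keep the state in a variable and invoke `lamp_step` once per input event, which is why this style suits reactive code such as user interfaces.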
3.3.1. STATE MACHINE

When inputs are given to any kind of system, the reaction of most systems can be characterized in terms of the input received and the current state of the system. This leads to a technique known as the finite state machine style, which describes the behavior of a reactive system. The state machine style of programming is also an efficient implementation of such a design.

3.3.2. CIRCULAR BUFFERS AND STREAM-ORIENTED PROGRAMMING

The circular buffer is a data structure that handles streaming data efficiently. The figure below shows how a circular buffer stores a subset of the data stream. At each point in time, the algorithm needs a window into the stream: a subset of the data stream that changes over time. When a new value arrives, the oldest value is the one that needs to be replaced. When the pointer reaches the end of the buffer, it wraps around to the top.

Many digital signal processors provide addressing modes to support circular buffers. For example, the C55x provides registers that allow circular buffers to be placed without alignment constraints. In the absence of such hardware support, a circular buffer can be written in C.

Fig. A circular buffer

Signal flow graph

A signal flow graph is one way of representing many types of digital filters.

Fig. A signal flow graph for a filter

The filter operates at a sample rate, with inputs arriving and outputs generated at that rate. The inputs x[n] and outputs y[n] are sequences indexed by n, which corresponds to the sequence of samples. Nodes in the graph can be either arithmetic operators or delay operators. The + node adds its two inputs and produces the output y[n]. The box labeled z^-1 is a delay operator: the z notation comes from the z-transform, and the -1 superscript means that the operation performs a time delay of one sample period. The edge from the delay operator to the addition operator is labeled b1, which means that the output of the delay operator is multiplied by b1.

The filter takes in a new sample on every sample period. The new input becomes x1, the old x1 becomes x2, and so on. x0 is stored directly in the circular buffer but must be multiplied by b0 before being added to the output sum.

3.3.3. QUEUES AND PRODUCER/CONSUMER SYSTEMS

Queues are also used in signal processing and event processing. Queues are used whenever data may arrive and depart at somewhat unpredictable times or when variable amounts of data may arrive.
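A minimal sketch of such a queue, assuming a fixed-capacity array of integers whose indices wrap around in the same way as the circular buffer of Section 3.3.2. The names `queue_t`, `queue_put` and `queue_get` and the capacity of 8 are assumptions made for illustration, not definitions from these notes.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8   /* assumed size, chosen for illustration */

typedef struct {
    int    data[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest element */
    size_t count;  /* number of elements currently stored */
} queue_t;

void queue_init(queue_t *q)
{
    q->head = 0;
    q->count = 0;
}

/* Producer side: returns false when the queue is full. */
bool queue_put(queue_t *q, int value)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY; /* wraps around */
    q->data[tail] = value;
    q->count++;
    return true;
}

/* Consumer side: returns false when the queue is empty. */
bool queue_get(queue_t *q, int *value)
{
    if (q->count == 0)
        return false;
    *value = q->data[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

The boolean return values let the producer and consumer detect the full and empty conditions, so either side can pause until the other catches up.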
A queue is also called an elastic buffer. Queues can be built in two ways: with a linked list or with an array. Building a queue with a linked list allows the queue to grow to an arbitrary size. Designing the queue with an array uses a single array to hold all the data.

Fig. A simple producer/consumer system

The figure above shows a simple producer/consumer system. P1 and P2 are the two blocks that perform algorithmic processing. The data is fed to them by queues that act as elastic buffers. The queues modify the flow of control in the system as well as store data. For example, if P2 runs ahead of P1, it will eventually run out of data in its input queue. At that point, the queue will return an empty signal to P2, and P2 should stop working until more data is available. This kind of control is easier to implement in a multitasking environment, but it is also possible to make effective use of queues in programs structured as nested procedures.

Data structures in queues

The queues in a producer/consumer system may hold either uniform-sized data elements or variable-sized data elements. In some cases, the consumer needs to know how many of a given type of data element are associated together. The queue can be structured to hold a complex data type. Alternatively, the data structure can be stored as bytes or integers in the queue.

3.4. MODELS OF PROGRAMS

We have different types of source code, such as assembly language, C code and so on. We must use a single model to describe all of them. With a single model, we can perform many useful analyses on the model more easily than on the code. This single model is known as the Control/Data Flow Graph (CDFG).

The control/data flow graph is the fundamental model for programs. The CDFG has constructs that model both data operations and control operations. To understand the CDFG clearly, we must first understand how data operations are described.
To understnd i CDFG in clear format frst we must understand data descriptions "| MF ffs OF NODES IN DATAROW ctapy 542M etwo pes of sin he gph ah ag ound nodes Fepresen Ope 0) 1 ue des represent vals ©) asa ay be er np © Be jetliner a nale-assignment code is shown in Si Mafow raph fo ove ingle-s8ian hon in below figure 328 “st 2.4.1, DATA FLOW GRAPHS 'A daa flow graph is « model ofa program with no conditions. tn # highly programming lenguage,& code segment with no condition have nly On ety ay = joes |] ig, 3.26 A Basie ck nC tn the above codex is having two assigament and it appears tie cn he Ie] side of an mlgnment. So we need to rewrite the code with single assigmest ‘Because if any variable heving 680 asignmients means it contains ony te ate sisigned value, Tae modified format for above code is = 5a Flow anh Fig, 27 Basle block in single signmen forms Fig, 3.28 Daa Pew er eresy vin die and Rel Tom ied Pret oklatand Rol Fe tit he above © ce or (a vantages of Dataiow Graph 1. onder ofexesuton operation mem | 2. reduces pene PSS — J ruse decerine este weosrings of he oS \ 3, risus "9. 03 ow GRAPHS (COFS) |sas. conmmot / DATA . eae rahe fr conacting dion COM, Cop A CDFG uses 2 datstow |naving rs pes des sDecion nedes ‘Decision noes are used to desert the coal ina sequential program, 2. pata flow nodes ‘Daten node encapsulstes # complete dat flow sraph 1 represent bas oc 1f( cond 1) ae Nk _10: era ese ‘use _loek_205 ‘asic_Hoek_3.(: vse (151) case Cl :tsio_block_4()s | Fig. 329. CDFG cole = | Inthe CDFG construction it has two kinds of pes. ver aatae 0p | Rectangular node use to represent the his lochs Ba 2 Diamond shaped noes used to represent the coaos case C3 tate block_6(05 ‘While loop consists ofboth a test and Toop bas, eso whish we ka ho | es efeseat in a CDFG — we can represcat foe ops by in Cf opis fina in ’ a lems ofa while loop. 
Fig. A for loop expressed as a while loop

3.6.3. PROCEDURES

Generation of procedures is another major problem in code generation. Generating code for procedures is relatively straightforward once the procedure linkage for the CPU is known. At the procedure definition, the compiler generates the code that handles the call and return. The linkage mechanism passes parameters into the procedure and returns a value from it. Procedure stacks are typically built to grow down from high addresses.

3.6.4. DATA STRUCTURES

Data structures are the way of organizing data. The compiler must translate references to data structures into references to raw memory. These references require address computations; some of these computations can be done at compile time and others must be done at run time.

Examples: linked list, array, queue

Arrays

Arrays are an interesting data structure because the address of an array element must in general be computed at run time. Arrays have two types:

1. One-dimensional arrays
2. Multi-dimensional arrays

One-dimensional arrays

A one-dimensional array has only one subscript value, a[i]. Consider the layout of a one-dimensional array in memory: the zeroth element is stored as the first element of the array, the first element directly below it, and so on. We can create a pointer for the array. A pointer is a variable that contains the address of another variable. The array pointer points to the head of the array, a[0]. If we call that pointer aptr for convenience, then we can rewrite the reading of a[i] as *(aptr + i).

Two-dimensional arrays

A two-dimensional array has two subscript values. There are multiple possible ways to lay out a two-dimensional array in memory. The usual form of memory layout for two-dimensional arrays in C is row major. In row-major form the rightmost index varies most quickly, so the elements are stored in the order:

    a[0][0]  a[0][1]  ...  a[1][0]  a[1][1]  ...

Two-dimensional arrays also require more sophisticated addressing. First, we must know the size of the array. In row-major form, if the a[][] array is of size N x M, then we can turn the two-dimensional array access into a one-dimensional array access: a[i][j] becomes a[i*M + j], where the maximum value for j is M - 1.

3.6.5. COMPILER OPTIMIZATIONS

Basic compilation techniques can generate inefficient code.
Compilers use a wide range of algorithms to optimize the code they generate.

Loop Transformations

Loops are important program structures because they are compactly described in the source code and they use a large fraction of the computation time. Many techniques have been designed to optimize loops.

Loop Unrolling

A simple and useful transformation is known as loop unrolling. It is important because it helps to expose parallelism that can be used by later stages of the compiler.

Loop Fusion

Loop fusion combines two or more loops into a single loop. For this transformation to be legal, two conditions must be satisfied:

1. The loops must iterate over the same values.
2. The loop bodies must not have dependencies that would be violated if they are executed together.

Loop Distribution

Loop distribution is the opposite of loop fusion: it decomposes a single loop into multiple loops.

Loop Tiling

Loop tiling breaks up a loop into a set of nested loops, with each inner loop performing the operations on a subset of the data. Each loop is split into two loops: the outer loop iterates across the tiles and the inner loop iterates within a tile.

Fig. Loop tiling (a loop nest before and after tiling)

Drawback: tiling changes the order in which array elements are accessed, which changes the behavior of the cache during loop execution.

Dead Code Elimination

Dead code is code that can never be executed. Dead code can be written by programmers or generated by compilers. Dead code can be identified by reachability analysis, which is the process of finding the other statements or instructions from which a piece of code can be reached. If a given piece of code cannot be reached, or if it can be reached only by a piece of code that is itself unreachable from the main program, then it can be eliminated. Dead code elimination analyzes code for reachability and trims away dead code.

Reservation Table

A reservation table is used to keep track of CPU resources during instruction scheduling, as shown in the figure below.

Fig. 3.36 Reservation table for instruction scheduling

In the figure, instruction types 1 and 2 both use resource A.
In the reservation table, rows represent instruction execution time slots and columns represent resources that must be scheduled. Before scheduling an instruction to be executed at a particular time, we check the reservation table to determine whether all resources needed by the instruction are available at that time. Upon scheduling the instruction, we update the table to note all resources used by that instruction. The reservation table provides a good summary of the state of an instruction scheduling problem in progress.

Register Allocation

Register allocation is a very important compilation phase. For a block of code, we want to choose assignments of variables (both declared and temporary) to registers so as to minimize the total number of required registers.

If a section of code requires more registers than are available, we must spill some of the values out to memory temporarily. After computing some values, we write them to temporary memory locations, reuse those registers in other computations, and then reread the old values from the temporary locations to resume work.

Software Pipelining

Software pipelining is a technique for reordering instructions across several loop iterations to reduce pipeline bubbles.

Scheduling

In this context, scheduling means choosing the order in which instructions execute. CPU manufacturers generally disclose enough information about the microarchitecture to allow us to schedule instructions, even when they do not provide a detailed description of the CPU's internals.

Instruction Selection

Selecting the instructions to use to implement each operation is not trivial. There may be several different instructions that can be used to accomplish the same goal, but they may have different execution times. Using one instruction for one
Using on® insinetion for ons [ofthe program may affect the instructions that can be sed in ajeent code(Go7, PROGRAMM LEVEL PERFORMANCE ANALYSIS - ash io endersundioa performance or | “Fie CPU pearance i me ipo = syoeon Bat he CPO performs, spot judged io the SAME WEY 83 Prop | Pfomance, CPU clock rate is 2 very uarelisble mec fer prorem peformsie, Paar taponary, he CPU executes pr of Ou program QuCKIY oes", mean thot wil erecte the ep program att rate we desire “The exeuton time ofa program vies with be input date values Because thse luc selec ifr execution pas nthe program. For example loops may rt branches may execute blocks eceuted a vying numberof ies and dies saying compli The CPU pipline and cache act as windows into our program. The cache hes najor effect on program performance and cache's behavior depends in parton tg data vals ft to the progam. The excuton time ofan instruction in a pipeline depends not only on ty Inston but on the isrtions around iin the pipeline. @ om oI ig. 3.5% Execution Time lel property of program | seasuring Execution Speed ‘We can mcasre program performance in sever way suc | 1. simastor =| mgt P28 298 WC Fig or along With input data ang s Sea 3 wih nt ed rs ee fore] in ae tHe rogram on ot az, fe ta Bus cea be used tom A of executing sections of cove 0 The length o rh ofthe program ees ited by the accuracy ofthe incr, fen weeAnaiece log anlyer ca Be EADS ts icropaceser [ed sop times of & code Segment. The length af cole fee ed byte sizeof the logic analyzes ber “ [pet of Performance Measures | can tl so sa eer “Tera three diferent ypes of performance Mea f Measures on progam sich Iuaverage Case Execution Time ‘Tis isthe typical execution time we would ex lenge defining typeal inputs, {,Worst-Case Execution Time ‘The longest time that dhe program canbe speadon ay i ponant for systems that most meet deadlines. $.80st-Case Execution Time Pet fr pial dts. 
The ft inp sequence is lely Talsmeasure ean be important in multiss Real tine ems 1471, ELEMENTS OF PROGRAM PERFORMANCE ‘The Program execution time canbe seen.as Eiceuion Time = Pryee rr gam TACs ia roe cess of combining the determination op 10 pat pa Traces can Be valuable for aye Sen pa ih He fr cher popones i Dh de sit 2 pavir ofthe program nS cha ees bo pro sahaing | aie dd Fp {isa0) i Program Path The pall Js he sequence of istuctions executed bY AME OGM oy i ive inthe igh ewe language representation of he program. ‘econ nase the exceution path oF a program through it Bih Jove langug rant hard to get accurate estimates of foal exceation time rom pease there i nt a dieeteorrespondn| ment issues aso’ eave to deters he ap aay 1 pore © MEBs he POEs toma eespoed the CPU or its simulator. spit igh evel Ianguoge program. This | tween program statements and nstevetions. Gi) istevtion Ting ion ting is determined based on the sequence of instructions Parle ype oF Cu, “Te insteve woe by the peogram path which fakes into aecount data dependencies, pipet pnneios | behavior and eahing ‘ase ee eas | ‘nee we know the excetion path of the program, we have 10 measure thy Saas ts action time ofthe instretions exceed along that pth ~ for that sssume eve, 1, Diet | ‘ametion takes the same nomber of clock eyeles (ie) we need only count te 2. By using a simulator. nauetions and enuipy by the per Instruction exseution time to obisin te roi i otal execution time. The technique is simplistic for the following |) Profiling pring is 9 simple method for ansysng softwuepeomane_ A po Les not measure execution ime instead it counts the unter of en ensues or basic blocks inthe progamareeseated nn rogram 1. Not alnsirstions take the same amount of ime 2. Exeeution times of instructions are ot independent 3. The excesion ine of an insruetion may depend on operand values. 
There are two major ways to profile a program:

- We can modify the executable program by adding instructions that increment a location every time the program passes that point in the program.
- We can sample the program counter during execution and keep track of the distribution of PC values.

Profiling does not give us execution time directly, but it shows how execution is distributed over the different parts of the program.

Caching Effects

The access time for main memory can be 10-100 times larger than the cache access time, so caching can have huge effects on instruction execution time by changing both the instruction and data access times. Caching performance inherently depends on the program's execution path, because the cache's contents depend on the history of accesses.

3.7.2. MEASUREMENT DRIVEN PERFORMANCE ANALYSIS

Measurement is the most commonly used way to determine the execution time of a program, including estimates of its worst-case execution time.

(ii) Physical Performance Measurement

Physical measurement requires some sort of hardware instrumentation. The most direct method of measuring the performance of a program is to watch the program counter's value: start a timer when the PC reaches the program's starting address and stop it when the PC reaches the program's end. The timer reading then measures the execution time.

(iii) Simulation Based Performance Measurement

The alternative to physical measurement of execution time is simulation. A CPU simulator is a program that takes as input a memory image for a CPU and performs the operations on that memory image that the actual CPU would perform, leaving the results in the modified memory image.

Cycle Accurate Simulator

The most important type of CPU simulator is the cycle-accurate simulator, which performs a sufficiently detailed simulation of the processor's internals that it can determine the exact number of clock cycles required for execution.

A cycle-accurate simulator is built with detailed knowledge of how the processor works, so that it can take into account all the possible behaviors of the microarchitecture that may affect execution time.

A cycle-accurate simulator has a complete model of the processor, including the cache. It can provide valuable information about why the program runs too slowly.
Instruction Level Simulator

A simulator that functionally simulates instructions but does not provide timing information is known as an instruction-level simulator.

3.8. SOFTWARE PERFORMANCE OPTIMIZATION

Software performance can be optimized in several ways.

3.8.1. LOOP OPTIMIZATIONS

Loops are important targets for optimization because programs with loops tend to spend a lot of time executing those loops. There are three important techniques for optimizing loops:

1. Code motion
2. Induction variable elimination
3. Strength reduction

Code Motion

Code motion lets us move unnecessary code out of a loop. If a computation's result does not depend on operations performed in the loop body, then we can safely move it out of the loop.

Fig. 3.38 Code motion in a loop

    for (i = 0; i