CA Assignment
Explanation:
In this instruction we have to add the contents of R2 and R3 and save the result
at the address of R1; the 2nd instruction works the same way. First we have to
read the contents of R2 and R3.
The 1st instruction is fetched first, after which buffer 1 holds instruction 1.
In the 2nd clock cycle, the decode step generates the control signals based on
the input and reads the operands, so buffer 2 holds the contents of R2 and R3
and the address of R1. Since execution is nothing but the ALU, suppose R2=20 and
R3=30: their sum, 50, will be in the execution buffer along with the address of
R1. The memory stage has nothing to do for this instruction; it simply forwards
the result to write back, which assigns 50 to &R1. The processing of
ADD R1, R2, R3 is now complete.
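The walk-through above can be sketched as a small trace. This is only an illustrative model, not a real simulator: the function name run_add, the buffer variables, and the stage labels IF/ID/EX/MEM/WB are assumptions introduced here, and the values R2=20 and R3=30 are the ones supposed in the text.

```python
# Toy trace of ADD R1, R2, R3 through five stages (assumed names throughout).

def run_add(regs, dest, src1, src2):
    """Return a list of (stage, buffer contents) for one ADD instruction."""
    trace = []

    # Fetch: buffer 1 holds the instruction itself.
    buffer1 = ("ADD", dest, src1, src2)
    trace.append(("IF", buffer1))

    # Decode: read both source registers, keep the destination address.
    _, d, s1, s2 = buffer1
    buffer2 = (regs[s1], regs[s2], d)
    trace.append(("ID", buffer2))

    # Execute: the ALU adds the operands; the destination travels along.
    a, b, d = buffer2
    buffer3 = (a + b, d)
    trace.append(("EX", buffer3))

    # Memory: nothing to do for ADD, the result is simply forwarded.
    buffer4 = buffer3
    trace.append(("MEM", buffer4))

    # Write back: the result is assigned to the destination register.
    value, d = buffer4
    regs[d] = value
    trace.append(("WB", (d, value)))
    return trace


registers = {"R1": 0, "R2": 20, "R3": 30}
for stage, contents in run_add(registers, "R1", "R2", "R3"):
    print(stage, contents)
```

Note that R1 only changes in the last step; every earlier buffer carries just its address.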
Now let's talk about SUB R4, R5, R1. Instruction 2 is fetched into the buffer in
the 2nd clock cycle, decoded in the 3rd, and so on. The problem occurs in the
3rd clock cycle: instruction 1 updates R1 only at write back, so when SUB reads
R1 during decode, R1 still holds its initial value of 0 and the subtraction
would use the wrong operand. We have to stall the pipeline at that stage. If we
start this instruction only after the 1st instruction's WB, 3 slots go empty and
hardware and time are wasted. This is a data hazard. Now how to deal with it?
Nothing happens in the memory stage. In the WB stage, 154 is the effective
address (EA); it is written back to the execution buffer and the PC is updated,
so the next instruction is the one at EA 154.
With Pipelining:
Instruction 1 is fetched with the PC initially at 100. In the execution stage,
if R1 and R2 are not equal, we follow the regular pipelining process, which is
perfectly fine. But what if they are equal? Then EA=154, and executing the
instructions at 104 and 108 is a waste of resources and time. Because we
identify the branch only in the execution stage, an unsuccessful branch is
fine, but a successful branch causes a problem.
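The wasted fetches can be listed with a few lines of code. This is a sketch under stated assumptions: instructions are 4 bytes apart (as the addresses 100, 104, 108 suggest), one instruction is fetched per cycle, and the branch is resolved in stage 3 (execute). The name wasted_fetches is hypothetical.

```python
# Which sequential instructions are fetched, and then discarded, before
# the outcome of a branch is known (assumed: one fetch per cycle).

def wasted_fetches(branch_pc, taken, resolve_stage=3, instr_size=4):
    """Return the addresses fetched after the branch that must be flushed."""
    if not taken:
        return []  # sequential instructions were the right ones
    # While the branch moves from stage 1 to resolve_stage, one new
    # sequential instruction enters the pipeline each cycle.
    return [branch_pc + instr_size * i for i in range(1, resolve_stage)]


print(wasted_fetches(100, taken=True))   # [104, 108]
print(wasted_fetches(100, taken=False))  # []
```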
(ii) Early Branch Detection: in this method we detect the branch in the 2nd
stage, which is the decode stage, and reduce the stall to 1 instead of 2,
because the comparison can sometimes be done very quickly. We use an extra
circuitry component to perform the comparison in the decode stage instead
of the execution stage. We then know which instruction to fetch next.
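A minimal sketch of early branch detection, assuming a branch-if-equal instruction, 4-byte instructions, and the stage numbering fetch=1, decode=2, execute=3. The function names and register values are illustrative only; the extra comparator is modeled as a plain equality test.

```python
# Early branch detection: the comparison is done in decode (stage 2)
# by an extra comparator, so the next PC is known one cycle sooner.

def next_pc_at_decode(pc, reg_a, reg_b, target, instr_size=4):
    """Pick the next fetch address as soon as the operands are read."""
    taken = reg_a == reg_b  # extra comparator in the decode stage
    return target if taken else pc + instr_size


def branch_stalls(resolve_stage):
    """Stall cycles if fetching waits until the branch is resolved."""
    return resolve_stage - 1


print(branch_stalls(3), branch_stalls(2))   # 2 1
print(next_pc_at_decode(100, 7, 7, 154))    # 154
print(next_pc_at_decode(100, 7, 8, 154))    # 104
```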
Each instruction passes through the stages one after another: fetch, decode,
execute, and write back of the result. But notice the 4th clock cycle, in which
decoding and operand fetching take place: R1 is needed for 2 things in the same
clock pulse, fetching an operand and storing a result. This inconsistency is a
structural hazard. What to do now?
(ii) Duplicate: Add more hardware to the design so that each instruction can
access its resources separately.
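The effect of duplicating hardware can be illustrated with a small model. It assumes a four-stage pipeline (fetch, decode, execute, write back), one new instruction per cycle, and a register resource that is used once in decode (operand read) and once in write back (result store). The names STAGES, REGISTER_STAGES, register_conflicts, and the ports parameter are hypothetical.

```python
# Find clock cycles in which more instructions need the register
# resource than there are access ports (assumed four-stage pipeline).

STAGES = ["IF", "ID", "EX", "WB"]
REGISTER_STAGES = {"ID", "WB"}  # operand read and result store


def register_conflicts(n_instructions, ports=1):
    """Return the cycles where register demand exceeds the ports available."""
    demand = {}
    for i in range(n_instructions):          # instruction i starts in cycle i + 1
        for offset, stage in enumerate(STAGES):
            if stage in REGISTER_STAGES:
                cycle = i + offset + 1
                demand[cycle] = demand.get(cycle, 0) + 1
    return [cycle for cycle in sorted(demand) if demand[cycle] > ports]


print(register_conflicts(4, ports=1))  # [4, 5]
print(register_conflicts(4, ports=2))  # []
```

With a single port the first clash appears in the 4th clock cycle; with a duplicated port the read and the write proceed in the same cycle without conflict.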