Prev | CPSC 418 Home Page | Next |
CPSC 418 - Assignment #2 Solution 1. a) structural hazard, stall the instruction fetch. b) RAW on r2, bypass output of ALU to input of ALU. c) WAR on r3, early reads (ID) and late writes (WB) means no action needed. d) RAW on r2, stall one cycle (the compiler should have filled this delay slot with an independent instruction) and then bypass output of MEM to input of ALU. e) WAW on r2, in order writes means no action needed. f) structural hazard When ADDI is in ALU stage, BREQ is in the ID stage which also requires use of an ALU to perform subtraction and comparison. MIPS adds a comparator to the ID stage. g) control (branch) hazard MIPS uses delayed branches -- the compiler must find an instruction to put after the branch (possibly a NOP) which can be executed regardless of the branch decision. h) structural hazard If the PC is to be updated in the ID stage instead of the ALU stage, an additional adder is needed to sum the immediate and the PC (so a comparator and an address adder are needed in ID). -------------------------------------------------------------------------- 2. Inst. # 1 multf r1, r2, r2 2 lf r3, 64(r12) 3 subf r4, r2, r5 4 multf r6, r1, r1 5 subf r7, r6, r5 6 lf r6, 80(r12) 7 beqz r8, GOOD ; assume this branch is NOT taken (ie r8 != 0) 8 lf r6, 96(r12) 9 GOOD: addi r12, r12, #256 10 addf r9, r6, r6 a) RAW on r1 between instruction 1 and 4. WAR on r12 between instruction 2 and 9. WAW on r6 between instruction 4 and 6. WAW on r6 between instruction 4 and 8. RAW on r6 between instruction 4 and 10. RAW on r7 between instruction 5 and 8. WAW on r6 between instruction 6 and 8. RAW on r6 between instruction 6 and 10. WAR on r12 between instruction 6 and 9. RAW on r6 between instruction 8 and 10. b) i) structural hazard, i.e., 2 instructions may need access to register file in WB stage on the same cycle. ii) WAW hazards. iii) Big gaps are caused by in-order completion restriction 1 2 3 4 5 6 7 8 9 10 11 12 13 14 multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID - - - ALU MEM WB subf IF - - - ID E+1 E+2 E+3 WB multf IF ID E*1 E*2 E*3 E*4 E*5 WB multf IF ID E*1 E*2 E*3 E*4 E*5 lf IF ID - - - ALU beqz IF - - - ID subf IF addi addf 15 16 17 18 19 20 21 22 23 24 25 multf lf subf multf multf WB lf MEM WB beqz ALU MEM WB subf ID E+1 E+2 E+3 WB addi IF ID - ALU MEM WB addf IF - ID E+1 E+2 E+3 WB c) Big gaps remain because of in-order issue restriction (instructions cannot pass one another, so one instruction stalling will block subsequent instructions). The only cycle we save is the very first lf. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB subf IF ID - E+1 E+2 E+3 WB multf IF - ID - E*1 E*2 E*3 E*4 E*5 WB multf IF - ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID - - - ALU MEM beqz IF - - - ID ALU subf IF ID addi IF addf 15 16 17 18 19 20 21 22 23 24 25 multf lf subf multf multf lf WB beqz MEM WB subf E+1 E+2 E+3 WB addi ID - ALU MEM WB addf IF - ID E+1 E+2 E+3 WB d) i) 1 multf s1b, s2a, s2a 2 lf s3b, 64(s12a) 3 subf s4b, s2a, s5a 4 multf s6b, s1b, s1b 5 multf s7b, s5a, s2a 6 lf s6a, 80(s12a) 7 beqz s8b, GOOD 8 subf s6b, s5a, s7b 9 addi s12b, s12a, #256 10 addf s9b, s6b, s6b ii) RAW on s1a between instruction 1 and 4. - operand r1 is not available so hazard can't be removed. RAW on s7a between instruction 5 and 8. - operand r7 is not available so hazard can't be removed. RAW on s6a between instruction 8 and 10. - operand r6 is not available so hazard can't be removed. iii) In-order issue is still causing gaps. The cycle is saved by letting the second lf complete early (since it no longer has a WAW hazard with the first multf) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB subf IF ID - E+1 E+2 E+3 WB multf IF - ID - E*1 E*2 E*3 E*4 E*5 WB multf IF - ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB beqz IF ID - - ALU MEM subf IF - - ID E+1 addi IF ID addf IF 15 16 17 18 19 20 21 22 23 24 25 multf lf subf multf multf lf beqz WB subf E+2 E+3 WB addi - ALU MEM WB addf - ID E+1 E+2 E+3 WB iv) r1 -> s1b r2 -> s2a r3 -> s3b r4 -> s4b r5 -> s5a r6 -> s6b r7 -> s7b r8 -> s8b r9 -> s9b r12 -> s12b e) i) No, given an unlimited number of shadow registers will not remove RAW hazards. There also aren't enough pairs of instructions where the destination register of one instruction is to the same register that is source register of a following instruction to make use of more shadow registers. ii) RAW hazards remain. f) Out-of-order issue saves two cycles -- the second multf actually completes ahead of the first, and therefore the second subf can complete early and allow the addf to finish early. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB subf IF ID - E+1 E+2 E+3 WB multf IF ID - - E*1 E*2 E*3 E*4 E*5 WB multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB beqz IF ID ALU MEM WB subf IF ID - - E+1 E+2 E+3 addi IF ID - ALU MEM WB addf IF ID - - - 15 16 17 18 19 20 21 22 23 24 25 multf lf subf multf multf lf beqz subf WB addi addf E+1 E+2 E+3 WB g) An interrupt on cycles 7, 8, 11, 12, 13, 15 would detect an imprecise state (note that the WBs on 8, 13 and 15 won't complete if there is an interrupt on that cycle). h) Now we can really scream. Two WB stages allow multiple instructions to finish at the same time (which happens 3 times). The minimum path throught the sequence is still second multf -> second subf -> addf. A smarter compiler would place the second multf earlier. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB subf IF ID E+1 E+2 E+3 WB multf IF ID - - - - E*1 E*2 E*3 E*4 E*5 WB multf IF ID E*1 E*2 E*3 E*4 E*5 WB lf IF ID ALU MEM WB beqz IF ID ALU MEM WB subf IF ID - - - - E+1 E+2 E+3 WB addi IF ID ALU MEM WB addf IF ID - - - - - - E+1 E+2 15 16 17 18 19 20 21 22 23 24 25 multf lf subf multf multf lf beqz subf addi addf E+3 WB 3. a) INSTRUCTION STATUS: Instruction Dispatch Issue Execute Commit loadf X X X X loadf X X X X multf X X addf X divf X X subf X RESERVATION STATIONS: Name Busy Op source1 data1 valid1 source2 data2 valid2 FP Add1 X addf f4 (f4) X rob1 FP Add2 X subf old f7 (old f7) X f5 (f5) X FP Add3 FP Mult1 X multf old rob0 X f4 (f4) X FP Mult2 X divf old f7 (old f7) X f5 (f5) X REGISTER FILE: (either a register is valid, or it points to a rob entry) Register Valid ROB Entry f3 rob1 f4 rob3 f5 X f6 rob2 f7 rob0 REORDER BUFFER: (source is a reservation station) head: rob1 (next instruction to commit) tail: rob1 (slot for next instruction to dispatch) Entry Busy Valid Source rob0 X (add2) rob1 X (mult1) rob2 X (add1) rob3 X (mult2) NOTES: 1) the ROB is full and so no further instructions can dispatch until the multf finishes. 2) the reservation stations have physically loaded the "old" data values into their own reservation station registers, so rob0 and f7 can be safely overwritten. ------------------------------------------------- b) INSTRUCTION STATUS: Instruction Dispatch Issue Execute Commit loadf X X X X loadf X X X X multf X X addf X divf X X subf X X X RESERVATION STATIONS: Name Busy Op source1 data1 valid1 source2 data2 valid2 FP Add1 X addf f4 (f4) X rob1 FP Add2 FP Add3 FP Mult1 X multf old rob0 X f4 (f4) X FP Mult2 X divf old f7 (old f7) X f5 (f5) X REGISTER FILE: (either a register is valid, or it points to a rob entry) Register Valid ROB Entry f3 rob1 f4 rob3 f5 X f6 rob2 f7 rob0 REORDER BUFFER: (source is a reservation station) head: rob1 (next instruction to commit) tail: rob1 (slot for next instruction to dispatch) Entry Busy Valid Source rob0 X X (old add2) rob1 X (mult1) rob2 X (add1) rob3 X (mult2) NOTES: 1) rob0 has loaded the result of the subf into its own result register, so the reservation station add2 can be safely overwritten. 2) Instruction dispatch is still blocked because of the full ROB. ------------------------------------------------- c) INSTRUCTION STATUS: Instruction Dispatch Issue Execute Commit loadf X X X X loadf X X X X multf X X X X addf X X X X divf X X subf X X X RESERVATION STATIONS: Name Busy Op source1 data1 valid1 source2 data2 valid2 FP Add1 FP Add2 FP Add3 FP Mult1 FP Mult2 X divf old f7 (old f7) X f5 (f5) X REGISTER FILE: (either a register is valid, or it points to a rob entry) Register Valid ROB Entry f3 X f4 rob3 f5 X f6 X f7 rob0 REORDER BUFFER: (source is a reservation station) head: rob3 (next instruction to commit) tail: rob1 (slot for next instruction to dispatch) Entry Busy Valid Source rob0 X X (old add2) rob1 rob2 rob3 X (mult2) NOTES: 1) Up to two more instructions could now dispatch, since there are two slots available in the ROB. 2) The subf has still not written back its result to the register file because the register file must remain correct with respect to interrupts. Until the divf completes (since it comes before the subf in program order), the subf result will be stuck in the ROB.
Prev | CPSC 418 Home Page | Next |