Prev CPSC 418 Home Page Next

CPSC 418: Assignment #2 Solutions

CPSC 418 - Assignment #2 Solution

1.
a) structural hazard, stall the instruction fetch.

b) RAW on r2, bypass output of ALU to input of ALU.

c) WAR on r3, early reads (ID) and late writes (WB) means no action needed.

d) RAW on r2, stall one cycle (the compiler should have filled this
	delay slot with an independent instruction) and then bypass output of
	MEM to input of ALU.

e) WAW on r2, in order writes means no action needed.

f) structural hazard
	When ADDI is in ALU stage, BREQ is in the ID stage which also
	requires use of an ALU to perform subtraction and comparison.
	MIPS adds a comparator to the ID stage.

g) control (branch) hazard
	MIPS uses delayed branches -- the compiler must find an
	instruction to put after the branch (possibly a NOP) which can be
	executed regardless of the branch decision.

h) structural hazard
	If the PC is to be updated in the ID stage instead of the
	ALU stage, an additional adder is needed to sum the immediate
	and the PC (so a comparator and an address adder are needed in ID).

--------------------------------------------------------------------------
2.
Inst. #
   1         multf   r1, r2, r2
   2         lf      r3, 64(r12)
   3         subf    r4, r2, r5
   4         multf   r6, r1, r1
   5         subf    r7, r6, r5
   6         lf      r6, 80(r12)
   7         beqz    r8, GOOD  ; assume this branch is NOT taken (ie r8 != 0)
   8         lf      r6, 96(r12)
   9   GOOD: addi    r12, r12, #256
  10         addf    r9, r6, r6

a)	RAW on r1 between instruction 1 and 4.
	WAR on r12 between instruction 2 and 9.
	WAW on r6 between instruction 4 and 6.
	WAW on r6 between instruction 4 and 8.
	RAW on r6 between instruction 4 and 10.
	RAW on r7 between instruction 5 and 8.
	WAW on r6 between instruction 6 and 8.
	RAW on r6 between instruction 6 and 10.
	WAR on r12 between instruction 6 and 9.
	RAW on r6 between instruction 8 and 10.

b) i) structural hazard, i.e., 2 instructions may need access to register
      file in WB stage on the same cycle.

  ii) WAW hazards.

 iii) Big gaps are caused by in-order completion restriction

       1   2   3    4    5    6    7    8    9    10   11   12   13   14
multf  IF  ID  E*1  E*2  E*3  E*4  E*5  WB
lf         IF  ID    -    -    -   ALU  MEM  WB
subf           IF    -    -    -   ID   E+1  E+2  E+3  WB
multf                              IF   ID   E*1  E*2  E*3  E*4  E*5  WB
multf                                   IF   ID   E*1  E*2  E*3  E*4  E*5
lf                                           IF   ID    -    -    -   ALU
beqz                                              IF    -    -    -   ID
subf                                                                  IF
addi
addf

       15   16   17   18   19   20   21   22   23   24   25
multf
lf
subf
multf
multf  WB
lf     MEM  WB
beqz   ALU  MEM  WB
subf   ID   E+1  E+2  E+3  WB
addi   IF   ID    -   ALU  MEM  WB
addf        IF    -   ID   E+1  E+2  E+3  WB


c) Big gaps remain because of in-order issue restriction (instructions
	cannot pass one another, so one instruction stalling will block
	subsequent instructions).  The only cycle we save is the very
	first lf.

       1   2   3    4    5    6    7    8    9    10   11   12   13   14
multf  IF  ID  E*1  E*2  E*3  E*4  E*5  WB
lf         IF  ID   ALU  MEM  WB
subf           IF   ID    -   E+1  E+2  E+3  WB
multf               IF    -   ID    -   E*1  E*2  E*3  E*4  E*5  WB
multf                         IF    -   ID   E*1  E*2  E*3  E*4  E*5  WB
lf                                      IF   ID    -    -    -   ALU  MEM
beqz                                         IF    -    -    -   ID   ALU
subf                                                             IF   ID
addi                                                                  IF
addf

       15   16   17   18   19   20   21   22   23   24   25
multf
lf
subf
multf
multf
lf     WB
beqz   MEM  WB
subf   E+1  E+2  E+3  WB
addi   ID    -   ALU  MEM  WB
addf   IF    -   ID   E+1  E+2  E+3  WB


d) i)  1   multf   s1b, s2a, s2a
       2   lf      s3b, 64(s12a)
       3   subf    s4b, s2a, s5a
       4   multf   s6b, s1b, s1b
       5   multf   s7b, s5a, s2a
       6   lf      s6a, 80(s12a)
       7   beqz    s8b, GOOD
       8   subf    s6b, s5a, s7b
       9   addi    s12b, s12a, #256
       10  addf    s9b, s6b, s6b

ii) RAW on s1a between instruction 1 and 4.
       - operand r1 is not available so hazard can't be removed.
    RAW on s7a between instruction 5 and 8.
       - operand r7 is not available so hazard can't be removed.
    RAW on s6a between instruction 8 and 10.
       - operand r6 is not available so hazard can't be removed.

iii) In-order issue is still causing gaps.  The cycle is saved by
	letting the second lf complete early (since it no longer has a WAW
	hazard with the first multf)

       1   2   3    4    5    6    7    8    9    10   11   12   13   14
multf  IF  ID  E*1  E*2  E*3  E*4  E*5  WB
lf         IF  ID   ALU  MEM  WB
subf           IF   ID    -   E+1  E+2  E+3  WB
multf               IF    -   ID    -   E*1  E*2  E*3  E*4  E*5  WB
multf                         IF    -   ID   E*1  E*2  E*3  E*4  E*5  WB
lf                                      IF   ID   ALU  MEM  WB
beqz                                         IF   ID    -    -   ALU  MEM
subf                                              IF    -    -   ID   E+1
addi                                                             IF   ID
addf                                                                  IF

       15   16   17   18   19   20   21   22   23   24   25
multf
lf
subf
multf
multf
lf
beqz   WB
subf   E+2  E+3  WB
addi    -   ALU  MEM  WB
addf    -   ID   E+1  E+2  E+3  WB

iv) r1 -> s1b
    r2 -> s2a
    r3 -> s3b
    r4 -> s4b
    r5 -> s5a
    r6 -> s6b
    r7 -> s7b
    r8 -> s8b
    r9 -> s9b
    r12 -> s12b


e) i) No, given an unlimited number of shadow registers will not remove
      RAW hazards.  There also aren't enough pairs of instructions where
      the destination register of one instruction is to the same register
      that is source register of a following instruction to make use of
      more shadow registers.

  ii) RAW hazards remain.


f) Out-of-order issue saves two cycles -- the second multf actually
	completes ahead of the first, and therefore the second subf can
	complete early and allow the addf to finish early.

       1   2   3    4    5    6    7    8    9    10   11   12   13   14
multf  IF  ID  E*1  E*2  E*3  E*4  E*5  WB
lf         IF  ID   ALU  MEM  WB
subf           IF   ID    -   E+1  E+2  E+3  WB
multf               IF   ID    -    -   E*1  E*2  E*3  E*4  E*5  WB
multf                    IF   ID   E*1  E*2  E*3  E*4  E*5  WB
lf                            IF   ID   ALU  MEM  WB
beqz                               IF   ID   ALU  MEM  WB
subf                                    IF   ID    -    -   E+1  E+2  E+3
addi                                         IF   ID    -   ALU  MEM  WB
addf                                              IF   ID    -    -    -

       15   16   17   18   19   20   21   22   23   24   25
multf
lf
subf
multf
multf   
lf     
beqz   
subf   WB
addi   
addf   E+1  E+2  E+3  WB


g) An interrupt on cycles 7, 8, 11, 12, 13, 15 would detect an
	imprecise state (note that the WBs on 8, 13 and 15 won't
	complete if there is an interrupt on that cycle).

h) Now we can really scream.  Two WB stages allow multiple
	instructions to finish at the same time (which happens 3
	times).  The minimum path throught the sequence is still
	second multf -> second subf -> addf.  A smarter compiler would
	place the second multf earlier.

       1   2   3    4    5    6    7    8    9    10   11   12   13   14
multf  IF  ID  E*1  E*2  E*3  E*4  E*5  WB
lf     IF  ID  ALU  MEM  WB
subf       IF  ID   E+1  E+2  E+3  WB
multf      IF  ID    -    -    -    -   E*1  E*2  E*3  E*4  E*5  WB
multf          IF   ID   E*1  E*2  E*3  E*4  E*5  WB
lf             IF   ID   ALU  MEM  WB
beqz                IF   ID   ALU  MEM  WB
subf                IF   ID    -    -    -    -   E+1  E+2  E+3  WB
addi                     IF   ID   ALU  MEM  WB
addf                     IF   ID    -    -    -    -    -    -   E+1  E+2

       15   16   17   18   19   20   21   22   23   24   25
multf
lf
subf
multf
multf
lf     
beqz   
subf     
addi   
addf   E+3  WB



3. a)

INSTRUCTION STATUS:
Instruction          Dispatch Issue  Execute  Commit
 loadf			X	X	X	X
 loadf			X	X	X	X
 multf			X	X
 addf			X
 divf			X	X
 subf			X

RESERVATION STATIONS:
Name          Busy	Op    source1	data1 valid1  source2	data2 valid2
FP Add1		X	addf	f4	(f4)	X	rob1		
FP Add2		X	subf	old f7 (old f7)	X	f5	(f5)	X
FP Add3
FP Mult1	X	multf	  old rob0 	X	f4	(f4)	X
FP Mult2	X	divf	old f7 (old f7)	X	f5	(f5)	X

REGISTER FILE: (either a register is valid, or it points to a rob entry)
Register      Valid   ROB Entry
  f3			rob1
  f4			rob3
  f5		X
  f6			rob2
  f7			rob0

REORDER BUFFER: (source is a reservation station)
head: rob1 (next instruction to commit)
tail: rob1 (slot for next instruction to dispatch)
Entry  Busy   Valid	Source
 rob0	X		(add2)
 rob1	X		(mult1)
 rob2	X		(add1)
 rob3	X		(mult2)


NOTES: 

1) the ROB is full and so no further instructions can dispatch until
the multf finishes.

2) the reservation stations have physically loaded the "old" data
values into their own reservation station registers, so rob0 and f7
can be safely overwritten.

-------------------------------------------------
b)

INSTRUCTION STATUS:
Instruction          Dispatch Issue  Execute  Commit
 loadf			X	X	X	X
 loadf			X	X	X	X
 multf			X	X	
 addf			X
 divf			X	X	
 subf			X	X	X

RESERVATION STATIONS:
Name          Busy	Op    source1	data1 valid1  source2	data2 valid2
FP Add1		X	addf	f4	(f4)	X	rob1		
FP Add2
FP Add3
FP Mult1	X	multf	  old rob0 	X	f4	(f4)	X
FP Mult2	X	divf	old f7 (old f7)	X	f5	(f5)	X

REGISTER FILE: (either a register is valid, or it points to a rob entry)
Register      Valid   ROB Entry
  f3			rob1
  f4			rob3
  f5		X
  f6			rob2
  f7			rob0

REORDER BUFFER: (source is a reservation station)
head: rob1 (next instruction to commit)
tail: rob1 (slot for next instruction to dispatch)
Entry  Busy   Valid	Source
 rob0	X	X	(old add2)
 rob1	X		(mult1)
 rob2	X		(add1)
 rob3	X		(mult2)

NOTES:

1) rob0 has loaded the result of the subf into its own result
register, so the reservation station add2 can be safely overwritten.

2) Instruction dispatch is still blocked because of the full ROB.

-------------------------------------------------
c)

INSTRUCTION STATUS:
Instruction          Dispatch Issue  Execute  Commit
 loadf			X	X	X	X
 loadf			X	X	X	X
 multf			X	X	X	X
 addf			X	X	X	X
 divf			X	X	
 subf			X	X	X

RESERVATION STATIONS:
Name          Busy	Op    source1	data1 valid1  source2	data2 valid2
FP Add1
FP Add2
FP Add3
FP Mult1
FP Mult2	X	divf	old f7 (old f7)	X	f5	(f5)	X

REGISTER FILE: (either a register is valid, or it points to a rob entry)
Register      Valid   ROB Entry
  f3		X
  f4			rob3
  f5		X
  f6		X
  f7			rob0

REORDER BUFFER: (source is a reservation station)
head: rob3 (next instruction to commit)
tail: rob1 (slot for next instruction to dispatch)
Entry  Busy   Valid	Source
 rob0	X	X	(old add2)
 rob1
 rob2
 rob3	X		(mult2)


NOTES:

1) Up to two more instructions could now dispatch, since there are two
slots available in the ROB.

2) The subf has still not written back its result to the register file
because the register file must remain correct with respect to
interrupts.  Until the divf completes (since it comes before the subf
in program order), the subf result will be stuck in the ROB.

Prev CPSC 418 Home Page Next
Last modified: March 97