CPSC 418 --- Advanced Computer Architecture --- Spring 1996
Homework 12 (No Credit Review Assignment)
- This homework is for your use in studying for the final (Tue
9 April 12:00pm, in CSCI (Old CS Building) room 201).
- The assignment is not worth any points, and will not be marked.
- We will discuss the assignment in class on Tue 2 Apr, in the
review session Wed 3 Apr (12:30 -- 1:30pm in CICSR 204) and in class
on Thur 4 Apr. The solutions will be available on Tue 2 Apr.
Problem 1
This is a set of short answer questions.
- Pipelines
- definition
- how does pipelining improve the performance
of a microprocessor?
- give upper and lower bounds on the number of stages in
a typical microprocessor's pipeline
- why not have fewer or more stages in a pipeline?
- Superscalar
- definition
- how does ``superscalar-ing'' improve the performance
of a microprocessor?
- give upper and lower bounds on the ``superscalar-edness'' in
typical microprocessors
- why not have more or less ``superscalar-edness'' in a
microprocessor?
- Caches
- definition
- how do caches improve the performance
of a microprocessor?
- give upper and lower bounds on the number and sizes of caches in
typical microprocessors
- what design decisions must be made in designing a cache system?
- why not have more or bigger caches in a
microprocessor?
- does increasing the set-associativity of a cache increase or
decrease the hit-rate? Justify your answer.
- does increasing the set-associativity of a cache increase or
decrease the complexity of the hardware? Justify your answer.
- Register renaming
- definition
- how does register renaming improve the performance
of a microprocessor?
- what design decisions must be made in designing a register
renaming system?
- Multiple / Duplicate functional units
- give an example of a typical configuration for a
microprocessor with multiple and/or duplicate functional units
- how do multiple and/or duplicate functional units improve
the performance
of a microprocessor?
- what design decisions must be made in designing a
microprocessor with duplicate and/or multiple functional units?
- Interrupts and Exceptions
- list three different causes or uses of interrupts/exceptions
- why do interrupts and exceptions complicate microprocessor design?
- briefly describe how a typical microprocessor implements interrupts
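For the set-associativity questions under Caches above, a small simulation can help build intuition. The following is a minimal sketch, not a model of any real cache: it assumes LRU replacement, block-granularity addresses, and a hand-picked access trace (`trace`) in which two blocks conflict in a direct-mapped cache. The function name `hit_rate` and all parameters are illustrative assumptions.

```python
from collections import OrderedDict


def hit_rate(trace, num_blocks, ways):
    """Simulate a set-associative cache with LRU replacement.

    trace      -- sequence of block addresses (integers)
    num_blocks -- total number of blocks the cache can hold
    ways       -- associativity (1 = direct-mapped)
    Returns the fraction of accesses that hit.
    """
    num_sets = num_blocks // ways
    # One ordered dict per set; insertion order tracks recency of use.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for block in trace:
        lines = sets[block % num_sets]
        if block in lines:
            hits += 1
            lines.move_to_end(block)       # mark as most recently used
        else:
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict least recently used
            lines[block] = True
    return hits / len(trace)


# Hypothetical trace: blocks 0 and 8 alternate, 100 accesses in total.
trace = [0, 8] * 50
direct_mapped = hit_rate(trace, num_blocks=8, ways=1)
two_way = hit_rate(trace, num_blocks=8, ways=2)
print(direct_mapped, two_way)
```

On this particular trace the direct-mapped configuration misses on every access, while the two-way configuration misses only on the first two. The trace was chosen to expose conflict misses, so it illustrates one effect rather than proving a general rule.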
Problem 2
We talked about the effects of the Speed Demon vs Brainiac decision on
the instruction set architecture. What effects might this decision
have on other aspects of a CPU? In particular: the memory system,
interrupt handling, branch prediction and register renaming.
Problem 3
Register renaming is critical for reducing stalls due to data hazards
in ``register starved'' architectures such as the Intel x86. Can
register renaming be used to reduce the memory traffic in a function
call caused by storing parameters on the stack?
If yes, explain how. If not, explain why not and give an example of
a way to decrease the amount of time spent storing and loading
function parameters to and from the stack.
Problem 4
Two microprocessors (R, for ``rabbit'' and T for ``turtle'') from
different architectural families have the same memory hierarchy
(levels of cache, sizes, speed of memory, etc).
The clock speed for Micro-R is 1.5 times
faster than the clock in Micro-T.
For the SPEC benchmarks on Micro-T:
- average memory latency: 3 cycles
- average CPI for a load instruction (including memory access time): 2.4 cycles
Problem 4.a
Do you have enough information to calculate the average memory latency
in clock cycles for the SPEC benchmarks running on Microprocessor R?
If so, do so. If not, explain what additional information you need.
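One way to reason about this kind of question is as a unit conversion between clock domains. The sketch below is not the official solution. It assumes that "1.5 times faster" means Micro-R's clock frequency is 1.5 times Micro-T's, and that "same memory hierarchy" means the memory latency is identical in absolute time (nanoseconds). The function name `latency_in_cycles` is an illustrative assumption.

```python
def latency_in_cycles(latency_cycles_ref, clock_ratio):
    """Convert a latency given in cycles of a reference clock into cycles
    of a clock that runs `clock_ratio` times as fast.

    Assumption: the latency is fixed in absolute time, so a faster clock
    simply counts more (shorter) cycles over the same interval.
    """
    return latency_cycles_ref * clock_ratio


micro_t_latency_cycles = 3     # from the table for Micro-T
clock_ratio_r_over_t = 1.5     # assumed reading of "1.5 times faster"

micro_r_latency_cycles = latency_in_cycles(micro_t_latency_cycles,
                                           clock_ratio_r_over_t)
print(micro_r_latency_cycles)
```

If either assumption does not hold (for example, if the caches are clocked with the core rather than at a fixed absolute speed), the conversion changes accordingly.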
Problem 4.b
Do you have enough information to calculate the average CPI for a load
instruction (using the SPEC benchmarks) for Microprocessor R? If so,
do so. If not, explain what additional information you need.
Problem 5
You are on the design team for a next-generation microprocessor and
are contemplating two different choices for data-caches.
- Information common to Option 1 and 2
- 30% of cache accesses are writes
- at any point in time, 25% of blocks in cache have been modified
- Main-memory: 20 cycle access time
- Option 1
- 95% hit rate
- 8 words per block
- 1 cycle access time
- Transfer rate between registers and L1-cache: 1 word/cycle
- Transfer rate between L1-cache and main memory: 8 words/cycle
- not-last-used replacement scheme
- Option 2
- 97% hit rate
- 16 words per block
- 1 cycle access time
- Transfer rate between registers and L1-cache: 1 word/cycle
- Transfer rate between L1-cache and main memory: 16 words/cycle
- not-last-used replacement scheme
Problem 5.a
Calculate the average data access time for each cache if the caches
use write-through with no-write-allocate on write miss. How
much faster is the faster access time?
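The calculation can be organized as a weighted average of read and write costs. The sketch below is one simplified textbook-style model and may differ from the formula in Handout 12. Its assumptions are: a read miss costs the main-memory access time plus the block transfer time; every write goes to main memory (write-through) with no write buffer, so a write costs one main-memory access whether it hits or misses; and the fraction of modified blocks is not needed because a write-through cache never writes blocks back on eviction. The function name `avg_access_time` and its parameters are illustrative.

```python
def avg_access_time(hit_rate, hit_time, mem_time,
                    words_per_block, words_per_cycle, write_fraction):
    """Average data access time in cycles under the assumed model."""
    # Read miss: fetch from main memory, then transfer the whole block.
    miss_penalty = mem_time + words_per_block / words_per_cycle
    read_time = hit_time + (1 - hit_rate) * miss_penalty
    # Assumption: write-through without a write buffer, so each write
    # waits for main memory regardless of hit or miss.
    write_time = mem_time
    return (1 - write_fraction) * read_time + write_fraction * write_time


option1 = avg_access_time(hit_rate=0.95, hit_time=1, mem_time=20,
                          words_per_block=8, words_per_cycle=8,
                          write_fraction=0.30)
option2 = avg_access_time(hit_rate=0.97, hit_time=1, mem_time=20,
                          words_per_block=16, words_per_cycle=16,
                          write_fraction=0.30)
print(option1, option2, option1 / option2)
```

Changing `write_time` (for example, to model a write buffer that hides most of the memory latency) changes the absolute numbers substantially, so it is worth stating which write model you use in your answer.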
Problem 5.b
Assuming that the CPI for loads and stores is 1 + T_Avg (from Handout 12), and using the
instruction mix and CPI information from Handout 8, how much faster will the
total performance be for the faster option?
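The overall comparison is a weighted sum of per-class CPIs. The sketch below shows only the mechanics. The instruction mix and non-memory CPI values in it are made-up placeholders, NOT the figures from Handout 8, and must be replaced with the real ones. It also assumes both options run the same instruction count at the same clock rate, so the speedup equals the ratio of overall CPIs. All names are illustrative.

```python
# PLACEHOLDER values for illustration only; substitute the Handout 8 data.
PLACEHOLDER_MIX = {"load": 0.25, "store": 0.10, "alu": 0.50, "branch": 0.15}
PLACEHOLDER_OTHER_CPI = {"alu": 1.0, "branch": 2.0}


def overall_cpi(t_avg, mix=PLACEHOLDER_MIX, other_cpi=PLACEHOLDER_OTHER_CPI):
    """Weighted-average CPI, with loads and stores costing 1 + T_Avg."""
    mem_cpi = 1 + t_avg
    total = 0.0
    for kind, fraction in mix.items():
        if kind in ("load", "store"):
            total += fraction * mem_cpi
        else:
            total += fraction * other_cpi[kind]
    return total


def speedup(t_avg_slow, t_avg_fast):
    """CPI ratio; equals the speedup if clock rate and instruction count match."""
    return overall_cpi(t_avg_slow) / overall_cpi(t_avg_fast)


# Example call with arbitrary T_Avg values; use your Problem 5.a results instead.
print(speedup(t_avg_slow=2.0, t_avg_fast=1.0))
```

Because only loads and stores benefit from the better cache, the overall speedup is smaller than the ratio of the two T_Avg values.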
Problem 5.c
After the initial meeting, you realize that Option 2 will delay the
release date of the microprocessor by three weeks. Will this change
your decision?
Problem 6
Problem 6.a
Approximately how many transistors are on a current microprocessor?
Problem 6.b
Approximately how many transistors do you expect to see on the next
generation of microprocessors (i.e., ones that come out within the
next year or two)?
Problem 6.c
What factors allow the number of transistors per microprocessor
to increase, and how do the factors allow the increase?
(for example ``Better CAD tools allow designers to pack transistors
closer together. This increases the number of transistors that
can fit in a given area, and thereby increases the number of
transistors on a chip.'')
Problem 6.d
If you were a chip designer, what two things would you consider using
the additional transistors for, and how would you decide which one of the
two choices was preferable?
Problem 7
There are many other topics to study that don't appear in detail on
this homework. These include (but are not limited to):
- interrupts
- function calls
- hardware support for operating systems
- caches
- pipelining
Last modified: 29 Mar 96