Parts (a) and (b) both refer to an instruction cache with:
- word addressing
- 64 words
- 4 words per block
- least-recently-used replacement scheme
Use a picture and (words and/or pseudo-code) to describe how the above cache with two-way set associativity detects a cache miss and handles the subsequent fill.
A W-word cache with w words per block (or line) has L = W div w cache lines.
A cache with L lines and A way set-associativity has S = L div A sets with A lines per set.
| Symbol | Meaning | Formula | Example | Value |
|---|---|---|---|---|
| W | words per cache | | | 64 |
| w | words per block | | | 4 |
| A | associativity | | | 2 |
| L | number of cache lines | W div w | 64 div 4 | 16 |
| S | number of sets | L div A | 16 div 2 | 8 |
| AddrSize | total width of address | | | |
| WIdxSize | word index | log2 w bits | log2 4 | 2 bits |
| SIdxSize | set index | log2 S bits | log2 8 | 3 bits |
| TagSize | tag size | AddrSize - (WIdxSize + SIdxSize) bits | AddrSize - (2 + 3) | AddrSize - 5 bits |
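The parameter derivation in the table can be checked with a short sketch. The address width is not given in the problem, so `AddrSize = 32` below is an assumption for illustration:

```python
import math

# Names follow the table above; AddrSize = 32 is an assumed value,
# since the problem leaves the total address width unspecified.
W = 64          # words per cache
w = 4           # words per block
A = 2           # associativity
AddrSize = 32   # assumed total address width

L = W // w                     # number of cache lines
S = L // A                     # number of sets
WIdxSize = int(math.log2(w))   # word-index bits
SIdxSize = int(math.log2(S))   # set-index bits
TagSize = AddrSize - (WIdxSize + SIdxSize)

print(L, S, WIdxSize, SIdxSize, TagSize)  # -> 16 8 2 3 27
```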
If one of the tags in the set matches the address tag and the valid bit for that line is true, then there is a hit, otherwise we get a cache-miss.
When we get a cache miss, we replace one of the cache lines in the set with the cache line needed by the address. With a least-recently-used replacement scheme and 2-way set-associativity, we can determine which line is to be replaced simply by marking the opposite line as least-recently-used (the u bit in the figure) whenever a line is accessed.
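The detection, LRU update, and fill described above can be sketched in Python-as-pseudocode. The data layout (a list of sets, each holding two line dictionaries with `valid`, `tag`, `u`, and `words` fields) is illustrative, not taken from the original figure; the bit-field widths follow the table above (2 word-index bits, 3 set-index bits):

```python
def lookup_2way(cache, addr, memory):
    word_idx = addr & 0x3            # low 2 bits: word within block
    set_idx = (addr >> 2) & 0x7      # next 3 bits: set index
    tag = addr >> 5                  # remaining bits: tag
    lines = cache[set_idx]           # the 2 lines of this set
    for i, line in enumerate(lines):
        if line['valid'] and line['tag'] == tag:     # hit
            lines[1 - i]['u'] = 1    # mark the opposite line as LRU
            line['u'] = 0
            return line['words'][word_idx]
    # Miss: replace the line marked least-recently-used (u == 1),
    # loading the 4-word block containing addr from memory.
    victim = lines[0] if lines[0]['u'] else lines[1]
    block_base = addr & ~0x3
    victim.update(valid=True, tag=tag, u=0,
                  words=[memory[block_base + k] for k in range(4)])
    other = lines[1] if victim is lines[0] else lines[0]
    other['u'] = 1
    return victim['words'][word_idx]
```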
With an instruction cache, we never write to a cache line, so when a cache line is replaced, we just load in the new tag and words.

Use a picture and (words and/or pseudo-code) to describe how the above cache with four-way set associativity detects a cache miss and handles the subsequent fill.
A 4-way set-associative cache differs from a 2-way set-associative cache, in that there are 4 cache lines per set. This means that 4 tag comparisons must be made to detect cache hit/miss.
Determining which cache line in a set was least-recently used is more complicated for 4-way set-associative caches than for 2-way. The LRU bit for each line is replaced with a 2-bit counter. Rather than setting a single bit each time a cache line is accessed, we reset the accessed line's counter to zero and increment the counters for all of the other lines in the set.
The least-recently-used line is the one with the largest value in its counter.
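The counter scheme above can be sketched as follows. The saturating increment (capping at 3, the 2-bit maximum) is an assumption about how the counters avoid overflow; it is not stated in the original text:

```python
def touch(counters, accessed):
    """Update the per-line LRU counters of a 4-line set on an access.

    The accessed line's counter resets to 0; every other line's counter
    is incremented, saturating at 3 (the 2-bit maximum) -- an assumed
    detail, since the original text does not specify overflow handling.
    """
    for i in range(len(counters)):
        if i == accessed:
            counters[i] = 0
        else:
            counters[i] = min(counters[i] + 1, 3)

def victim(counters):
    # The least-recently-used line holds the largest counter value.
    return max(range(len(counters)), key=lambda i: counters[i])
```

For example, accessing lines 0, 1, 2, 3 in order leaves line 0 as the victim, since it was touched longest ago.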
Given a two-level memory system (split instruction/data L1-caches and main memory):

L1 data-cache:
- 95% hit rate
- 30% of cache accesses are writes
- at any point in time, 25% of blocks in cache have been modified
- 8 words per block
- 1 cycle access time
- Transfer rate between registers and L1-cache: 1 word/cycle
- not-last-used replacement scheme

Main memory:
- 20 cycle access time
- Transfer rate between L1-cache and main memory: 8 words/cycle
Calculate the average data access time if the cache uses write-through with no-write-allocate on write miss.

From Handout 12:
Calculate the average access time if the cache uses write-back with write-allocate on write miss.
T_WB1 = T_Acc2 + T_Xfr_1_2
      = 20 + 1
      = 21 cycles

T_Avg = 0.95 * (1 + 0.30 * 1) + 0.05 * (0.25 * 21 + 22)
      ≈ 2.6 cycles
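The arithmetic can be verified directly. The reading of the terms below (21 cycles to move one 8-word block, 22 = block fill plus the 1-cycle L1 access, 0.25 * 21 for writing back a dirty victim) is my interpretation of the handout formula:

```python
# Recomputing the write-back average access time from the figures above.
hit_rate = 0.95
write_frac = 0.30    # fraction of accesses that are writes
dirty_frac = 0.25    # fraction of cache blocks that are modified

T_acc2 = 20                 # main-memory access time (cycles)
T_xfr = 8 // 8              # 8-word block at 8 words/cycle = 1 cycle
T_WB1 = T_acc2 + T_xfr      # 21 cycles to move one block
T_fill = T_WB1 + 1          # block fill plus the 1-cycle L1 access = 22

T_avg = (hit_rate * (1 + write_frac * 1)
         + (1 - hit_rate) * (dirty_frac * T_WB1 + T_fill))
print(round(T_avg, 4))  # -> 2.5975, i.e. about 2.6 cycles
```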
What are the tradeoffs in using virtual addresses or physical addresses for caches?
Would you use virtual or physical addresses for a cache? Justify your answer.

For data-caches, physically addressed caches are generally preferable. A relatively small TLB (64 - 1024 entries) is sufficient for even large caches. In comparison, the aliasing problems associated with virtually addressed caches do not scale well as cache sizes increase, because flushing a cache becomes prohibitively expensive and the number of comparisons that must be made to prevent aliasing grows linearly with the cache size.

For instruction-caches, virtual addressing may be preferable. This is because aliasing is not much of a problem with instruction caches.