Tutorial09 Cache Performance (1)

The document outlines a tutorial for CS230 focusing on digital logic design and computer architecture, covering cache performance, effective CPI calculations, and unusual cache sizes. It includes specific problems regarding processor configurations, memory access latencies, and cache organization. The tutorial emphasizes practical computations and comparisons between different cache setups and their performance metrics.

Uploaded by

Varsha Yadav
CS230: Digital Logic Design and Computer Architecture

Tutorial 09, [Mon 21 Oct, Tue 22 Oct, Thu 24 Oct]


1. (Based on Q7.15 from the textbook). Suppose a processor with a 16-word block size has an effective
miss rate per instruction of 0.25%. Assume that the CPI without cache misses is 2.5. The DRAM
(main memory) latencies are: 2 cycles for communicating the address, 20 cycles for the access latency,
and 2 cycles for communicating the data read. Compute the effective CPI under the following scenarios:
(a) 1-word wide memory, with and without interleaving
(b) 2-word wide memory, with and without interleaving
(c) 4-word wide memory, with and without interleaving
(d) 8-word wide memory, with and without interleaving
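One way to set up the arithmetic for Problem 1 is sketched below. It is not an official solution and rests on assumptions the problem statement leaves open: the address is sent once per miss; without interleaving, each memory-width chunk pays the full access latency plus one transfer; with interleaving, there are enough banks that all accesses overlap, so the access latency is paid once and only the transfers are serialized. The function and constant names are illustrative.

```python
# Hedged sketch for Problem 1; the timing model is an assumption (see lead-in).
BLOCK_WORDS = 16          # words per cache block
BASE_CPI = 2.5            # CPI without cache misses
MISSES_PER_INSTR = 0.0025 # 0.25% effective miss rate per instruction
ADDR_CYCLES = 2           # cycles to communicate the address
ACCESS_CYCLES = 20        # DRAM access latency in cycles
TRANSFER_CYCLES = 2       # cycles to communicate one memory-width chunk


def block_miss_penalty(width_words, interleaved):
    """Cycles to fetch one block over a memory that is width_words wide."""
    transfers = BLOCK_WORDS // width_words
    if interleaved:
        # Assumed: bank accesses overlap, so access latency is paid once.
        return ADDR_CYCLES + ACCESS_CYCLES + transfers * TRANSFER_CYCLES
    # Assumed: each chunk pays a full access followed by its transfer.
    return ADDR_CYCLES + transfers * (ACCESS_CYCLES + TRANSFER_CYCLES)


def effective_cpi(width_words, interleaved):
    """Base CPI plus the average miss stall cycles per instruction."""
    return BASE_CPI + MISSES_PER_INSTR * block_miss_penalty(width_words, interleaved)


for width in (1, 2, 4, 8):
    for interleaved in (False, True):
        label = "interleaved" if interleaved else "not interleaved"
        print(f"{width}-word wide, {label}: "
              f"penalty = {block_miss_penalty(width, interleaved)} cycles, "
              f"CPI = {effective_cpi(width, interleaved):.3f}")
```

Under a different reading of the latencies (for example, resending the address for every chunk), only `block_miss_penalty` needs to change.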
2. Cache performance tradeoffs (based on Q7.32)
Three processors P1, P2, and P3 are the same, except for their cache configurations.
P1 has a direct-mapped cache with 1-word blocks
P2 has a direct-mapped cache with 4-word blocks
P3 has a 2-way set associative cache with 4-word blocks
The measured miss-rates for the 3 processors, for a particular benchmark program, are:
P1: instruction miss rate = 4%, data miss rate = 6%
P2: instruction miss rate = 2%, data miss rate = 4%
P3: instruction miss rate = 2%, data miss rate = 3%
In the benchmark program, half of the executed instructions make a data memory access. Assume
that the cache miss penalty, in cycles, is 6 + (block size in words). P1’s CPI is measured as 2.0.
(a) Compute the ideal CPI, i.e. assuming a perfect cache with 0% miss rate.
(b) Compute the CPI of P2 and P3, and determine which of the 3 processors is the fastest.
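The computation in Problem 2 can be sketched as follows. This is a minimal sketch, not an official solution. It assumes that every instruction makes one instruction fetch, that stalls simply add to the ideal CPI, and that all three processors run at the same clock rate (the problem says they differ only in cache configuration), so the lowest CPI identifies the fastest processor. The helper names are illustrative.

```python
# Hedged sketch for Problem 2; additive stall model and equal clock rates assumed.
DATA_REFS_PER_INSTR = 0.5  # half the executed instructions access data memory


def penalty_cycles(block_words):
    """Miss penalty given in the problem: 6 + block size in words."""
    return 6 + block_words


def stall_cycles_per_instr(instr_miss_rate, data_miss_rate, block_words):
    """Average memory stall cycles per instruction."""
    misses_per_instr = instr_miss_rate + DATA_REFS_PER_INSTR * data_miss_rate
    return misses_per_instr * penalty_cycles(block_words)


# (a) Work backwards from P1's measured CPI of 2.0 to the ideal CPI.
P1_MEASURED_CPI = 2.0
ideal_cpi = P1_MEASURED_CPI - stall_cycles_per_instr(0.04, 0.06, block_words=1)

# (b) Add each processor's own stall cycles to the ideal CPI.
cpi = {
    "P1": P1_MEASURED_CPI,
    "P2": ideal_cpi + stall_cycles_per_instr(0.02, 0.04, block_words=4),
    "P3": ideal_cpi + stall_cycles_per_instr(0.02, 0.03, block_words=4),
}
fastest = min(cpi, key=cpi.get)

print(f"ideal CPI = {ideal_cpi:.2f}")
for name, value in cpi.items():
    print(f"{name}: CPI = {value:.2f}")
print(f"lowest CPI: {fastest}")
```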

3. Unusual cache size (based on Q7.30)
Consider a cache of size 3K words of data (note: 1K = 2^10 here). Answer the following, with
appropriate explanation.

(a) Is it possible to organize it as a fully associative cache? As a 2-way set associative cache? As a
direct-mapped cache? Assume that the cache controller has to work with bit extractions alone, on
the given memory address, and no other complex computations.
(b) For each of the above cases in (a) which is possible, indicate the maximum possible block size.
(c) For each of the above cases in (a) which is possible, and for the largest possible block size as
computed in (b), show the various memory address fields, when the memory addresses are 32-bit
long.
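The feasibility check in Problem 3 can be explored with the short sketch below. It is a sketch under stated assumptions, not an official solution: "bit extractions alone" is taken to mean that both the block size and the number of sets must be powers of two, and the address breakdown assumes byte-addressed memory with 4-byte words, which the problem does not state. All function names are illustrative.

```python
# Hedged sketch for Problem 3; power-of-two and 4-byte-word assumptions apply.
CACHE_WORDS = 3 * 2**10  # 3K words of data


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def log2_exact(n):
    """log2 of a power of two (number of address bits needed)."""
    assert is_power_of_two(n)
    return n.bit_length() - 1


def feasible_block_sizes(cache_words, ways=None):
    """Power-of-two block sizes (in words) that give a power-of-two set count.

    ways=None means fully associative: one set, so no index bits are needed.
    """
    sizes = []
    block = 1
    while block <= cache_words:
        if cache_words % block == 0:
            num_blocks = cache_words // block
            if ways is None:
                sizes.append(block)
            elif num_blocks % ways == 0 and is_power_of_two(num_blocks // ways):
                sizes.append(block)
        block *= 2
    return sizes


def address_fields(block_words, num_sets, addr_bits=32, bytes_per_word=4):
    """Bit widths of the address fields (bytes_per_word=4 is an assumption)."""
    byte_offset = log2_exact(bytes_per_word)
    word_offset = log2_exact(block_words)
    index = log2_exact(num_sets)
    tag = addr_bits - index - word_offset - byte_offset
    return {"tag": tag, "index": index,
            "word_offset": word_offset, "byte_offset": byte_offset}


for name, ways in (("fully associative", None), ("2-way", 2), ("direct-mapped", 1)):
    sizes = feasible_block_sizes(CACHE_WORDS, ways)
    print(name, "->", f"max block = {max(sizes)} words" if sizes else "not possible")

print(address_fields(block_words=1024, num_sets=1))
```

Because 3K = 3 × 2^10, dividing by any power-of-two block size (and by 1 or 2 ways) always leaves a factor of 3 in the set count, which is why only the organization that needs no index bits passes the check.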
