Exercises Chap 2

This document contains several examples exploring computer architecture concepts like instruction processing rates, CPI, parallelization, and Amdahl's law. It includes examples calculating frame buffer sizes, wafer yields, processor performance comparisons, compiler effects, parallel speedups, and ways to reduce execution times. The examples provide calculations and analysis related to key metrics like clock rate, CPI, instructions per second, execution time, and speedup.

Ex. chap. 2 “Computer abstractions and technology”

Ex. 1.4: Assume a color display using 8 bits for each of the primary colors (red, green, blue) per
pixel and a frame size of 1280 × 1024.
a. What is the minimum size in bytes of the frame buffer to store a frame?
b. How long would it take, at a minimum, for the frame to be sent over a 100 Mbit/s network?
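
A minimal sketch of how Ex. 1.4 could be worked in Python. The variable names are illustrative, and the sketch assumes that 100 Mbit/s means 100 × 10^6 bits per second and that no protocol overhead is counted.

```python
# Ex. 1.4 sketch: frame buffer size and minimum transfer time.
WIDTH, HEIGHT = 1280, 1024   # frame size in pixels (from the exercise)
BITS_PER_PIXEL = 3 * 8       # 8 bits for each of red, green, blue
LINK_BITS_PER_S = 100e6      # assumption: 1 Mbit = 10^6 bits

frame_bits = WIDTH * HEIGHT * BITS_PER_PIXEL
frame_bytes = frame_bits // 8                 # a. minimum frame buffer size
transfer_s = frame_bits / LINK_BITS_PER_S     # b. minimum time on the link

print(f"frame buffer: {frame_bytes} bytes")
print(f"transfer time: {transfer_s:.4f} s")
```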

Ex. 1.10: Assume a 15 cm diameter wafer has a cost of 12, contains 84 dies, and has 0.020
defects/cm². Assume a 20 cm diameter wafer has a cost of 15, contains 100 dies, and has 0.031
defects/cm².
1.10.1 Find the yield for both wafers.
1.10.2 Find the cost per die for both wafers.
1.10.3 If the number of dies per wafer is increased by 10% and the defects per area unit
increases by 15%, find the die area and yield.
1.10.4 Assume a fabrication process improves the yield from 0.92 to 0.95. Find the defects per
area unit for each version of the technology given a die area of 200 mm².
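
The exercise does not state a yield formula. The sketch below assumes the common textbook model yield = 1 / (1 + defects per area × die area / 2)^2, takes the die area as wafer area divided by the number of dies, and takes the cost per die as wafer cost / (dies × yield). All function names are illustrative.

```python
import math

def die_area_cm2(wafer_diameter_cm, dies_per_wafer):
    """Approximate die area: whole wafer area shared equally among the dies."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    return wafer_area / dies_per_wafer

def die_yield(defects_per_cm2, area_cm2):
    """Assumed yield model: 1 / (1 + defects * area / 2)^2."""
    return 1.0 / (1.0 + defects_per_cm2 * area_cm2 / 2) ** 2

def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Assumed cost model: wafer cost spread over the good dies only."""
    return wafer_cost / (dies_per_wafer * yield_fraction)

def defects_for_yield(yield_fraction, area_cm2):
    """Invert the assumed yield model to recover defects per cm^2 (1.10.4)."""
    return 2.0 * (1.0 / math.sqrt(yield_fraction) - 1.0) / area_cm2

# (diameter in cm, wafer cost, dies per wafer, defects per cm^2)
wafers = {"15 cm": (15, 12, 84, 0.020), "20 cm": (20, 15, 100, 0.031)}
results = {}
for name, (diameter, cost, dies, defects) in wafers.items():
    area = die_area_cm2(diameter, dies)
    y = die_yield(defects, area)
    results[name] = (area, y, cost_per_die(cost, dies, y))
    print(f"{name}: die area {area:.3f} cm^2, yield {y:.4f}, "
          f"cost per die {results[name][2]:.4f}")

# 1.10.4: a die area of 200 mm^2 is 2 cm^2
for target in (0.92, 0.95):
    print(f"yield {target}: {defects_for_yield(target, 2.0):.4f} defects/cm^2")
```

For 1.10.3, the same functions can be called again with the die count multiplied by 1.10 and the defect density multiplied by 1.15.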

Ex. 1.5
Consider three different processors P1, P2, and P3 executing the same instruction set. P1 has a
3 GHz clock rate and a CPI of 1.5. P2 has a 2.5 GHz clock rate and a CPI of 1.0. P3 has a 4.0 GHz
clock rate and has a CPI of 2.2.
a. Which processor has the highest performance expressed in instructions per second?
b. If the processors each execute a program in 10 seconds, find the number of cycles and the
number of instructions.
c. We are trying to reduce the execution time by 30% but this leads to an increase of 20% in the
CPI. What clock rate should we have to get this time reduction?
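
One way to organise the arithmetic for Ex. 1.5, as a sketch. It relies on the relations instructions per second = clock rate / CPI and execution time = instructions × CPI / clock rate, and part c assumes the instruction count stays the same. The dictionary layout is illustrative.

```python
# Ex. 1.5 sketch: name -> (clock rate in Hz, CPI)
processors = {"P1": (3.0e9, 1.5), "P2": (2.5e9, 1.0), "P3": (4.0e9, 2.2)}
RUN_TIME_S = 10.0

# a. instructions per second = clock rate / CPI
ips = {name: clock / cpi for name, (clock, cpi) in processors.items()}
fastest = max(ips, key=ips.get)

# b. cycles = clock rate * time, instructions = cycles / CPI
cycles = {name: clock * RUN_TIME_S for name, (clock, _) in processors.items()}
instructions = {name: cycles[name] / cpi for name, (_, cpi) in processors.items()}

# c. time reduced by 30% and CPI increased by 20%, same instruction count:
#    new clock = instructions * (1.2 * CPI) / (0.7 * time)
new_clock = {
    name: instructions[name] * (1.2 * cpi) / (0.7 * RUN_TIME_S)
    for name, (_, cpi) in processors.items()
}

for name in processors:
    print(f"{name}: {ips[name]:.3e} instr/s, {cycles[name]:.2e} cycles, "
          f"{instructions[name]:.3e} instr, new clock {new_clock[name] / 1e9:.2f} GHz")
print("highest instruction rate:", fastest)
```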

Ex. 1.6
Consider two different implementations of the same instruction set architecture. The
instructions can be divided into four classes according to their CPI (class A, B, C, and D). P1 with
a clock rate of 2.5 GHz and CPIs of 1, 2, 3, and 3, and P2 with a clock rate of 3 GHz and CPIs of 2,
2, 2, and 2.
Given a program with a dynamic instruction count of 1.0E6 instructions divided into classes as
follows: 10% class A, 20% class B, 50% class C, and 20% class D.
a. Which implementation is faster?
b. What is the global CPI for each implementation?
c. Find the clock cycles required in both cases.
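
The global CPI in Ex. 1.6 is a weighted average of the per-class CPIs, with the instruction mix as weights. A short sketch under that reading, with illustrative names:

```python
# Ex. 1.6 sketch: weighted-average CPI, clock cycles, and execution time.
MIX = [0.10, 0.20, 0.50, 0.20]   # fractions of class A, B, C, D instructions
INSTRUCTIONS = 1.0e6             # dynamic instruction count

# name -> (clock rate in Hz, CPI per class A..D)
implementations = {
    "P1": (2.5e9, [1, 2, 3, 3]),
    "P2": (3.0e9, [2, 2, 2, 2]),
}

def global_cpi(class_cpis, mix):
    """Average CPI weighted by the fraction of instructions in each class."""
    return sum(cpi * fraction for cpi, fraction in zip(class_cpis, mix))

summary = {}
for name, (clock, class_cpis) in implementations.items():
    cpi = global_cpi(class_cpis, MIX)
    cycles = cpi * INSTRUCTIONS
    summary[name] = (cpi, cycles, cycles / clock)
    print(f"{name}: CPI {cpi:.2f}, {cycles:.3e} cycles, {cycles / clock:.3e} s")

faster = min(summary, key=lambda name: summary[name][2])
print("faster implementation:", faster)
```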

Ex. 1.7
Compilers can have a profound impact on the performance of an application. Assume that for a
program, compiler A results in a dynamic instruction count of 1.0E9 and has an execution time
of 1.1 s, while compiler B results in a dynamic instruction count of 1.2E9 and an execution time
of 1.5 s.
a. Find the average CPI for each program given that the processor has a clock cycle time of 1 ns.
b. Assume the compiled programs run on two different processors. If the execution times on
the two processors are the same, how much faster is the clock of the processor running
compiler A’s code versus the clock of the processor running compiler B’s code?
c. A new compiler is developed that uses only 6.0E8 instructions and has an average CPI of 1.1.
What is the speedup of using this new compiler versus using compiler A or B on the original
processor?
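
A sketch for Ex. 1.7 based on execution time = instruction count × CPI × clock cycle time. Part c assumes the new compiler's code runs on the original processor with the 1 ns cycle time. Names are illustrative.

```python
# Ex. 1.7 sketch: compiler -> (dynamic instruction count, execution time in s)
CYCLE_TIME_S = 1e-9
compilers = {"A": (1.0e9, 1.1), "B": (1.2e9, 1.5)}

# a. CPI = execution time / (instruction count * cycle time)
cpi = {name: t / (ic * CYCLE_TIME_S) for name, (ic, t) in compilers.items()}

# b. Equal execution times on two processors:
#    IC_A * CPI_A / f_A = IC_B * CPI_B / f_B
#    so f_A / f_B = (IC_A * CPI_A) / (IC_B * CPI_B)
clock_ratio_a_over_b = (compilers["A"][0] * cpi["A"]) / (compilers["B"][0] * cpi["B"])

# c. New compiler: 6.0E8 instructions with an average CPI of 1.1
new_time = 6.0e8 * 1.1 * CYCLE_TIME_S
speedup = {name: t / new_time for name, (_, t) in compilers.items()}

print(f"CPI A = {cpi['A']:.2f}, CPI B = {cpi['B']:.2f}")
print(f"clock A / clock B = {clock_ratio_a_over_b:.3f}")
print(f"speedup vs A = {speedup['A']:.2f}, vs B = {speedup['B']:.2f}")
```

A ratio below 1 in part b means the processor running compiler A's code can use a slower clock than the one running compiler B's code.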

Ex. 1.12
Consider the following two processors: P1 has a clock rate of 4 GHz, average CPI of 0.9, and
requires the execution of 5.0E9 instructions; P2 has a clock rate of 3 GHz, an average CPI of
0.75, and requires the execution of 1.0E9 instructions.
a. One usual fallacy is to consider the computer with the largest clock rate as having the largest
performance. Check if this is true for P1 and P2.
b. Another fallacy is to consider that the processor executing the largest number of instructions
will need a larger CPU time. Considering that processor P1 is executing a sequence of 1.0E9
instructions and that the CPIs of processors P1 and P2 do not change, determine the number of
instructions that P2 can execute in the same time that P1 needs to execute 1.0E9 instructions.
c. A common fallacy is to use MIPS (millions of instructions per second) to compare the
performance of two different processors, and consider that the processor with the largest MIPS
has the largest performance. Check if this is true for P1 and P2.
d. Another common performance figure is MFLOPS (millions of floating-point operations per
second), defined as MFLOPS = No. FP operations / (execution time × 1E6), but this figure has the
same problems as MIPS. Assume that 40% of the instructions executed on both P1 and P2 are
floating-point instructions. Find the MFLOPS figures for the programs.
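
The four checks in Ex. 1.12 can be sketched as follows. The sketch uses CPU time = instruction count × CPI / clock rate and MIPS = clock rate / (CPI × 10^6), and it assumes one floating-point operation per floating-point instruction. Names are illustrative.

```python
# Ex. 1.12 sketch: clock rate, MIPS and MFLOPS versus actual CPU time.
def cpu_time(instruction_count, cpi, clock_hz):
    """CPU time in seconds = instructions * CPI / clock rate."""
    return instruction_count * cpi / clock_hz

P1 = {"clock": 4.0e9, "cpi": 0.9, "ic": 5.0e9}
P2 = {"clock": 3.0e9, "cpi": 0.75, "ic": 1.0e9}

# a. CPU time of each processor on its own program
t1 = cpu_time(P1["ic"], P1["cpi"], P1["clock"])
t2 = cpu_time(P2["ic"], P2["cpi"], P2["clock"])

# b. Instructions P2 can execute in the time P1 needs for 1.0E9 instructions
t1_short = cpu_time(1.0e9, P1["cpi"], P1["clock"])
p2_instructions = t1_short * P2["clock"] / P2["cpi"]

# c. MIPS = clock rate / (CPI * 10^6)
mips1 = P1["clock"] / (P1["cpi"] * 1e6)
mips2 = P2["clock"] / (P2["cpi"] * 1e6)

# d. MFLOPS with 40% floating-point instructions
#    (assumption: one FP operation per FP instruction)
mflops1 = 0.4 * P1["ic"] / (t1 * 1e6)
mflops2 = 0.4 * P2["ic"] / (t2 * 1e6)

print(f"a. CPU time: P1 {t1:.3f} s, P2 {t2:.3f} s")
print(f"b. P2 executes {p2_instructions:.2e} instructions in {t1_short:.3f} s")
print(f"c. MIPS: P1 {mips1:.0f}, P2 {mips2:.0f}")
print(f"d. MFLOPS: P1 {mflops1:.0f}, P2 {mflops2:.0f}")
```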

Ex. 1.14
Assume a program requires the execution of 50 × 10^6 FP instructions, 110 × 10^6 INT
instructions, 80 × 10^6 L/S instructions, and 16 × 10^6 branch instructions. The CPI for each
type of instruction is 1, 1, 4, and 2, respectively.
Assume that the processor has a 2 GHz clock rate.
1.14.1 By how much must we improve the CPI of FP instructions if we want the program to run
two times faster?
1.14.2 By how much must we improve the CPI of L/S instructions if we want the program to run
two times faster?
1.14.3 By how much is the execution time of the program improved if the CPI of INT and FP
instructions is reduced by 40% and the CPI of L/S and Branch is reduced by 30%?
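
A sketch for Ex. 1.14. Total cycles are the sum of count × CPI over the instruction types. The helper `required_cpi` is hypothetical: it solves for the CPI one type would need so that the whole program reaches a given speedup while the other types stay unchanged. A negative result means the target cannot be reached by improving that type alone.

```python
# Ex. 1.14 sketch: per-type CPI changes and their effect on execution time.
CLOCK_HZ = 2.0e9
counts = {"FP": 50e6, "INT": 110e6, "L/S": 80e6, "branch": 16e6}
cpis = {"FP": 1.0, "INT": 1.0, "L/S": 4.0, "branch": 2.0}

def total_cycles(cpi_table):
    """Clock cycles for the whole program under the given per-type CPIs."""
    return sum(counts[kind] * cpi_table[kind] for kind in counts)

def required_cpi(kind, speedup):
    """CPI needed for one instruction type to reach the overall speedup."""
    target = total_cycles(cpis) / speedup
    others = total_cycles(cpis) - counts[kind] * cpis[kind]
    return (target - others) / counts[kind]

base_cycles = total_cycles(cpis)
base_time = base_cycles / CLOCK_HZ

fp_cpi_needed = required_cpi("FP", 2.0)    # 1.14.1
ls_cpi_needed = required_cpi("L/S", 2.0)   # 1.14.2

# 1.14.3: INT and FP CPI reduced by 40%, L/S and branch CPI reduced by 30%
reduced = {"FP": 1.0 * 0.6, "INT": 1.0 * 0.6, "L/S": 4.0 * 0.7, "branch": 2.0 * 0.7}
new_time = total_cycles(reduced) / CLOCK_HZ

print(f"baseline: {base_cycles:.3e} cycles, {base_time:.4f} s")
print(f"1.14.1 FP CPI needed: {fp_cpi_needed:.2f}")
print(f"1.14.2 L/S CPI needed: {ls_cpi_needed:.2f}")
print(f"1.14.3 new time: {new_time:.4f} s, speedup {base_time / new_time:.3f}")
```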

Ex. 1.9
Assume for arithmetic, load/store, and branch instructions, a processor has CPIs of 1, 12, and 5,
respectively. Also assume that on a single processor a program requires the execution of 2.56E9
arithmetic instructions, 1.28E9 load/store instructions, and 256 million branch instructions.
Assume that each processor has a 2 GHz clock frequency.
Assume that, as the program is parallelized to run over multiple cores, the number of
arithmetic and load/store instructions per processor is divided by 0.7 × p (where p is the
number of processors) but the number of branch instructions per processor remains the same.
1.9.1 Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the
relative speedup of the 2, 4, and 8 processor result relative to the single processor result.
1.9.2 If the CPI of the arithmetic instructions was doubled, what would the impact be on the
execution time of the program on 1, 2, 4, or 8 processors?
1.9.3 To what should the CPI of load/store instructions be reduced in order for a single
processor to match the performance of four processors using the original CPI values?
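
A sketch for Ex. 1.9. It assumes the single-processor case is the unscaled baseline, so the 0.7 × p divisor is applied only for p > 1; the exercise does not say this explicitly. The function and variable names are illustrative.

```python
# Ex. 1.9 sketch: execution time on p processors with a fixed branch count.
CLOCK_HZ = 2.0e9
ARITH, LOAD_STORE, BRANCH = 2.56e9, 1.28e9, 256e6   # single-processor counts

def exec_time(p, cpi_arith=1.0, cpi_ls=12.0, cpi_branch=5.0):
    """Execution time in seconds on p processors."""
    # Assumption: p == 1 is the baseline, without the 0.7 * p divisor.
    scale = 1.0 if p == 1 else 0.7 * p
    cycles = (ARITH * cpi_arith + LOAD_STORE * cpi_ls) / scale + BRANCH * cpi_branch
    return cycles / CLOCK_HZ

# 1.9.1: times and speedups relative to a single processor
times = {p: exec_time(p) for p in (1, 2, 4, 8)}
speedups = {p: times[1] / times[p] for p in (2, 4, 8)}

# 1.9.2: arithmetic CPI doubled
doubled = {p: exec_time(p, cpi_arith=2.0) for p in (1, 2, 4, 8)}

# 1.9.3: load/store CPI that lets one processor match four processors
target_cycles = times[4] * CLOCK_HZ
ls_cpi_needed = (target_cycles - ARITH * 1.0 - BRANCH * 5.0) / LOAD_STORE

for p in (1, 2, 4, 8):
    print(f"p={p}: {times[p]:.2f} s, with doubled arithmetic CPI {doubled[p]:.2f} s")
print("speedups:", {p: round(s, 2) for p, s in speedups.items()})
print(f"load/store CPI needed on one processor: {ls_cpi_needed:.2f}")
```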

Extra exercise (Amdahl’s law):


A machine executes a program consisting of 60% of addition operations and 40% of divide
operations. Both operations are assumed to have the same CPI. The original execution time
is 100 s.
a) What is the execution time after improvement if the divide operations can run 5 times
faster?
b) What is the speedup of the improved machine relative to the original machine?

Hint: Recall the Amdahl’s law formula:

Execution time after improvement = (Execution time affected by improvement) / (Amount of
improvement) + Execution time unaffected
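
The formula above can be applied directly, as in this sketch. Because both operations are assumed to have the same CPI, the sketch splits the 100 s of execution time in the same 60/40 proportion as the operation counts. The function name is illustrative.

```python
# Amdahl's law sketch for the extra exercise.
def time_after_improvement(affected_s, unaffected_s, improvement):
    """Execution time after speeding up only the affected part."""
    return affected_s / improvement + unaffected_s

ORIGINAL_S = 100.0
divide_s = 0.40 * ORIGINAL_S   # time spent in divide operations (improved)
add_s = 0.60 * ORIGINAL_S      # time spent in addition operations (unchanged)

improved_s = time_after_improvement(divide_s, add_s, 5.0)   # a)
speedup = ORIGINAL_S / improved_s                           # b)

print(f"time after improvement: {improved_s:.1f} s")
print(f"speedup: {speedup:.3f}")
```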
