0% found this document useful (0 votes)
298 views

Homework 1 - Computer Architecture - HCMIU

- Students with even codes will do problems 2, 4, 6, 8. Students with odd codes will do problems 1, 3, 5, 7. All students must do problems 9 and 10. - The homework is due on March 07, 2021. Assignments with duplicated work will be scored 0. - The document provides descriptions of 10 problems related to computer architecture topics like instruction mix, CPI, IPC, pipelining, and Amdahl's law.

Uploaded by

Joe
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
298 views

Homework 1 - Computer Architecture - HCMIU

- Students with even codes will do problems 2, 4, 6, 8. Students with odd codes will do problems 1, 3, 5, 7. All students must do problems 9 and 10. - The homework is due on March 07, 2021. Assignments with duplicated work will be scored 0. - The document provides descriptions of 10 problems related to computer architecture topics like instruction mix, CPI, IPC, pipelining, and Amdahl's law.

Uploaded by

Joe
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

Homework 1

Implementation guide:
• Students whose code is an even number will do Problems 2, 4, 6 and 8.
• Students whose code is an odd number will do Problems 1, 3, 5 and 7.
• Problems 9 and 10 are required for all students.
Due date: Sunday, March 07, 2021
All the duplicated assignments will be scored 0.

Problem 1. When running an integer benchmark on a RISC machine, the average instruction mix was as
follows:

Instructions Average Frequency


Load 26%
Store 9%
Arithmetic 14%
Compare 13%
Cond. branch 16%
Uncond. branch 1%
Call/returns 2%
Shift 4%
Logical 9%
Misc. 6%
The following measurements of average CPI for individual instruction categories were made:

Instruction type Average CPI (clock cycles)


All ALU instructions 1
Load–store 1.4
Conditional branches:
Taken 2.0
Not taken 1.5
Jumps 1.2
Assume that 60% of the conditional branches are taken and that all instructions in the Misc. category are
ALU instructions. What are the CPI and IPC of the benchmark on this RISC machine?

Problem 2. Give an equation relating the performance measure MIPS, IPC, and the cycle time c. If from
one generation of processors to the next c decreases by 50%, how should the value of IPC change so that
MIPS is doubled? In view of the results of Exercises 3 and 4, do you think this is possible?

Problem 3. Machine M runs at 3 GHz. When running program P, its CPI =1.5.
a. How many instructions will be executed during 1 second while running program P?
b. While running program P, the mouse has to be polled 30 times per second. The polling routine
requires executing 200 instructions with a CPI of 2. What is the overhead, that is, the fraction of
time used in polling the mouse? Is it significant?
Problem 4. Consider two implementations M1 and M2 of the same ISA. We are interested in the
performances of two programs P1 and P2, which have the following instruction mixes:
Operations P1 P2
Load/store 40% 50%
ALU 50% 20%
Branches 10% 30%
The CPIs for each machine are:

Operations M1 M2
Load–store 2 2
ALU 1 2
Branches 3 2
a. Assume that the clock rate of M1 is 2 GHz. What should be the clock rate of M2 so that both
machines have the same execution time for P1?
b. Assume now that both machines have the same clock rate and that P1 and P2 execute the same
number of instructions. Which machine is faster for a workload consisting of equal runs of P1 and
P2?
c. Find a workload (using only P1 and P2) that makes M1 and M2 have the same performance when
they have the same clock rate
Problem 5. A processor M-5 has a five-stage pipeline and a clock cycle time of 10 ns. A newer
implementation of the same instruction set in a processor M-7 uses a seven-stage pipeline and a cycle time
of 7.5 ns.
a. Define the maximum throughput of a pipeline. Which of M-5 and M-7 has better maximum
throughput?
b. Consider now a loop of five instructions (four arithmetic instructions and a branch to the beginning
of the loop). There is a dependency between two consecutive arithmetic instructions in the loop.
This dependency induces a 1 cycle stall in M-5 and a 2 cycle stall in M-7. The branch induces a 2
cycle stall on M-5 and a 4 cycle stall on M-7. Which processor executes the loop faster in “steady
state’’ (i.e., you don’t have to consider the time it takes to start up the pipeline in the first iteration
and/or drain it in the last iteration)?
c. Assume now that the loop is unrolled once, that is, instead of n iterations of four instructions and a
branch, we have now n/2 iterations of eight instructions and a branch. Instead of one data
dependency per loop, we now have two data dependencies per unrolled loop. Which processor
executes the unrolled loop faster?
Problem 6. Construct an example whereby two systems have the same MIPS rating but one of them has
an execution time of CPU (EXCPU) smaller than the other one.

Problem 7. Illustrate Amdahl’s law in terms of speedup vs. sequential portion of program by showing the
speedup for N = 8 processors when the sequential portion of the program grows from 1% to 25%.

Problem 8. With sequential execution occurring 15% of the time:


a. What is the maximum speedup with an infinite number of processors?
b. How many processors are required to be within 20% of the maximum speedup?
c. How many processors are required to be within 2% of the maximum speedup?
Problem 9. For each of the metrics specified, indicate which of the (weighted) arithmetic, harmonic, or
geometric mean you would use:
a. An experiment where you have run the SPEC CPU2006 integer benchmark on your laptop and
noted the execution times of each program in the suite.
b. Same experiment as in (a), but now you also run the suite on a computer in your lab at the
University. How do you compare the speeds of the two machines?
c. An experiment where you have simulated 100 million instructions in each of 12 programs and
measured CPI for each simulation.
d. Same experiment as in (c), but now you report in terms of IPC.
Problem 10. It can be shown that weighted arithmetic mean (WAM) and weighted harmonic mean (WHM)
can lead to the same results if correct weights are applied. Consider the MIPS measure where MIPSi = Ii /ti
(Ii is the number of instructions for program i in millions, and ti is the execution time Exi times 106).
Consider a benchmark of n programs with instructions counts Ii and execution times ti , i = 1,..., n. Let N be
the total number of instructions, and T the total execution time for all programs.

a. Show that if the weights for the arithmetic mean are ti /T and those for the harmonic mean are Ii
/N, then the WAM and WHM of the MIPSi are the same.
b. Show that for the CPI metric a WAM weighted with Ii and a WHM weighted with ci , the number of
cycles simulated in each program, will yield the same result.
c. How would you weigh the WAM and the WHM if you were measuring IPC rather than CPI?
d. Show that for the speedup, a WAM weighted with execution time ratios in the enhanced system
and a WHM weighted with execution time ratios in the base system will yield the same result.
e. In general, given a metric A/B how should the arithmetic mean be weighted in order to give the
correct result? How should the harmonic mean be weighted?

You might also like