Computer Architecture and Organization Ch#2 Examples

This document summarizes key concepts in computer architecture and organization. It provides examples of measuring instruction counts, cycles per instruction, and calculating MIPS rates for different machines. It compares RISC and CISC designs using benchmarks on VAX and IBM machines. Examples are given of calculating MIPS for different programs run on multiple computers. Metrics like CPI and speedup are analyzed for a parallel system running multiple threads across cores. Amdahl's law is discussed in relation to the actual speedup achieved.

Uploaded by Mekonnen Wubshet
Computer Architecture and Organization

Example (chapter 2)
1. Consider two different machines, with two different instruction sets, both of which have a clock rate of
200 MHz. The following measurements are recorded on the two machines running a given set of benchmark
programs:

Machine     Instruction Type       Instruction Count (millions)   Cycles per Instruction
Machine A   Arithmetic and logic   8                              1
            Load and store         4                              3
            Branch                 2                              4
            Others                 4                              3
Machine B   Arithmetic and logic   10                             1
            Load and store         8                              2
            Branch                 2                              4
            Others                 4                              3

a. Determine the effective CPI, MIPS rate, and execution time for each machine.
b. Comment on the results.

b. Although machine B has a higher MIPS rate than machine A, it needs more CPU time to
execute the same set of benchmark programs.
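
The quantities asked for in part (a) can be checked with a short script. This is a minimal sketch, assuming the second group of rows in the table belongs to Machine B and that the values pair with the row labels in order (arithmetic and logic, load and store, branch, others). The names `MACHINES` and `summarize` are illustrative and not part of the original exercise.

```python
# Instruction mixes from the table: (instruction count in millions, cycles per instruction).
# Assumption: rows are ordered arithmetic/logic, load/store, branch, others.
CLOCK_MHZ = 200

MACHINES = {
    "A": [(8, 1), (4, 3), (2, 4), (4, 3)],
    "B": [(10, 1), (8, 2), (2, 4), (4, 3)],
}


def summarize(mix, clock_mhz=CLOCK_MHZ):
    """Return (effective CPI, MIPS rate, execution time in seconds)."""
    total_instr = sum(count for count, _ in mix)            # millions of instructions
    total_cycles = sum(count * cpi for count, cpi in mix)   # millions of cycles
    cpi = total_cycles / total_instr                        # weighted average CPI
    mips = clock_mhz / cpi                                  # MIPS = f / (CPI * 10^6)
    exec_time = total_cycles / clock_mhz                    # the 10^6 factors cancel
    return cpi, mips, exec_time


for name, mix in MACHINES.items():
    cpi, mips, t = summarize(mix)
    print(f"Machine {name}: CPI = {cpi:.2f}, MIPS = {mips:.1f}, T = {t:.2f} s")
```

Under these assumptions, machine A comes out at a CPI of about 2.22, 90 MIPS, and 0.20 s, while machine B comes out at a CPI of about 1.92, roughly 104 MIPS, and 0.23 s, which matches the comment in part (b).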

3. Early examples of CISC and RISC design are the VAX 11/780 and the IBM RS/6000, respectively.
Using a typical benchmark program, the following machine characteristics result:
Processor      Clock Frequency   Performance   CPU Time
VAX 11/780     5 MHz             1 MIPS        12x seconds
IBM RS/6000    25 MHz            18 MIPS       x seconds

The final column shows that the VAX required 12 times as much CPU time as the IBM.
a. What is the relative size of the instruction count of the machine code for this benchmark program
running on the two machines?
b. What are the CPI values for the two machines?

Answer:
a. The MIPS rate is defined as MIPS rate = Ic / (T × 10^6), so:

(MIPS rate) × 10^6 = Ic / T

and therefore:
Ic = T × (MIPS rate) × 10^6

The ratio of the instruction count of the IBM RS/6000 to that of the VAX 11/780 is then:
[x × 18 × 10^6] / [12x × 1 × 10^6] = 18x / 12x = 1.5

b. For the VAX 11/780, CPI = (5 MHz) / (1 MIPS) = 5

For the IBM RS/6000, CPI = (25 MHz) / (18 MIPS) ≈ 1.4
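
Both parts of this answer reduce to two ratios, which the following sketch reproduces. CPU time is kept in units of x seconds, so the unknown x cancels in the instruction-count ratio. The dictionary keys and function names are illustrative choices, not taken from the exercise.

```python
# Benchmark figures from the table; CPU time is expressed in units of x seconds.
vax = {"clock_mhz": 5, "mips": 1, "time_x": 12}
rs6000 = {"clock_mhz": 25, "mips": 18, "time_x": 1}


def instruction_count(machine):
    """Instruction count in units of (x * 10^6): Ic = T * MIPS * 10^6."""
    return machine["time_x"] * machine["mips"]


def cpi(machine):
    """CPI = clock rate / (MIPS * 10^6); MHz and MIPS share the 10^6 factor."""
    return machine["clock_mhz"] / machine["mips"]


ratio = instruction_count(rs6000) / instruction_count(vax)
print(f"Ic(RS/6000) / Ic(VAX) = {ratio:.2f}")
print(f"CPI VAX = {cpi(vax):.2f}, CPI RS/6000 = {cpi(rs6000):.2f}")
```

The RISC machine executes about 1.5 times as many instructions, but each one takes far fewer cycles on average (about 1.39 versus 5).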

4. Four benchmark programs are executed on three computers with the following results:

            Computer A   Computer B   Computer C
Program 1   1            10           20
Program 2   1000         100          20
Program 3   500          1000         50
Program 4   100          800          100

The table shows the execution time in seconds, with 100,000,000 instructions executed in each of the four
programs. Calculate the MIPS values for each computer for each program.

Applying MIPS = Ic / (T × 10^6) = 100,000,000 / (T × 10^6) = 100/T, the MIPS values are:

            Computer A   Computer B   Computer C
Program 1   100          10           5
Program 2   0.1          1            5
Program 3   0.2          0.1          2
Program 4   1            0.125        1

The arithmetic and harmonic means of these MIPS values rank the computers as follows:

             Arithmetic mean   Rank
Computer A   25.325            1
Computer B   2.81              3
Computer C   3.25              2

             Harmonic mean   Rank
Computer A   0.25            2
Computer B   0.21            3
Computer C   2.1             1
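
The MIPS values and both means follow mechanically from the execution-time table, so they are easy to recompute. The sketch below assumes the times as listed in the problem statement and uses Python's standard `statistics` module; the names `TIMES` and `mips_rates` are illustrative.

```python
from statistics import harmonic_mean, mean

INSTRUCTIONS = 100_000_000  # instructions executed in each program

# Execution times in seconds for programs 1 to 4, per computer.
TIMES = {
    "A": [1, 1000, 500, 100],
    "B": [10, 100, 1000, 800],
    "C": [20, 20, 50, 100],
}


def mips_rates(times, instructions=INSTRUCTIONS):
    """MIPS = Ic / (T * 10^6) for each program's execution time T."""
    return [instructions / (t * 1e6) for t in times]


for name, times in TIMES.items():
    rates = mips_rates(times)
    print(
        f"Computer {name}: MIPS = {rates}, "
        f"arithmetic mean = {mean(rates):.3f}, "
        f"harmonic mean = {harmonic_mean(rates):.2f}"
    )
```

With these inputs, Program 4 on Computer A evaluates to 100/100 = 1 MIPS and the arithmetic mean for Computer A to about 25.3. The two means rank the computers differently because the arithmetic mean is dominated by Computer A's single very fast run, while the harmonic mean is dominated by its slowest runs.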

5. Consider the earlier example, in which a program run on a single processor yielded an average
CPI of 2.24, an instruction count Ic of 2 million, and a MIPS rate of 178. Now assume that the
program can be executed in eight parallel tasks or threads, with roughly the same number of
instructions executed in each task. Execution is on an 8-core system in which each core (processor)
has the same performance as the single processor originally used. Coordination and synchronization
between the parts add an extra 25,000 instruction executions to each task. Assume the same instruction
mix as in the example for each task, but increase the CPI for memory references with a cache miss to
12 cycles due to contention for memory.
a. Determine the average CPI.
b. Determine the corresponding MIPS rate.
c. Calculate the speedup factor.
d. Compare the actual speedup factor with the theoretical speedup factor determined by Amdahl's law.

Answer:
a. Because the instruction mix is unchanged, the additional instructions for each task are
distributed across the instruction types in the same proportions. This gives the following table:

Instruction Type                   CPI   Instruction Mix
Arithmetic and logic               1     60%
Load/store with cache hit          2     18%
Branch                             4     12%
Memory reference with cache miss   12    10%

The average CPI = (1 × 0.6) + (2 × 0.18) + (4 × 0.12) + (12 × 0.1) = 2.64. The CPI has
increased because the time for memory access has increased.

b. MIPS = 400/2.64 ≈ 152, where 400 is the clock rate in MHz. There is a corresponding drop in the MIPS rate.
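
Parts (a) and (b) are a weighted average followed by one division, sketched below. The 400 MHz clock rate is an assumption read off the expression in part (b); the names `MIX` and `average_cpi` are illustrative.

```python
CLOCK_MHZ = 400  # assumption: clock rate implied by MIPS = 400 / CPI in part (b)

# (CPI, fraction of the instruction mix) per instruction type, from the table.
MIX = {
    "arithmetic_logic": (1, 0.60),
    "load_store_cache_hit": (2, 0.18),
    "branch": (4, 0.12),
    "memory_ref_cache_miss": (12, 0.10),
}


def average_cpi(mix):
    """Weighted average CPI over the instruction mix."""
    return sum(cpi * fraction for cpi, fraction in mix.values())


cpi = average_cpi(MIX)
mips = CLOCK_MHZ / cpi  # MIPS = f / (CPI * 10^6), with f in MHz
print(f"CPI = {cpi:.2f}, MIPS = {mips:.0f}")
```

Changing the cache-miss CPI back to 8 in `MIX` reproduces the original single-processor CPI of 2.24, which is a quick way to see that the memory contention alone accounts for the increase.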

c. The speedup factor is the ratio of the execution times, where the execution time is
T = Ic / (MIPS × 10^6).
For one processor, T1 = (2 × 10^6) / (178 × 10^6) ≈ 11 ms.
For the 8 processors, each processor executes 1/8 of the 2 million instructions plus the 25,000
overhead instructions at 152 MIPS, so T8 = (250,000 + 25,000) / (152 × 10^6) ≈ 1.8 ms.
Speedup = T1 / T8 ≈ 11 / 1.8 ≈ 6.1.

d. There are two inefficiencies in the parallel system. First, additional instructions are needed
to coordinate between threads. Second, there is contention for memory access.
As the problem is stated, none of the code is inherently serial: all of it is parallelizable, but
with scheduling overhead. One could argue that the memory access conflict means that, to some
extent, memory reference instructions are not parallelizable. From the information given, however,
it is not clear how to quantify this effect in Amdahl's equation.
If the fraction of code that is parallelizable is taken to be f = 1, Amdahl's law reduces to
Speedup = N = 8. The actual speedup is therefore only about 75% of the theoretical speedup.
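
The comparison in parts (c) and (d) can be sketched numerically as follows. The script uses the rounded MIPS rates quoted in the answer (178 and 152) and takes f = 1 in Amdahl's law, as assumed above; the function and constant names are illustrative.

```python
IC = 2_000_000        # instructions in the original single-processor run
OVERHEAD = 25_000     # extra coordination instructions per task
N = 8                 # number of cores (one task per core)
MIPS_SINGLE = 178     # single-processor MIPS rate
MIPS_PARALLEL = 152   # per-core MIPS rate under memory contention


def exec_time(instructions, mips):
    """Execution time in seconds: T = Ic / (MIPS * 10^6)."""
    return instructions / (mips * 1e6)


def amdahl_speedup(f, n):
    """Amdahl's law: speedup = 1 / ((1 - f) + f / n)."""
    return 1.0 / ((1.0 - f) + f / n)


t1 = exec_time(IC, MIPS_SINGLE)
t8 = exec_time(IC / N + OVERHEAD, MIPS_PARALLEL)  # tasks run concurrently
actual = t1 / t8
theoretical = amdahl_speedup(1.0, N)

print(f"T1 = {t1 * 1e3:.2f} ms, T8 = {t8 * 1e3:.2f} ms")
print(f"actual speedup = {actual:.2f}, theoretical = {theoretical:.0f}")
print(f"actual / theoretical = {actual / theoretical:.0%}")
```

With unrounded execution times the speedup comes out slightly above 6, so the exact percentage differs a little from the hand calculation, but the conclusion is the same: the system achieves roughly three quarters of the ideal 8-fold speedup.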
