
Topic 2

Computer Structure

Contents
2.1 Introduction
2.2 Computer Organisation
2.2.1 Calculating machines - from Babbage to integrated circuits
2.2.2 Computer organisation
2.2.3 The stored program concept
2.2.4 Fetch-execute cycle
2.2.5 Two-state machine
2.2.6 Review questions
2.2.7 Structure of a small computer system
2.2.8 Computer components and their function
2.3 Memory
2.3.1 Random access memory
2.3.2 Read only memory
2.3.3 Review questions
2.3.4 Cache memory
2.3.5 Memory maps
2.3.6 External memory
2.4 Central processing unit
2.4.1 The architecture of the microprocessor
2.4.2 Accessing memory
2.4.3 Control unit
2.4.4 Arithmetic logic unit
2.4.5 Registers
2.4.6 Review questions
2.5 Buses
2.5.1 Address bus
2.5.2 Data bus
2.5.3 Review questions
2.6 Summary
2.7 End of topic test

Prerequisite knowledge
Before studying this topic you should be able to:

- Describe the purpose of a processor and list its parts;
- Represent the data flow between the component devices of a computer system;
- Distinguish between main memory and backing storage;
- Describe the features and uses of Random Access Memory (RAM) and Read Only
  Memory (ROM).

Learning Objectives
By the end of this topic, you will be able to:

- Describe the purpose and function of the ALU and the Control Unit within a
  processor;
- Describe the purpose and function of registers;
- Describe the purpose and function of the data bus and address bus;
- Identify control lines within a computer including reset and interrupt;
- Describe the purpose of read, write and timing functions of control lines;
- Outline the steps of the fetch-execute cycle;
- Describe and make distinctions between registers, cache memory, main memory
  and backing store in terms of function and speed of access.


Revision
The following exercise tests the prerequisites for this topic. Ensure that you are happy
with your responses before progressing.

Q1: What type of memory stores programs that must not be lost when the power to
the system is removed?
Q2: A processor will frequently transfer data. Where is the data transferred to and
from?
a) Output devices
b) Input devices
c) Clock
d) Memory

2.1 Introduction
This unit on Computer Structure describes in detail the function of the component parts
of a processor in the manipulation of data.
This is extended to the methods of transferring data within a processor and between a
processor and memory.
The concept of a stored program is considered, along with the steps in the fetch-execute
cycle used to access and run programs. Memory types are considered, from registers
to backing storage, together with how memory is defined and addressed.

2.2 Computer Organisation


Computers play a significant role in meeting our everyday requirements. You can now
browse the Internet for a new home, order from a supermarket on-line and have the
goods delivered to your door. You can import a new car from abroad at the touch
of a button, order clothes from a catalogue company and communicate with friends
overseas.
The ways in which you learn are also changing. You
can complete a school or college assignment using
a general purpose package, or solve programming
problems at home on your own PC, emailing your
results to your instructor for feedback. You can
use computer based learning tools to assist you in
understanding new concepts. You are using one such
tool right now!

If computers have made such a significant impact, then it makes sense to find out a little
more about how they work and how you can make use of them. For instance, how are


they structured? How do they operate internally? How is data represented? Why are
some computers faster and more powerful than others? What devices can you attach to
them and how can you get them to communicate?
These are some of the questions we will be looking at in this topic. There are others.
First let us take a closer look at the basic structure of a small computer system and at
how it operates internally.

2.2.1 Calculating machines - from Babbage to integrated circuits


Learning Objective
At the end of this topic you will know:

- the contribution of Charles Babbage to modern computer design;
- major technological advances leading to the development of the Personal
  Computer.

Nowadays you use calculators to perform
numerical calculations. This was not
always the case, and for some societies,
is still not. Since early times humans
have tried to find ways to make calculations
easier. For instance the abacus was
developed by the Chinese around 1300 AD,
although similar devices had been used by
the Babylonians since about 500 BC.
Logarithms were developed as a method
of calculation by John Napier, who
also invented a device known as
Napier’s Bones. The slide rule is a
further example of an analogue device
for multiplication that is based upon
logarithms.
Simple, manually operated mechanical calculators were developed by many famous
mathematicians, including Pascal and Leibniz.
The first steps towards automating the mechanical calculator were taken by
Charles Babbage (1791-1871) with the design of the special purpose Difference Engine
and the subsequent design of the more general purpose Analytical Engine.
The Analytical Engine was controlled by a set of instructions entered as punch holes
on a set of metal cards. This idea was first introduced by Jacquard (1752-1834) who
designed weaving looms. The pattern woven depended upon the position of holes
punched in metal cards. Unfortunately the technology of the time was not advanced
enough to allow Babbage to construct a working machine. However his ideas laid the
foundations of modern computer design. These ideas included:

- a memory that could store 1000 numbers;
- a machine controlled by a program entered into it;
- entering a different program to perform a new task - general purpose;
- the program was a set of instructions that, if followed, would accomplish a task -
  an algorithm.

These ideas were not developed further until electro-mechanical relay technology
produced computers in the 1930s. By today’s standards these machines were large
and slow.
Vacuum tube technology in the 1940s increased the speed of the computers and by the
1960s transistor technology reduced the size and power requirements.
From about 1965 to the present, the
circuits for many operations have
been incorporated into a single chip as
Integrated Circuits (ICs). Chip fabrication
techniques also improved, allowing further
integration to the extent that we now have
Very Large Scale Integration (VLSI) and
a complete processor on a chip. This made
possible the Personal Computer.

Review Questions

Q3: Describe the Analytical Engine designed by Charles Babbage.

Q4: How does the Analytical Engine relate to modern computers?

Q5: What technical development in the 1940s and 1950s helped reduce the size and
power requirements of computers?

Q6: What effect did the development of the integrated circuit have on computers?

2.2.2 Computer organisation


Computers are digital machines that execute machine
code programs and operate on data in binary form. By
binary form we mean a representation of information as
0s and 1s.
Binary representation of information was considered
earlier in this unit. Here we examine the internal
organisation of the computer to understand how
machine code programs are run.


2.2.2.1 The Organisation of a Simple Computer

Until now we have been looking at how we can use simple logic gates to produce devices
such as adders, decoders and flip-flops. Although we have looked at simplified versions
of these logic devices, it is devices such as these that are combined to create a
computer. We will now step back a level and look at the basic architecture of a simple
computer.
A simple computer consists of the following components (see Figure 2.1):

- Processor;
- Memory;
- Input/output device;
- Communication channels (shown between the aforementioned components in
  Figure 2.1).

Figure 2.1: A Simple Computer


Input devices include the keyboard and mouse and can be used to supply input to the
processor. Output devices include the screen and printers and these can be used
to supply output from the processor. Input and output devices are often known as
peripheral devices.
Some computers have more than one processor; however we will concentrate on single
processor machines in this topic. Where there is only one processor it is known as
the Central Processing Unit, or CPU. This is where instructions are processed and
computations are carried out. This is the control centre of the computer.
The communication channels allow data and control signals to be communicated
between the main components of the computer via the system bus or external bus.
A bus is a collection of parallel wires each of which can carry a digital signal. Thus a
16-bit wide bus could transmit 16 bits simultaneously. The CPU has its own internal bus
allowing the communication of data and control signals between its component parts.
It is worth noting here that the system bus contains lines to transmit data, lines to
transmit memory addresses and various control lines. Frequently it is thought of as
if it were separate buses: a data bus, an address bus and control lines. Sometimes the
data and address lines are not separate at all and the same lines are used for different


purposes at different times. For example, one moment sending an address to memory
and the next transmitting data from the addressed memory location to the CPU.
A more detailed diagram of the main components of a simple computer is shown in
Figure 2.2.

Figure 2.2: The Organisation of a Simple Computer


The computer illustrated in Figure 2.2 is a typical example of a
Von Neumann architecture. Virtually all computers follow this architecture model
that has its origins in the stored program concept proposed by John Von Neumann
in 1945. The basic idea behind the Stored Program Concept is that the sequence of
instructions (or program) to solve a problem should be stored in the same memory as
the data. This ensured that the computer became a general-purpose, problem-solving
tool, since to make it solve a different problem required only that a different program be
placed in memory.
The component parts of the computer are:

- Central Processing Unit (CPU). Carries out computation and has overall control
  of the computer.
- Main memory. Stores programs and data while the computer is running. Has fast
  access, is directly accessible by the CPU, is limited in size and non-permanent.
- External memory. Holds substantial quantities of information too large for storage
  in main memory. Slower access than main memory, not accessible directly by the
  CPU but can be used to keep a permanent copy of programs and data.
- Peripheral devices (input/output devices). These allow the computer to
  communicate with the outside world.
- External system bus. This allows communication of information between the
  component parts of the computer.

Some possible transfers of information via the system bus are:

- data transmitted from main memory to the CPU;
- input data from an external device (e.g. the keyboard) travelling from the device to
  main memory;
- information from external memory transmitted to main memory.
The speed of the system bus is very important since, if it is too slow, the speed of the
CPU is restricted by having to wait for data.
The CPU typically consists of:

- A Control Unit (CU) which exerts overall control over the operation of the CPU;
- An Arithmetic and Logic Unit (ALU) which carries out computation;
- A set of registers which can hold intermediate results during a computation.

Two of these registers are of particular importance, namely:

- The Program Counter (PC) which holds the address in memory of the next
  instruction in the program;
- The Instruction Register (IR) which holds the instruction currently being
  executed.

These components are linked by an internal bus.


In practice, the architecture of a modern digital computer will be more complex than
the description given here, with each component itself being an assembly of parts
connected by various different buses. However, for the moment, this will suffice as a
model for how the major parts of a digital computer are organised.

2.2.2.1.1 Review Questions

Q7: What are the main components of a computer?


Q8: What is the purpose of the system bus? What type of information is it likely to
transmit?
Q9: With what important concept was John Von Neumann associated? What large
advantage did this concept confer upon computers?
Q10: What is held in the main memory of the computer? Why is external memory also
required?
Q11: How would an item of data that was entered at the keyboard finally find its way
into a CPU register for processing?


Q12: All CPUs contain two particular registers. What are these registers and for what
are they used?

2.2.3 The stored program concept


All computers are based upon the same basic design, known as the Von Neumann
Architecture.
Computers carry out tasks by executing machine instructions. A series of these
instructions is called a machine code program.
A machine code program is held in main memory as a stored program, a concept first
proposed by John Von Neumann in 1945.
A unit, known as the Central Processing Unit (CPU) fetches, decodes and executes the
machine instructions.
By altering the stored program
it is possible to have the
computer carry out a different
task. As a user of a desktop
computer you will already know
this. You may have loaded
a word processing program to
enter and edit text. Using
the same computer you may
have opened a spreadsheet
or drawing program to enter
numerical values or create
graphic images.

Being able to load and execute different programs allows the computer to become a
general purpose problem solving machine.

2.2.4 Fetch-execute cycle


To execute a machine code program it must first be
loaded, together with any data that it needs, into main
memory (RAM). Once loaded, it is accessible to the
CPU which fetches one instruction at a time, decodes
and executes it at electronic speed.
Fetch, decode and execute are repeated until a
program instruction to HALT is encountered. This is
known as the fetch-execute cycle.


2.2.4.1 Fetch execute cycle in greater detail

Earlier we introduced the fetch-execute cycle and described the stored program concept
where machine code instructions are repeatedly transferred from main memory to the
CPU for execution.
We would now like to show you how the address bus, data bus, control bus and internal
registers take part in reading a program instruction from main memory - essentially the
fetch phase of the fetch-execute cycle. Figure 2.3 below illustrates in more detail the
fetch-execute cycle.

Figure 2.3: Fetch-execute cycle

Simulation of an instruction fetch


On the web is a simulation which shows you how the buses and the internal registers of
the CPU take part in reading an instruction from main memory. You should now look at
this simulation.


2.2.4.2 Registers used by the processor in the fetch-execute cycle

To accomplish these tasks, a processor has a collection of dedicated registers. These
are used to hold information specific to each task. Note that these registers are in
this task. Note that these registers are in
addition to the general purpose registers
provided by the processor. Unlike the
general purpose registers, these registers
are not usually visible to the assembly level
programmer.
Memory address register
The first register that we will discuss is the memory address register (MAR). This is used
to hold a value representing the address in memory that the processor needs to access.
Usually the MAR will hold a bit pattern corresponding to the state (0 or 1) of the address
bus lines. When the processor needs to access memory, the address of the required
location is placed in the MAR and the processor circuitry will ensure that the address
bus lines are set to the correct values.
Memory data register
The memory data register (MDR) is used to hold bit patterns that represent data values.
For example, when reading from memory, the MAR will be used to set up the address
lines to select a location. After a short delay, the memory device will set the lines on
the data bus to appropriate values. When the values on the data bus have settled, the
circuitry of the processor will set the value of the MDR to the value that appeared on the
data bus.
Instruction register
The instruction register is a dedicated storage space used by the control unit when it is
decoding instructions.
General purpose registers involved in the fetch-execute cycle
The program counter (PC) is the general purpose register most involved in the fetch-
execute cycle. Remember that this register is used to keep track of where in the program
execution has reached.
Other general purpose registers are only usually affected as part of the execution of the
program and as such, are not fundamental to the operation of the cycle.

2.2.4.3 The fetch phase

The first of the two main phases of the fetch-execute cycle is the fetch phase, and
consists of the following steps:

1. The contents of the PC are copied into the MAR;

2. The contents of memory at the location designated by the MAR are copied into
the MDR;


3. The PC is incremented;

4. The contents of the MDR are copied into the IR.

Remember that the PC is used to keep track of where execution has reached. Thus the
first step is concerned with establishing the location of the next instruction to execute.
The second step is to get the value into the MDR.
The third step is to ensure that the PC points to the next instruction to be executed:
if we did not increment the PC at some point, we would continually execute the same
instruction over and over again!
The fourth step ensures that there is a copy of the instruction in the IR ready for
execution to begin.

Sequencing the steps in an instruction fetch


On the web is an assessment that requires you to place the steps of an instruction fetch
in the correct order. You should now carry out this assessment.

2.2.4.4 The execute phase

The execute phase consists of the following steps:

1. Decode the instruction in the IR;

2. Execute the instruction in the IR.

Once the execute phase has completed, the fetch phase will be carried out again.

Animation of the fetch-execute cycle


On the web is an animation of the fetch, decode and execution of the instruction
LOAD[16]. You should now look at this animation.
Pseudocode representation of the fetch-execute cycle
For convenience we can write this series of steps as a pseudocode representation:

LOOP FOREVER
    PC → MAR
    [MAR] → MDR
    PC + 1 → PC
    MDR → IR
    decode IR
    execute IR
END LOOP

Note that → means "is copied to" and that [MAR] means the contents of the location
pointed to by MAR.
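To see these steps in action, here is a minimal sketch of the cycle in Python (the toy
memory contents and the LOAD/ADD/STORE/HALT mnemonics are invented for
illustration, not real machine code):

    # A minimal sketch of the fetch-execute cycle; memory contents are invented.
    memory = ["LOAD 16", "ADD 17", "STORE 18", "HALT"]

    pc = 0                    # Program Counter
    while True:
        mar = pc              # PC -> MAR
        mdr = memory[mar]     # [MAR] -> MDR
        pc = pc + 1           # PC + 1 -> PC
        ir = mdr              # MDR -> IR
        print("Decoding and executing:", ir)
        if ir == "HALT":      # repeat until a HALT instruction is encountered
            break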


2.2.5 Two-state machine


The electronic components of a computer are designed
to be in only one of two states. For example, a
magnetic storage device records data magnetised in
one direction or another, transistors conduct or do
not conduct. The binary digits 0 and 1 are used to
represent these two states and hence the computer is
termed a two-state machine.

2.2.6 Review questions


Q13: Which of the following is true?

a) machine code is represented in binary


b) data is represented in decimal or hexadecimal
c) a stored program is executed from disk

Q14: What is meant by the term stored program?

Q15: The CPU:

a) was invented by John Von Neumann


b) holds a stored program
c) fetches, decodes and executes machine instructions

Q16: What is meant by the term fetch-execute cycle?

Q17: Which of the following is false?

a) magnetic storage devices are two-state devices


b) a two-state device can only be in one of two states
c) binary cannot be used to represent a two-state device

2.2.7 Structure of a small computer system


Learning Objective
At the end of this sub-topic you will know:

- The main functions of the components of a small computer system;
- Organisation of main memory and types of main memory;
- The purpose of the Address Bus, the Data Bus and the Control Bus;
- Internal registers of the CPU and their purpose;
- The steps involved in a memory read operation.

We want to look at the internal organisation of the computer at a level of abstraction that
describes the system as interacting, high-level components. For now, we are not that
interested in the lower level detail of shunting bits around the machine.
To get a feel for this component level, imagine you are a car driver. When you step into
the car, what is visible to you are interfaces to the components that allow you to drive it.


The ignition component allows you to start the car and to switch it off. You do not
need to know how the ignition system works; this is hidden from sight. Accelerator
pedals, brake pedals and a gear mechanism each interface to components which alter
the speed of the car. To manoeuvre the car you operate a steering wheel which
interfaces to a steering component.
From this description, you could produce a diagram of these interacting components. It
would not be very detailed and certainly would not have enough information for a car
mechanic to work with, but it would, at a higher level, describe how the driver interacts
with the car. This is the component level we will introduce.

Identifying system components used in a task


You should find a partner to work with. Imagine that you have a PC or a Mac running a
windows interface and that you have powered it up. Now think about the sequence of
events that occur when you create a new word processing document. Try to identify the
components of your PC that are involved in this task. For instance, there will need to be
a transfer of the word processing application (program) from the hard disk component to
the main memory component in order to run the program. What happens next and what
components do you think are involved?

2.2.8 Computer components and their function


The components of the CPU and the connections to devices that are external to it are
shown in Figure 2.4.


Figure 2.4: Components of a Small Computer System

2.3 Memory
Main memory (RAM and ROM) stores programs and data while the computer is
operating.
Memory used to be soldered onto the system board
of the processor (motherboard). The need to
provide more readily upgradable computers led to
the development of Single In-Line Memory Modules
(SIMMs). These plug into a SIMM socket on the
motherboard. Each SIMM contains a number of DRAM
chips and varies depending on the type of computer
and the amount of RAM required.
Later, we will take a closer look at the characteristics
of RAM and ROM chips and the differences between
DRAM and SRAM.
Main memory consists of a large sequence of bytes
(typically 64 Mbytes in a PC) each of which may be
directly accessed using its memory address. In a byte-
wide memory, the first byte in memory has address
0 and subsequent bytes have addresses 1,2,3, etc.
as shown in a simplified representation of RAM in
Figure 2.5.


Figure 2.5: A simplified representation of RAM


Any location in memory can be read from or written to by referring to its address.
Memory can be organised as:

- 8-bit wide (PC-8088);
- 16-bit wide (XT-8086, AT-80286);
- 32-bit wide (386DX, 486SX, 486DX);
- 64-bit wide (Pentium).

2.3.1 Random access memory


Random Access Memory (RAM) is a volatile memory.
This means that the contents of RAM are lost when
power is no longer supplied to the chip. RAM can be
written to and read from. There are two types of RAM,
namely static and dynamic (SRAM and DRAM).

SRAM chips are very fast but are not suited for very large amounts of memory. They are
more suited to cache memory, where only small amounts are required. You will learn
more about cache memory when we look at factors that affect system performance.
DRAM chips are more widely used. They are much cheaper to produce, can hold larger
amounts of data in a smaller physical area and require less power. They are dynamic,
requiring a continuous signal to refresh the contents of the chip.

2.3.2 Read only memory


Read Only Memory (ROM) is a non-volatile store
which means that the contents are held permanently.
The software and data stored on the ROM are fixed
at the time of manufacture. Once programs and
data have been entered into the ROM they cannot be
subsequently altered.


ROMs are used to store programs and data that do not change during the operation of
the system. These are known as mask programmed ROM.
Where different software and/or data is needed on a ROM chip, manufacturers produced
a chip that allows existing data to be erased and new data written. These are known
as erasable programmable read-only memory chips (EPROMs). Data is erased by
shining ultraviolet light onto the chip.
EPROMs have the disadvantage that all the chip contents are removed during erasure.
The entire chip has to be reprogrammed, even if only a single memory word needs to
be changed.
Another type of ROM technology where the contents of the chip can be altered is the
electrically erasable programmable read-only memory (EEPROM). By applying suitable
electrical pulses, this chip can be selectively reprogrammed which means that the entire
contents need not be erased.

2.3.3 Review questions


Q18: Which of the following is true?
a) RAM is volatile
b) ROM is volatile
c) Neither RAM nor ROM is volatile
Q19: Explain what is meant by the term non-volatile.
Q20: Which of the following memory chips can be selectively reprogrammed?
a) PROM
b) EEPROM
c) EPROM
Q21: Explain how EPROM chips can be reprogrammed and give one disadvantage of
EPROM.
Q22: Which of the following statements is false?
a) RAM cannot be written to
b) ROM can only be read from and not written to
c) EEPROM is erased using electrical pulses

ROM technologies
On the web is an activity that asks you to match ROM technologies to their descriptions.
You should now carry out this activity.

2.3.4 Cache memory


Program instructions are usually read sequentially. From one instruction it would be
reasonable to assume that the next instruction required will be in the next memory
location. This assumption is used to increase processor efficiency.
Although the movement of data within the processor is getting faster and faster, the
system buses are not keeping up. This leads to wasted time while the processor waits
for data to be fetched from memory.


To reduce this problem most machines now
have a second, smaller, area of memory
known as cache memory. This is usually
SRAM which is faster than DRAM, and
although this is much smaller than RAM
there is a benefit from the fact that it is
always faster to access a small memory
segment.
When data or an instruction is read from memory, the memory locations following are
copied into the cache memory. At the next read instruction the cache memory is read
first. If the data is in cache the access time will be much lower than going to main
memory. If the data is not in cache then main memory will be accessed, and although
there is a slight loss of time from reading twice, the overall time saving in this method is
quite significant.
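A toy model of this read path in Python, assuming simple dictionary-based cache and
memory (invented structures, not a real cache controller):

    # Toy cache read: try the cache first, fall back to main memory on a miss.
    def read(address, cache, ram):
        if address in cache:
            return cache[address]    # cache hit: much lower access time
        value = ram[address]         # cache miss: go to main memory
        cache[address] = value       # keep a copy for subsequent reads
        return value

    ram = {100: 7}
    cache = {}
    print(read(100, cache, ram))     # miss: fetched from RAM
    print(read(100, cache, ram))     # hit: served from cache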

Observing cache memory


On the web is a simplified simulation of the operations of cache memory. You should
now look at this simulation.
The contents of cache are simultaneously held in RAM. When data is to be written back
to memory it must be written to cache so that the cache is kept current. At some stage
it will also have to be written back to RAM. Cache that has not been updated does not
have to be copied back to memory; it is just removed from cache when it is to be
replaced by something the processor has a greater need for. There are two different
ways to update cache memory.

Write through cache. When cache is updated, memory is updated at the same time.

Write back cache. Cache is updated, but RAM is not updated until the content of cache
is being cleared. Write back requires fewer write operations but there is an overhead in
managing the selected updates. Write back cache is generally about 10% faster than
write through cache.
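The difference between the two policies can be sketched in a few lines of Python (a toy
model with invented structures, not a description of any real cache controller):

    # Contrasting write-through and write-back cache updates.
    ram = {}       # main memory: address -> value
    cache = {}     # cache: address -> (value, dirty flag)

    def write(address, value, policy):
        if policy == "write-through":
            cache[address] = (value, False)    # cache and RAM updated together
            ram[address] = value
        else:                                  # write-back
            cache[address] = (value, True)     # line marked dirty; RAM not yet updated

    def evict(address):
        value, dirty = cache.pop(address)
        if dirty:                              # only updated lines are copied back
            ram[address] = value

    write(5, 1, "write-through")
    write(6, 2, "write-back")
    evict(6)                                   # RAM only now receives the value 2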

2.3.5 Memory maps


Main memory is made up of a matrix of cells. However, when studying the content of
memory it is helpful to use a logical view of a set of cells, arranged in numerical order,
in which the lowest memory position is location 0 and the highest is location n.

Different processors have different ways of organising information in RAM, but the BIOS
and part of the operating system will always be in the same protected areas. Certain
areas are allocated for user applications.

Memory addresses are often written in hexadecimal as they would otherwise be
awkwardly long binary strings.

Example: Memory addresses

Problem:
A processor has a 16 line address bus. If a particular memory holds an operating
system from position 0₁₆ to 800₁₆, and a 32 Kb program starts at position 4000₁₆, is
there enough free memory space for a 1 Kb block of data starting at position C000₁₆?

Solution:
Memory map: O/S 0-2 K | free 2-16 K | Program 16-48 K | Data 48-49 K | free 49-64 K

Step 1
There are 16 address lines. The maximum number of address locations = 2^16 = 65536
= 64 K.
Mark 64 K on the memory map.

Step 2
O/S from 0₁₆ to 800₁₆.
800₁₆ = 1000 0000 0000₂ = 2048 = 2 Kb
The operating system runs from 0 to 2 Kb.
Mark this block on the memory map.

Step 3
Program from 4000₁₆ for 32 Kb.
4000₁₆ = 100 0000 0000 0000₂ = 16384 = 16 Kb
16 + 32 = 48, so the program runs from 16 Kb to 48 Kb.
Mark this block on the memory map.

Step 4
The starting position for the data block is C000₁₆.
C000₁₆ = 1100 0000 0000 0000₂ = 49152 = 48 Kb
The data block starts at 48 Kb, where the program ends, and needs 1 Kb of space,
ending at 49 Kb, below the 64 K top of memory. Mark this on the memory map.
The data will fit.
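The address arithmetic in this example is easy to check with a short Python fragment
(the names below simply mirror the blocks in the worked solution):

    # Check the worked example by converting hex addresses to Kb positions.
    K = 1024
    os_end     = 0x800 / K           # 2.0 Kb
    prog_start = 0x4000 / K          # 16.0 Kb
    prog_end   = prog_start + 32     # 48.0 Kb
    data_start = 0xC000 / K          # 48.0 Kb
    data_end   = data_start + 1      # 49.0 Kb
    top        = 2 ** 16 / K         # 64.0 Kb with a 16 line address bus

    # The data fits if it starts at or after the program's end and stays below the top.
    print(data_start >= prog_end and data_end <= top)    # True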

Describing memory maps


Q23:
A processor has 16 address lines. The memory has 4 Kb of BIOS data starting at
position 0. There are device drivers positioned from 2000₁₆ for 8 Kb. A program runs
from 32 K to the top of memory.
Identify where, in memory, there is free space and how much there is.

Q24:
A processor has 32 address lines. There is BIOS data from 0 to 8 Kb. The DOS kernel
lies between 3000₁₆ and 6000₁₆. There is a program from F0000₁₆ using 24 Kb. There
is a data block from F000000₁₆ using 32 K.
Draw the memory map clearly identifying the start and end of each used memory area.

2.3.6 External memory


External memory, such as the hard disk, holds
quantities of data too large to store in main memory.
It is also used to keep a permanent copy of programs
and data.
Examples of external memory devices are:

- hard disk;
- floppy disk;
- zip disk;
- CD-R;
- magnetic tape;
- flash drive.


2.4 Central processing unit


Central Processing Unit. The CPU coordinates and
controls the activities of all other units in the
computer system. It executes program instructions and
manipulates data in accordance with the instructions.
The CPU is the most important component of a
computer and it is essential to have a good knowledge
of its internal organisation, i.e. its architecture. Just
as the architecture of a building refers to a structure
of rooms, facilities and access which links all parts of
the building, processor architecture refers to its internal
organisation of subsystems and how they interact.
CPUs are fairly complex and at this level of study we
will concentrate on a simplified functional description of
their structure using a standard architecture composed
of the following three components:

- Arithmetic and logic unit (ALU);
- Control unit;
- Registers.

All three components work together to form the processor.

2.4.1 The architecture of the microprocessor


We will now study the internal architecture of the microprocessor (CPU) itself. Because
of the stored program concept, any consideration of this architecture must consider
the relationship between the CPU and memory. Figure 2.6 is a schematic diagram of a
fairly typical microprocessor design, showing the internal structure of the CPU and its
relationship to the memory of the computer.


Figure 2.6: Typical Microprocessor Design


We will now look at the role that these components play in the operation of the processor.

2.4.2 Accessing memory


The CPU has to access memory both for instructions and to receive and transmit data
from or to memory. For this purpose it typically has two internal registers, namely:

- Memory Address Register (MAR) - specifies the address in memory for the next
  read or write operation from or to memory;
- The Memory Data Register (MDR) or Memory Buffer Register (MBR) -
  contains the data to be written to memory or receives the data read from memory.

The MAR register is connected to the address portion of the system bus and the MDR
register is connected to the data portion of the system bus.

- To read data from memory, the CPU places the address of the required memory
  location into the MAR and activates the memory-read control line of the system
  bus. This will cause the required data to be transmitted from memory via the data
  bus to the MDR;
- To write from the CPU to memory, the CPU places the data to be written in the
  MDR; the address of the memory location where they are to be written is placed
  in the MAR; and the memory-write control line is activated.

The MAR and MDR registers have a large part to play in the fetch-execute cycle.
When fetching an instruction from memory during the fetch-execute cycle, the address
contained in the PC will be copied to the MAR using the processor’s internal bus. When
the memory-read control line is activated, the instruction will be sent to the MDR using
the data bus. From the MDR it will be copied to the IR using the processor’s internal
bus.
When executing an add to accumulator instruction, the address part of the instruction
will be sent to the MAR so that the operand can be obtained from memory. The operand


is then placed in the MDR from where it can be sent to the ALU, via the CPU internal
bus, for adding to the contents of the accumulator.
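As a sketch of this sequence, assuming a simple accumulator machine with invented
register and memory contents:

    # Sketch of "add to accumulator" using the MAR and MDR.
    memory = {16: 7}         # invented contents: location 16 holds the operand 7
    accumulator = 5

    operand_address = 16     # the address part of the instruction
    mar = operand_address    # address placed in the MAR; address bus lines driven
    mdr = memory[mar]        # memory-read line activated; data arrives in the MDR
    accumulator += mdr       # MDR sent via the internal bus to the ALU and added

    print(accumulator)       # 12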

2.4.3 Control unit


The machine code programs
stored in main memory tell the
computer what steps must be
carried out to solve a problem.
They also tell it the sequence
in which it must carry out the
steps.
The control unit includes
timing/control logic and the
instruction decoder. It sends
signals to other parts of the
computer to direct the fetch
and execution of machine
instructions.
Using timing and control signals, it tells other parts of the system what to do and when
to do it, i.e. it synchronises the whole system.
Signals are sent out and received on the control bus. For example, if data is to be read
from main memory, the control unit will initiate a read signal on the control bus. Some
signals, such as interrupts, originate from external devices, acting as inputs to the CPU.
The control bus is not really a bus at all. Unlike the address and data buses where bits
are simultaneously transmitted along a set of parallel wires, the control bus is made up
of discrete wires, each having a specific function. These functions are described in the
table below:

clock - generates a constant pulse at a frequency measured in Hertz (Hz). Each pulse
causes a machine operation to be carried out. Machine operations include reading
data from main memory or adding numbers together.

reset - causes the processor to halt the execution of the stored program. All internal
registers are cleared and the machine reboots.

interrupt - tells the processor that an external event has occurred, such as the transfer
of data from an external device. The processor may ignore this type of interrupt.

NMI - a non-maskable interrupt that cannot be ignored by the processor. For example,
a loss of power.

When data is to be read from a memory location then the control unit will initiate a read
signal on the control bus and when data is to be written to a memory location then the
control unit will initiate a write signal. These operations are described later in greater
detail when we take a closer look at how an instruction is fetched from main memory.


Identifying functions of the control bus


On the web is an activity that asks you to correctly identify 4 functions of the control bus.
You should now carry out this activity.

2.4.3.1 Getting the processor’s attention

A computer receives signals from a number of different sources: characters keyed in
on the keyboard, the click of a mouse, data from a scanner. The arrival of this type of
signal is not necessarily expected at any particular time and the computer has to have
a way of detecting them.
There are two ways that this can happen, known as polling and interrupts.

2.4.3.2 Polling

The first is known as polling. This is what certain types of door to door salespeople do.
They ensure that all their customers have a copy of the company catalogue and then
they visit every house on a rota basis to ask if the customer would like to place an order.
This means that the salesperson will visit every house, say, every month.
They may know that the average time between orders is about 3 months but they
can’t take the risk of leaving the customer unattended when they might want to place
a big order. This means that the salesperson may be wasting a lot of time making
unnecessary calls. From the customer’s point of view this is not ideal either, because
they might realise a week after the last visit that they have forgotten something important
and there is no way to shorten the waiting time until next month.
For a computer system this would work quite satisfactorily if the processor was running
a microwave oven because the processor would be dedicated to that one task and
efficiency would be a meaningless concept.
The life of the door to door salesperson would be simpler and they would be able
to handle far more customers, if the customer was given a phone number or e-mail
address and asked to initiate contact when they wanted to order. This would provide
the customer with a much better service and also allow the salesperson more time to do
other things.

2.4.3.3 Interrupts

In computer terms, the signal from a peripheral device or program that the attention of
the processor is needed is known as an interrupt. Every time a keyboard key is pressed
an interrupt is generated. When the machine is designed, the handling of interrupts is
planned for.
In an IBM type PC there is an allowance for 256 different types of interrupt. When one of
these occurs the system is able to identify its type. With this information the processor
then looks at an area in memory in which an address for each of the 256 interrupts is
stored. At this address there is a program known as an Interrupt Service Routine. The


address table is used to furnish addresses indirectly because this makes it possible for
a programmer to control how interrupts are handled. Many of the ISRs are stored in ROM.
When an interrupt is received, the processor will:

- store the contents of its internal registers in an area of memory called the stack;
- find the address for the ISR;
- jump to the service routine and process it;
- reload the internal registers from the stack;
- continue processing from where it stopped.
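A toy model of this sequence in Python, with the stack and vector table as simple
structures (the ISRs and register values are invented for illustration):

    # Toy interrupt servicing: save state, look up the ISR, run it, restore state.
    stack = []
    vector_table = {0: lambda: print("keyboard ISR runs"),
                    1: lambda: print("disk ISR runs")}   # invented routines
    registers = {"PC": 100, "ACC": 42}

    def service_interrupt(number):
        stack.append(dict(registers))      # store internal registers on the stack
        isr = vector_table[number]         # find the address of the ISR
        isr()                              # jump to the service routine and process it
        registers.update(stack.pop())      # reload the internal registers from the stack
        # processing now continues from where it stopped

    service_interrupt(0)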

There are different priority levels assigned to interrupts. If an interrupt arrives while
the processor is already dealing with one, it can do one of two things. If the second
interrupt is of a lower priority the processor will carry on until the current interrupt has
been serviced then it will service the second. If the new interrupt is of a high priority
the processor will store its current state on the stack and start to process the newer
interrupt. When that one is finished it will complete the processing of the first interrupt
and then revert to the original process. This is known as nesting interrupts.
Interrupts can be generated by hardware or software.
Any signal coming from a peripheral device will prompt an interrupt. Each I/O connection
has a physical link called an IRQ line. This serves to carry the interrupt signal and in turn
to identify the source of the interrupt. Internal devices connected via the motherboard
also use IRQ lines. There is a limited number of IRQ lines and this can be a limiting
factor in adding hardware to a computer system. However the newer systems using USB
(Universal Serial Bus) or Firewire can accept a number of hardware devices sharing IRQ
lines.
A software interrupt is one generated from within a program. This includes routine
activities like reading a character from the keyboard or sending one to the screen. A software fault like
trying to divide by zero, or trying to write into a protected area of memory, will also call
an interrupt.
Generally the processor can mask or delay the servicing of interrupts until it is ready.
There is, however, a group of interrupts that cannot be ignored. These are known as
non-maskable interrupts (NMIs). Typically this interrupt would indicate a problem such as
loss of power, requiring the computer to shut down immediately.

2.4.4 Arithmetic logic unit


The arithmetic logic unit (ALU) is where data is
processed and manipulated and can be considered the
"brain" of the computer.
Processing can involve arithmetic operations, and
the ALU must contain circuitry to perform additions.
Note that multiplication can be achieved through a
series of additions, while division can be achieved
through a series of subtractions.
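A sketch of that idea in Python:

    # Multiplication as repeated addition and division as repeated subtraction,
    # mirroring how an ALU could build both from its add/subtract circuitry.
    def multiply(a, b):
        result = 0
        for _ in range(b):
            result += a           # b successive additions of a
        return result

    def divide(a, b):
        quotient = 0
        while a >= b:
            a -= b                # repeated subtraction
            quotient += 1
        return quotient, a        # quotient and remainder

    print(multiply(6, 7))         # 42
    print(divide(42, 6))          # (7, 0)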


The ALU may also perform logical operations such as a logical OR operation. This
requires in-built logic elements.
The ALU uses arithmetic registers which are storage locations used to hold data and
temporary results of calculations. One special storage register used by the ALU is the
accumulator which it uses to hold the results of additions. Typical operations performed
by the ALU include:

- addition;
- subtraction;
- shift left;
- shift right;
- logical OR;
- logical AND;
- increment;
- decrement.

2.4.5 Registers
A register is a storage location used to hold instructions,
memory addresses, data or the temporary results of
calculations. Because registers are internal to the
processor, they can be accessed at high speed.
The CPU has a set of general and special purpose
registers and the number and type of registers in
any one CPU will be different from those in another
CPU. Special purpose registers typically found within
a processor are listed below:
memory address register (MAR) is used to hold the address of a location in main
memory.
memory buffer register (MBR) is used to hold data that has just been read from main
memory or is to be written to main memory.
instruction register (IR) is used to hold the current instruction that is being executed.
program counter (PC) holds the address of the next instruction to be fetched from
memory.
The processor also has a set of general purpose registers. They are called general
purpose because their role is not defined at manufacture and can be used by
programmers as appropriate.

2.4.6 Review questions


Q25: What is the function of the CPU?
Q26: Which of the following is NOT a component of the CPU?


a) ALU
b) RAM
c) Special purpose registers

Q27: How does the control unit synchronise operations within the computer?

Q28: Which of the following describes how the ALU performs multiplication?

a) using a logical OR operation


b) successive addition
c) using an increment operation

Q29: Why are general purpose registers provided within the CPU?

Matching CPU registers to their purpose


On the web is an activity that asks you to match some CPU registers to their purpose.
You should now carry out this activity.

Matching CPU components to their descriptions


On the web is an activity that asks you to match CPU components to their descriptions.
You should now carry out this activity.

2.5 Buses
The system bus is a group of parallel wires, each
carrying a single bit of data.
In a single bus system, input/output devices (I/O) and
memory use the same communications channel.

A two bus system has a separate I/O channel and memory transfer channel. Larger
systems make use of several I/O buses for more effective operation.
A single bus system is typical of small computer systems.
The system bus must provide components with the use of a Data Bus, Address Bus and
Control Bus.


2.5.1 Address bus


The address bus is a uni-directional bus, transferring
information in one direction only.
In a single bus system, input/output devices (I/O) and
memory use the same communications channel. When
the CPU needs to put data into memory or send it to
a disk then it does so in the same way, only using
different addresses. Devices are memory mapped and
are treated by the system as if they were memory.
The address bus is made up of parallel wires, each capable of carrying 1 bit. The size of
the address bus will determine how many memory locations can be directly addressed.
To understand this, consider first an address bus width of 1-bit. There are 2 distinct
values that a single bit can represent (0 or 1). Thus a bus width of 1-bit can identify 2
unique addresses.

Now add another address line. There are now 2 lines, each of which can represent 2
distinct values. This results in 4 possible unique addresses shown below.

Now add another address line. There are now 3 lines, each of which can represent 2
distinct values, resulting in 8 unique addresses.

Can you think of a general formula to relate the number of directly addressable memory
locations to the width of the address bus? If not, try answering the questions below and
then think through the problem again.

Q30: How many memory locations can be directly addressed using 5 bits?

a) 10
b) 7
c) 32

Q31: How many memory locations can be directly addressed using 8 bits?

a) 8
b) 256
c) 16


Generally then,
The number of memory locations = 2^(width of address bus)
Thus, a 24-bit address bus will be able to distinguish between:
2^24 = 16,777,216 memory locations.
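The relationship is easy to verify in Python:

    # Each extra address line doubles the number of addressable locations.
    for width in (1, 2, 3, 5, 8, 16, 24):
        print(width, "address lines ->", 2 ** width, "locations")
    # e.g. 24 address lines -> 16777216 locations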

2.5.1.1 Addressability

When the processor has to send or receive
data or instructions from memory, cache or
external devices, the medium along which
the signal is carried is known as a bus.
This is a set of wires or lines that can each
transmit 1 bit at a time. Thus when an
instruction is read from memory the bits
are placed on the data bus and moved in
parallel to the processor. Typically the data
bus will be the same size as one memory
location.
While the data bus moves data and
instructions, a second bus, the address bus
carries the address of the memory location
to be accessed.
The size of the address bus affects the
amount of main memory offered. If there
were 8 address lines able to transmit
8 bits, the maximum number of different
addresses would be 2^8 = 256.
A 32 bit address bus addressing up to 4 Gb
of memory is a more typical current size.
When an address is being specified there has to be a way to determine whether it refers
to a main memory address or to one of the I/O interfaces that control communication
with other peripheral devices. This can be done in two ways.
Memory mapped I/O
If the interface is memory mapped, a block of main memory addresses is mapped to the
I/O interface. In this way the addressing process is exactly the same for main memory
or I/O.
If there is no memory mapping the destination is defined by a signal on one of the control
lines. This means that the same address values can be used for I/O and memory without
confusion.
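A sketch of memory-mapped addressing in Python, with an invented address range
standing in for the I/O interface:

    # Memory-mapped I/O: one block of addresses is routed to a device, not RAM.
    IO_BASE, IO_TOP = 0xF000, 0xFFFF    # invented mapping for the I/O interface
    ram = {}
    device_registers = {}

    def write(address, value):
        if IO_BASE <= address <= IO_TOP:
            device_registers[address] = value   # same addressing, different target
        else:
            ram[address] = value

    write(0x1000, 99)    # goes to main memory
    write(0xF001, 1)     # goes to the I/O interface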
The address bus is effectively one way (uni-directional) while the data bus can transfer
data in both directions (bi-directional), outwards for a write operation and inwards for a
read operation.
A third bus, the control bus has the signal that identifies the type of operation, read or


write. This can take 4 signals:

- read from memory;
- write to memory;
- read from I/O;
- write to I/O.

Therefore to fetch an instruction or data item from memory all 3 buses are used.

Address bus - address in memory or I/O device
Data bus - data to be written to or data being read from memory
Control bus - operation: read/write memory or read/write I/O
This is illustrated in Figure 2.7.

Figure 2.7: The data, address and control buses linking the CPU to cache, memory and
the I/O interfaces

2.5.2 Data bus


The data bus is used to transfer data to and from the
CPU.
The data bus is a bi-directional bus which transfers
data in both directions.

In a single bus system, the data bus is shared by main memory and external devices
such as screens, printers and disk drives.
You can appreciate why this bus needs to be bi-directional if you consider some typical
operations that are carried out. For instance, the CPU must fetch instructions from
main memory which requires transfer in the direction from main memory to the CPU. If


the stored program has instructions to calculate values and update variables, then the
results of the calculations need to be stored in main memory. This requires a transfer of
data from the CPU registers to main memory.
In the case of communicating with an external device, such as a hard disk, data must be
loaded from the device and also saved to the device. This requires bi-directional data
transfer.

2.5.3 Review questions


Q32: The purpose of the address bus is to:

a) initiate a read from memory operation


b) carry a memory address from which data can be read or to which data can be written
c) store results of calculations

Q33: Why is the address bus described as uni-directional?

Q34: The data bus is used:

a) to store the results of calculations


b) to signal a read event
c) to carry data/instructions from main memory to CPU or to carry data from CPU to
main memory

2.6 Summary
The following summary points are related to the learning objectives in the topic
introduction:

- Organisation of the component parts of a computer system;
- Description and structure of the component parts of a processor;
- The stored program concept and the fetch-execute cycle;
- The function of the processor components;
- Control line functions and timings;
- The use of buses;
- The storage of data using registers, cache, memory and backing storage;
- Accessing memory.

2.7 End of topic test


An online assessment is provided to help you review this topic.


Topic 3

Computer Performance

Contents
3.1 Introduction
3.2 Measuring performance
3.2.1 Clock speed
3.2.2 MIPS
3.2.3 FLOPS
3.2.4 Benchmark tests
3.3 Performance factors
3.3.1 Data bus width
3.3.2 Data exchange with peripherals
3.3.3 Main memory
3.3.4 Cache memory
3.3.5 Virtual storage
3.3.6 Review questions
3.4 Summary
3.5 End of topic test

Prerequisite knowledge
Before studying this topic you should be able to:

- Describe the uses and compare the features of embedded, palmtop, laptop,
  desktop and mainframe computers;
- Make comparisons in terms of processor speed, main memory, backing store and
  peripherals;
- Describe clock speed as an indicator of system performance.

Learning Objectives
By the end of this topic you will be able to:

- Describe and evaluate measures of computer performance including clock speed,
  MIPS, FLOPS, and application benchmark tests;
- Describe the factors affecting system performance including data bus width, cache
  memory and data transfer rates between peripherals;
- Describe the effects of increases in clock speed, memory and storage capacity.


Revision
The following exercise tests the prerequisites for this topic. Ensure that you are happy
with your responses before progressing.

Q1: What supplies regular control signals to the processor that control the timing of
data transfers and processes?
a) ALU
b) clock
c) control bus
d) control unit
Q2: A palmtop computer system is an example of a:
a) Embedded computer
b) Mainframe computer
c) Minicomputer
d) Microcomputer
Q3: Which of the following gives an indication of a computer’s performance?
a) Modem speed
b) Modem bandwidth
c) Clock speed
d) VDU replenishing rate

3.1 Introduction
This unit on Computer Performance considers the factors that affect the performance
of a computer system. Individual factors may well have an effect, but it is important
to consider all factors collectively to allow for a meaningful analysis of a system’s
performance.
Different measurements of performance are considered along with current trends in
improving computer specifications that aim for higher performance.

3.2 Measuring performance


When we talk about the performance of a computer we are usually interested in how
quickly it can carry out instructions, or how many instructions it can execute per second.
This is often talked about in terms of MIPS, or Millions of Instructions Per Second.
The MIPS rate can be influenced by a number of factors:
• the clock speed of the processor;
• the speed of communication lines (buses);
• the speed of memory access;
• the speed of execution of instructions.

It is important to note that improvements in technologies and design in each of these
areas have a cost, and it is the developments with the best performance-to-cost ratio
that succeed.

3.2.1 Clock speed


One of the prime factors affecting the performance of microprocessors is the clock
speed at which they run. Every processor has an internal crystal-controlled clock
which generates pulses at a regular rate. These pulses are used to synchronise the
steps the microprocessor performs during the fetch-execute cycle. All
processor activities will start on a clock pulse; for example, fetching an instruction,
placing data in the Memory Data Register, transferring an operand from a general-
purpose register to the ALU, etc.
The time between pulses is the cycle time. Early microprocessors had clock speeds
measured in kHz (thousands of cycles per second) while modern processors such as
the Pentium III can achieve speeds of about 1GHz (thousand million cycles per second),
and the Pentium 4 above 2GHz. Technology such as VLSI allowed great improvements
to be made in clock speeds.
Obviously clock speed is an important factor in determining the performance of a
microprocessor. Thus a microprocessor running at 200MHz is likely to execute
instructions faster than one which runs at 100MHz. However, when it comes to judging
performance between competing processors, clock speed may not always be a reliable
measure of relative performance. One of the reasons for this is that an instruction such
as an add may take several cycles, the number of cycles required increasing with the
complexity of the addressing method used for the operands. Table 3.1 tabulates the
clock speed versus the performance of Intel processors as measured in MIPS. MIPS is
now an outdated way to measure performance but it is the only measure applicable over
the whole range.


Table 3.1: Clock Speed Versus Performance

Intel Processor                              MIPS       Clock Speed   Year
4004 - first microprocessor on a chip        0.06       108 kHz       1971
8008 - first 8-bit microprocessor            0.06       200 kHz       1972
8080 - first general purpose CPU on a chip   0.64       2 MHz         1974
8086 - first 16-bit CPU on a chip            0.33-0.75  5-10 MHz      1978
8088                                         0.33-0.75  5-8 MHz       1979
80286 (286)                                  0.9-2.66   6-12 MHz      1982
80386DX - first 32-bit CPU                   5-11.4     16-33 MHz     1985
80486DX                                      20-41      25-50 MHz     1989
Pentium (pentium from the Greek for five)    100        60 MHz        1993
Pentium Pro                                  -          150-200 MHz   1995
Pentium II                                   -          233-300 MHz   1997
Pentium III                                  -          450-600 MHz   1999
Pentium 4                                    -          1.4-1.8 GHz   2001
Table 3.1 shows that the performance as measured by MIPS has gone up at a higher
rate than has the clock rate. For example, on these figures the average number of clock
cycles required for an instruction in the Intel 8086 processor was about 10, while in the
Pentium Pro, about two instructions per clock cycle are achieved. We shall look later at
why this has happened.

3.2.2 MIPS
In an earlier topic you met the clock when we introduced the functions of the control unit.
You learned that it was an electronic pulse, similar to a musical metronome, generated
at a constant frequency and that on each pulse, machine operations were carried out.
Clock speed will clearly have an impact on performance. If more pulses can be
generated per second, with machine operations carried out on each pulse, then it is
safe to conclude that more machine operations are carried out as clock speed increases.
This is readily observed when comparing the slower performance of older PCs, running
at clock speeds of 200 MHz, with most modern PCs, which typically operate at around
1.2 GHz.

The effect of clock speed on performance


On the web is a simulation that shows you the difference in performance of two
computers operating at different clock speeds. You should now look at this simulation.
Care needs to be taken when simply comparing clock speeds. For instance, if we
compare one processor operating at, say, 200 MHz with a different processor operating
at 200 MHz, can we say that they are equivalent in performance? No, we cannot.


Mip rate
A clock speed of 200 MHz does not mean that 200 million instructions are executed per
second. It can take at least five clock pulses to execute an instruction: one to load
the instruction, one to decode it, one to get any data that the instruction needs, one to
execute the instruction and one to store the result. In this case, a 200 MHz processor
would be capable of executing 40 million instructions per second. The time taken for
these steps is the machine cycle time; the resulting rate of execution is expressed in
mips (millions of instructions per second).
It may therefore be the case that two processors have the same clock speed but different
mip rates.
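The arithmetic in this example is easily checked. A minimal sketch in Python: the
five-pulse figure comes from the example above, while the half-pulse figure is an
illustrative stand-in for a processor, such as the Pentium Pro, that completes about two
instructions per pulse.

def mips(clock_hz, cycles_per_instruction):
    # instructions per second = pulses per second / pulses per instruction
    return clock_hz / cycles_per_instruction / 1e6

print(mips(200e6, 5))    # 40.0 - the 200 MHz worked example above
print(mips(200e6, 0.5))  # 400.0 - about two instructions per pulse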

3.2.3 FLOPS
Floating point operations per second
You should be aware that using mip rate as a comparison factor also has problems.
What sort of instructions are being carried out? There is no standard set, and so
some manufacturers could use simpler and faster instructions than others. A better
measure of performance is FLOPS (floating point operations per second). The
procedures involved in doing a floating point multiplication are basically the same for
every processor. As these kinds of operations are used in most software, they provide
the basis of a reasonable comparison of system performance.
In the past 10 years processor clock speeds have increased at a phenomenal rate. The
laws of Nature, however, limit how far clock speeds can be increased.

3.2.4 Benchmark tests


A benchmark is a well-defined, standardised routine used to test the performance of
computer systems. (Benchmark testing is also used to test software performance.)
Benchmarks consist of standard operations that measure the speed of processing in
terms of floating point operations per second (FLOPS) and in some cases the number
of instructions performed per second (MIPS).
Examples of benchmarks are the Dhrystone and Whetstone tests. The Dhrystone test
measures the processor's performance in executing frequently used statements and
string comparisons. The Whetstone test measures the processor's performance in
executing arithmetic functions.
A test to measure the efficient use of memory whilst running applications is the
MemStone test. Basic system operations, such as memory allocation and reading and
writing from various memory blocks, are measured. Testing the speed of file access is
done by randomly reading and writing data from a block of memory that is saved to a
file. Also, a series of virtual memory allocations is made to test performance in creating
and freeing memory blocks.


3.3 Performance factors


3.3.1 Data bus width
The effect of data bus width on performance
On the web is an animation that shows you how the width of the data bus affects
computer performance. You should now look at this animation.
Word size
A computer is described in terms of its word size. This is the basic number of bits that
the processor can handle in a single operation. Thus a 32-bit processor can handle 32
bits in a single operation.
An 8-bit processor can add together two 32-bit numbers but this would take quite a
few operations, whereas a 32-bit processor could perform the same task in a single
operation.
If the word size of the computer and the data bus width are the same, this allows data
transfers to and from main memory to be carried out in a single operation. However,
computers are not always designed like this and often compromises are made due to
chip fabrication and manufacturing costs. For instance, designers may build a 32-bit
machine with a 16-bit data bus. This means that a 32-bit word must be fetched from
main memory using two memory read operations: one to fetch the first 16 bits of the
data and a second to get the remaining 16 bits. Clearly this is slower than a 32-bit
machine designed to carry 32 bits on its data bus.
This does not necessarily make overall performance twice as slow, as there are other
factors to consider. However, we can say that a wider data bus will produce increased
performance.
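The cost of such a mismatch can be counted in bus transfers. A minimal sketch,
assuming the transfer count is the only difference (real systems add further overheads,
as noted above):

import math

def reads_per_word(word_bits, bus_bits):
    # memory read operations needed to fetch one word over the data bus
    return math.ceil(word_bits / bus_bits)

print(reads_per_word(32, 16))  # 2 - the 32-bit machine with a 16-bit bus
print(reads_per_word(32, 32))  # 1 - matching word size and bus width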

3.3.1.1 Trends in data bus width increases

64-bit computers appeared around 1993; however, most of today's processors have a
32-bit word size. Increases in data bus widths and clock speeds of the Intel processor
series, from 1st generation to the current 7th generation, are shown in Table 3.2.

Table 3.2: Intel processor series

Type/Generation        Year  Data Bus Width
8088 - First           1979  8 Bit
80286 - Second         1982  16 Bit
80386SX - Third        1988  16 Bit
80486SX - Fourth       1989  32 Bit
Pentium - Fifth        1993  64 Bit
Pentium Pro - Sixth    1995  64 Bit
AMD Athlon - Seventh   1999  64 Bit

3.3.2 Data exchange with peripherals


Peripheral devices connect to the system bus, or I/O bus, via slots on the back of
the computer. Such devices include printers, scanners, digital cameras, digital video
recorders, mice, keyboards etc. They also include mass storage devices such as
magnetic tape drives and disk drives.
Each device has an operational speed, uses its own language and deals with different
amounts of data at a time.
In order for these devices to communicate with the CPU they need to be interfaced.
An interface is a unit that sits between the CPU and a peripheral device and
compensates for the differences in speed, codes etc. to ensure compatibility.

3.3.2.1 Standard functions of an interface

Every interface will need to carry out the following (a code sketch follows the list):

• convert data from the format understood by the processor to the format understood
  by the peripheral;
• hold data in a buffer as it is transferred from the processor to the peripheral and
  vice versa;
• transmit/receive control signals to/from the CPU;
• maintain status information that informs the processor whether the peripheral is
  ready to send or to receive data.
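These duties can be pictured in a few lines of code. The class below is an illustrative
sketch, not a real device-driver API; the names Interface, control and status are
inventions that simply mirror the four bullet points.

class Interface:
    """Toy model of the four standard interface functions."""
    def __init__(self, convert):
        self.convert = convert    # format conversion routine (function 1)
        self.buffer = []          # holds data in transit (function 2)
        self.status = "READY"     # status information for the processor (function 4)

    def control(self, signal, data=None):  # control signals to/from CPU (function 3)
        if signal == "WRITE":
            self.buffer.append(self.convert(data))
        elif signal == "READ":
            return self.buffer.pop(0) if self.buffer else None

port = Interface(convert=str.encode)      # processor-side text -> peripheral bytes
port.control("WRITE", "hello")
print(port.control("READ"), port.status)  # b'hello' READY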

Parallel to serial conversion

On the web is a simulation that illustrates parallel to serial conversion (5 min). You
should now look at this animation.

Identifying functions of an interface

On the web is an activity that asks you to identify the standard functions of an interface.
You should now carry out this activity.

3.3.2.2 Data exchange with peripherals

Data can be transmitted in serial or parallel form. Most PCs have a combination of
both, with at least one parallel port. The keyboard and mouse connect through a serial
interface, while printers, zip drives and CD-ROMs (requiring larger amounts of data
and much faster transfer rates) connect to a parallel interface.
Serial and parallel communications operate differently.

3.3.2.2.1 Parallel transmission

With parallel transmission, each bit of an 8-bit byte is sent at the same time along a set
of parallel wires, the intention being that all bits of the byte arrive at their destination
at the same time.
Parallel transmission is clearly faster than sending out a single bit at a time, but is
recommended only where the distance between the transmitting device and the
receiving device is fairly short, for example, connecting a printer to a PC.
Over longer distances there is a possibility of skewing, where the individual bits may
arrive at their destination at different times. The data will lose its integrity. For longer
distances, where speed is not essential, serial communication is more practical.
Parallel data transmission is illustrated in Figure 3.1.


[Diagram: each bit of a single unit of data is transmitted on its own line at the same
time, within one bit-time.]

Figure 3.1: Parallel data communications

Example: Communicating with a Printer

Problem:
A printer is connected to a computer using a standard, Centronics parallel interface.
What follows is the sequence of events to transfer data to the printer (a sketch of this
handshake follows the status-line list below).
Solution:
1. The interface puts data on the parallel lines.
2. The interface signals that the byte is ready to be transmitted.
3. If ready, the printer reads the byte transmitted.
4. The printer sends back an acknowledge signal to the CPU.
5. The interface prepares the next byte.

The interface also contains a set of status wires that can be used to signal events such
as:

• wait, because the printer is busy processing data already sent;
• error, such as a paper jam occurring or the printer being out of paper.
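A minimal sketch of this byte-at-a-time handshake follows. The Printer class and the
send function are illustrative inventions that model the exchange; they are not the
actual Centronics signal names.

class Printer:
    """Stand-in for the device at the far end of the parallel cable."""
    def __init__(self):
        self.received = []
        self.ready = True              # status line: not busy, no error

    def read_byte(self, byte):
        self.received.append(byte)
        return "ACK"                   # acknowledge signal back to the CPU

def send(data, printer):
    for byte in data:                  # the interface prepares the next byte
        while not printer.ready:       # wait while the printer is busy
            pass
        ack = printer.read_byte(byte)  # byte placed on the parallel lines
        assert ack == "ACK"            # transfer confirmed before continuing

p = Printer()
send(b"HELLO", p)
print(bytes(p.received))               # b'HELLO'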

Identifying the characteristics of parallel communications


On the web is an activity that asks you to identify the characteristics of Parallel
Communications. You should now carry out this activity.


3.3.2.2.2 Serial transmission

With serial transmission, each bit of the byte is sent out, one at a time, over the
communications line. With asynchronous transmission the process of sending out the
bits can be started as soon as the byte is available.
A start bit is used to signal to the receiving device that transmission is beginning,
followed by each bit of the byte. There may or may not be a parity bit sent as part of
the byte. Finally the transmitter waits a period of time, marking an output level that is
known as a stop bit.
Asynchronous serial communication of a single byte is illustrated in Figure 3.2

Figure 3.2: Serial data communications


With synchronous transmission, data transfer between two devices is timed to coincide
with a clock pulse.
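The framing just described can be sketched in a few lines. The choices below (start bit
low, stop bit high, least-significant bit first, even parity) match common UART practice
but are assumptions beyond the text.

def frame_byte(byte, parity=True):
    bits = [(byte >> i) & 1 for i in range(8)]  # data bits, LSB first
    frame = [0] + bits                          # start bit (assumed to be 0)
    if parity:
        frame.append(sum(bits) % 2)             # even parity over the data bits
    frame.append(1)                             # stop bit (assumed to be 1)
    return frame

print(frame_byte(ord("A")))  # [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1]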

Example: Communicating with a mouse

Problem:
As there is no pressing need for speed, a mouse is connected to a computer using a
serial DB9 or DB25 connector. A universal serial bus (USB) may also be used. A USB
interface is a typical port used to connect many devices.
Solution:
1. A single data bit at a time is sent to the CPU - a typical rate is 1200 bits per second.
2. The interface converts this stream of bits to parallel form to send on the system bus.
3. Once the byte is ready, the interface asserts an interrupt request.
4. The CPU acknowledges the request and the interface places the byte on the system
bus.

The CPU may choose to ignore the interrupt request which may lead to data loss. This
is known as data overrun. In this case the interface must tell the CPU that data has
been lost.

Identifying the characteristics of serial communications


On the web is an activity that asks you to correctly identify the characteristics of Serial
Communications. You should now carry out this activity.

3.3.3 Main memory


There are various aspects of main memory that can affect system performance. These
include:

• speed of access;
• word size;
• amount of memory;
• cache memory.

The first two aspects are dictated by the processor and logic board (motherboard). We
will look in more detail at how the amount of memory and the use of cache memory
affects system performance.


Amount of memory
Main memory is a mixture of random access memory (RAM), read only memory (ROM)
and empty space. Empty space means there is less physical memory present than can
be directly addressed.
Physical memory can therefore be expanded by adding more memory modules as and
when required. This is known as a memory upgrade.
If your computer is struggling to run some software, or you cannot load all the software
you want at the one time, then adding extra memory will improve your system's
performance. For example, if you are using an application that needs to manipulate
large graphic files, video or sound, then you should be thinking of upgrading RAM to
the computer's maximum capability. Machines typically come with 32 Mbytes or more
RAM, which can be upgraded to 256 Mbytes or even more.
When there is insufficient main memory then the hard disk can be used as an extension.
This is known as virtual memory and results in slower performance, since swapping
data from main memory to hard disk and loading from the hard disk to main memory is
much slower than directly accessing main memory.


Cache memory
Main memory bus speeds are not able to match the speed of the CPU, and cache
memory is used to speed up this transfer. This is a small amount of very fast SRAM
that can reside inside the processor or sit between the processor and main memory.
When writing to main memory the CPU uses the cache to deposit data and then
resumes its operations immediately. The data is transferred to main memory by the
cache controller circuitry.
When reading from memory the CPU first checks whether the information is already
available in the cache memory. If so, then it can transfer this at high speed to the CPU.

How cache memory is used by the processor


On the web is a simulation showing how the processor makes use of cache memory.
You should now look at this simulation.

3.3.4 Cache memory


Program instructions are usually read sequentially. From one instruction it would be
reasonable to assume that the next instruction required will be in the next memory
location. This assumption is used to increase processor efficiency.
Although the movement of data within the processor is getting faster and faster, the
system buses are not keeping up. This leads to wasted time while the processor waits
for data to be fetched from memory.
To reduce this problem most machines now have a second, smaller, area of memory
known as cache memory. This is usually SRAM, which is faster than DRAM; although
it is much smaller than RAM, there is a benefit from the fact that it is always faster to
access a small memory segment.
When data or an instruction is read from memory, the memory locations following are
copied into the cache memory. At the next read instruction the cache memory is read
first. If the data is in cache the access time will be much lower than going to main
memory. If the data is not in cache then main memory will be accessed, and although
there is a slight loss of time from reading twice, the overall time saving in this method is
quite significant.
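The saving described in this paragraph can be quantified as a weighted average of the
two cases. The hit rate and access times below are illustrative figures, not values from
the text.

def effective_access_ns(hit_rate, cache_ns=5, main_ns=60):
    # A hit costs one cache access; a miss costs the failed cache
    # probe plus the main memory access that follows it.
    return hit_rate * cache_ns + (1 - hit_rate) * (cache_ns + main_ns)

print(effective_access_ns(0.9))  # 11.0 ns, against 60 ns with no cache at all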

Observing cache memory


On the web is a simplified simulation of the operations of cache memory. You should
now look at this simulation.
The contents of cache are simultaneously held in RAM. When data is to be written back
to memory it must be written to cache so that the cache is kept current. At some stage it
will also have to be written back to RAM. Cache that has not been updated doesn't have
to be copied back to memory; it is just removed from cache when it is to be replaced by
something the processor has a greater need for. There are two different ways to update
cache memory (contrasted in the sketch below).

• Write-through cache: when cache is updated, memory is updated at the same time.
• Write-back cache: cache is updated, but RAM is not updated until the content of
  cache is being cleared. Write-back requires fewer write operations, but there is an
  overhead in managing the selected updates. Write-back cache is generally about
  10% faster than write-through cache.
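The contrast between the two policies can be seen in a toy model. The classes below
are illustrative sketches (a real cache controller works line-by-line in hardware); the
point is simply where and when RAM gets written.

class WriteThroughCache:
    def __init__(self, ram):
        self.ram, self.lines = ram, {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.ram[addr] = value          # memory updated at the same time

class WriteBackCache:
    def __init__(self, ram):
        self.ram, self.lines, self.dirty = ram, {}, set()

    def write(self, addr, value):
        self.lines[addr] = value        # RAM is left stale for now
        self.dirty.add(addr)

    def evict(self, addr):
        if addr in self.dirty:          # only updated lines are copied back
            self.ram[addr] = self.lines[addr]
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

ram = {}
wb = WriteBackCache(ram)
wb.write(100, "x"); wb.write(100, "y")  # two cache writes, no RAM traffic yet
wb.evict(100)
print(ram)                              # {100: 'y'} - one RAM write instead of two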

3.3.5 Virtual storage


A typical processor today may have a 32-bit address space allowing it to address 4Gb
of memory; however, it is rare to find a machine equipped with a full 4Gb of RAM. In
contrast, it is very common to find machines equipped with large, cheap amounts of
hard disk – which can easily be in excess of 40Gb. It therefore seems quite apparent
that using some of the hard disk as slow memory would allow us to utilise the full address
space of the processor.
Prior to virtual memory, if data was too large to fit into the main memory the programmer
had to break the data up into smaller sections, called overlays, each of which could fit
within the main memory. The overlays would be stored on disk, and loaded into the
main memory as required. However, the programmer had to manage the process of
switching overlays and communicating between them. This approach was commonly
used by applications forced to run on small and simplistic operating systems such as
DOS.
Virtual memory not only automates the transfer of data from disk to main memory but,
unlike overlays, does so in such a way that the entire address space appears usable.
Furthermore, virtual memory can be allocated on a per-process basis, i.e. if
the operating system is running multiple processes (multiprocessing), then each process
appears to have the full address space entirely to itself. Multiprocessing operating
systems are now common; typical examples include Windows XP, Linux, OSX etc.
Virtual memory and cache memory share some similarities, and it may be convenient
to think of virtual memory as yet another layer on the bottom of the memory hierarchy.
Unlike cache, virtual memory requires support of the operating system at the very least
to access the hard disk. In addition, the algorithms used to manage virtual memory are
typically far more advanced, since they must compensate for the slow speed of the hard
disk and coordinate the sharing of memory between the processes that are running.
It is therefore not uncommon to find sophisticated sharing, compression, and access
prediction techniques being used to optimise virtual memory management.
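The per-process illusion rests on address translation. A minimal sketch, assuming a
single-level page table and 4 KB pages (both deliberate simplifications; real systems
use multi-level tables and far more sophisticated policies):

PAGE_SIZE = 4096   # an assumed 4 KB page, a common but not universal choice

def translate(virtual_addr, page_table):
    page, offset = divmod(virtual_addr, PAGE_SIZE)
    if page not in page_table:
        # the page is on disk: the operating system must fetch it
        raise LookupError("page fault")
    return page_table[page] * PAGE_SIZE + offset

table = {0: 7, 1: 3}                  # this process's virtual page -> physical frame
print(hex(translate(0x1234, table)))  # virtual page 1, frame 3 -> 0x3234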

3.3.6 Review questions


Q4: Computer A and Computer B each operate at 200 MHz. Why can it not be said
that they are equivalent in performance?

Q5: What is meant by the term Virtual Memory?

a) memory that is not real
b) using the hard disk as an extension of main memory
c) adding more RAM modules


Q6: Explain why an increase in the width of the data bus can improve system
performance.

Q7: A 24-bit address bus allows how many directly addressable locations?

a) 2 x 24
b) 2^6
c) 2^24

Q8: Explain why an increase in the width of the address bus has an effect on system
performance.

Q9: A memory upgrade involves:

a) Redesign of the address bus
b) Insertion of additional memory modules
c) Replacement of the motherboard

Q10: How would you calculate the number of memory locations that could be directly
addressed using a 16-bit address bus?

3.4 Summary
The following summary points are related to the learning objectives in the topic
introduction:

• Indicators of computer performance include: clock speed, MIPS, FLOPS;
• Performance can be evaluated using these indicators;
• Benchmark testing is used to measure performance;
• Other factors affecting performance include: data bus width, cache memory and
  data transfer rates;
• Increasing clock speed, memory and storage capacity may improve performance.

3.5 End of topic test


An online assessment is provided to help you review this topic.
