MOD 3

The document discusses instruction sets and control units in computer architecture, detailing the elements of machine instructions, instruction formats, and categories based on operations performed. It explains instruction set architecture (ISA) as an interface between hardware and software, and categorizes instructions by the number of operand addresses and types of operations such as data movement, processing, and flow control. Additionally, it includes examples of memory traffic calculations for different instruction types and problems for evaluation.

Uploaded by

prakshalwork
Copyright
© © All Rights Reserved
BCSE205L

Computer Architecture and Organization


Module 3 – Instruction Sets and Control Unit

Dr. C.R.Dhivyaa
Assistant Professor
School of Computer Science and Engineering
Vellore Institute of Technology, Vellore
Computer Instructions: Instruction sets
• Instruction
• A statement that determines an operation of the CPU.
• These instructions are referred to as “machine instructions” or “computer instructions”.

• Elements of Machine instruction


• Operation code
• Source operand reference
• Result operand Reference
• Next instruction reference
Computer Instructions: Instruction sets
• Program
• A sequence of instructions that operates on data to perform a certain task.
• Instruction
• A statement that determines an operation of the CPU.
• Instruction Cycle
Instruction Set Architecture
• Instruction Set Architecture (ISA) - defines how the CPU is controlled by the software
• The ISA acts as an interface between the hardware and the software, specifying both what the processor is capable of doing and how it is instructed to do it
• It defines the supported data types, the registers, how the hardware manages main memory, key features (such as virtual memory), the instructions a microprocessor can execute, and the input/output model shared by all implementations of that ISA
Instruction formats
• Each instruction is represented by a sequence of bits
• The instruction is divided into two fields
• Opcode field
• Operand field
• The operand field is further divided into one to four fields.
• This layout of the instruction is known as the “Instruction Format”
• Simple instruction format
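The opcode/operand split can be sketched in a few lines of Python. The field widths here (a 16-bit word with a 4-bit opcode and two 6-bit operand references) are an assumption for illustration only; the slide's own format figure is not reproduced in this text.

```python
# Assumed layout (illustrative, not taken from the slide):
# | opcode: 4 bits | operand 1: 6 bits | operand 2: 6 bits |  = 16 bits
OPCODE_BITS = 4
OPERAND_BITS = 6
OPERAND_MASK = (1 << OPERAND_BITS) - 1

def encode(opcode, op1, op2):
    """Pack the three fields into one 16-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in its field")
    if not (0 <= op1 <= OPERAND_MASK and 0 <= op2 <= OPERAND_MASK):
        raise ValueError("operand reference does not fit in its field")
    return (opcode << (2 * OPERAND_BITS)) | (op1 << OPERAND_BITS) | op2

def decode(word):
    """Split an instruction word back into (opcode, operand 1, operand 2)."""
    opcode = word >> (2 * OPERAND_BITS)
    op1 = (word >> OPERAND_BITS) & OPERAND_MASK
    op2 = word & OPERAND_MASK
    return opcode, op1, op2

word = encode(0b0011, 5, 9)
print(f"{word:016b}")   # 0011000101001001
print(decode(word))     # (3, 5, 9)
```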
Instruction Set category
• Instruction Set is categorized into types based on
• Operation performed
• Number of operand addresses
• Addressing modes.
Instruction Set category
• Based on Operation being performed
• Data movement / Transfer → Move data from a memory location or register to another memory location or register without changing its form.
• Different data transfers:
• Memory ↔ processor registers
• Processor registers ↔ input or output
• Processor register ↔ processor register
• Load: transfer from memory to a processor register, usually the accumulator (AC) (memory read)
• Store: transfer from a processor register into memory (memory write)
• Move: transfer from one register to another register
• Exchange: swap information between two registers, or between a register and a memory word
• Input/Output: transfer data between processor registers and input/output devices (I/O instructions)
• Push: transfer data from a processor register to the memory stack
• Pop: transfer data from the memory stack to a processor register
Instruction Set category
• Based on Operation being performed
• Data processing /Manipulation→ Arithmetic and logic (ALU) instructions - Changes
the form of one or more operands to produce a result stored in another location
I. Arithmetic,
II. Logical and bit manipulation,
III. Shift Instruction
Arithmetic instructions:
Logical and Bit Manipulation
Shift Instructions
• Logical shift left
• Logical shift right
• Arithmetic shift left
• Arithmetic shift right
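The four shift types can be sketched on an 8-bit register as follows. The register width and the function names are assumptions for illustration; the arithmetic right shift keeps the sign bit, which is what distinguishes it from the logical right shift.

```python
WIDTH = 8                      # assumed register width
MASK = (1 << WIDTH) - 1        # 0xFF
SIGN_BIT = 1 << (WIDTH - 1)    # 0x80

def logical_shift_left(x):
    """Shift left by one; a 0 enters at the right, the leftmost bit is lost."""
    return (x << 1) & MASK

def logical_shift_right(x):
    """Shift right by one; a 0 enters at the left, the rightmost bit is lost."""
    return (x & MASK) >> 1

def arithmetic_shift_left(x):
    """Same bit movement as the logical shift left (overflow occurs if the sign bit changes)."""
    return (x << 1) & MASK

def arithmetic_shift_right(x):
    """Shift right by one while keeping the sign bit unchanged."""
    return ((x & MASK) >> 1) | (x & SIGN_BIT)

x = 0b10011010
print(f"{logical_shift_left(x):08b}")      # 00110100
print(f"{logical_shift_right(x):08b}")     # 01001101
print(f"{arithmetic_shift_right(x):08b}")  # 11001101
```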
• Rotate left (8-bit buffer; the bit leaving the left end re-enters at the right end)
Rotations | Buffer
0 (initial) | 0 0 0 1 1 0 1 0
1 | 0 0 1 1 0 1 0 0
2 | 0 1 1 0 1 0 0 0
• Rotate right (the bit leaving the right end re-enters at the left end)
Rotations | Buffer
0 (initial) | 0 0 0 1 1 0 1 0
1 | 0 0 0 0 1 1 0 1
2 | 1 0 0 0 0 1 1 0
• Rotate left through carry (the bit leaving the left end goes to Carry; the old Carry enters at the right end; Carry is initially 0)
Rotations | Buffer | Carry
0 (initial) | 0 0 0 1 1 0 1 0 | 0
1 | 0 0 1 1 0 1 0 0 | 0
2 | 0 1 1 0 1 0 0 0 | 0
• Rotate right through carry (the bit leaving the right end goes to Carry; the old Carry enters at the left end; Carry is initially 0)
Rotations | Carry | Buffer
0 (initial) | 0 | 0 0 0 1 1 0 1 0
1 | 0 | 0 0 0 0 1 1 0 1
2 | 1 | 0 0 0 0 0 1 1 0
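A minimal sketch of the four rotate operations on an 8-bit buffer, using the same starting value 0 0 0 1 1 0 1 0. The function names are illustrative assumptions; the through-carry variants return the new buffer together with the new carry bit.

```python
MASK = 0xFF  # 8-bit buffer, as in the slides

def rotate_left(x):
    """The leftmost bit re-enters at the right end."""
    return ((x << 1) | (x >> 7)) & MASK

def rotate_right(x):
    """The rightmost bit re-enters at the left end."""
    return ((x >> 1) | ((x & 1) << 7)) & MASK

def rotate_left_through_carry(x, carry):
    """The leftmost bit goes to carry; the old carry enters at the right end."""
    return ((x << 1) | carry) & MASK, x >> 7

def rotate_right_through_carry(x, carry):
    """The rightmost bit goes to carry; the old carry enters at the left end."""
    return (x >> 1) | (carry << 7), x & 1

x = 0b00011010
print(f"{rotate_left(x):08b}")    # 00110100
print(f"{rotate_right(x):08b}")   # 00001101

buffer, carry = x, 0
for _ in range(2):
    buffer, carry = rotate_right_through_carry(buffer, carry)
print(f"{buffer:08b}", carry)     # 00000110 1
```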
Instruction Set category
• Based on Operation being performed
• Flow Control / Program control → Any instruction that alters the normal flow of control from executing the next instruction in sequence
• Conditional
• JNZ, JZ
• Unconditional
• Jump

Operation Name | Description
Jump (unconditional) | Unconditional transfer
Jump (conditional) | Test specified condition
Jump to subroutine | Jump to specified address
Return | Replace the content of PC
Execute | Execute instructions
Skip | Increment PC to skip next instruction
Skip conditional | Test condition for skip
Halt | Stop program execution
Wait (hold) | Stop program execution and resume when condition is satisfied
No operation | No operation performed, but program execution continues
Instruction Set category
• Based on Number of operand addresses
• The instruction set is categorized into five categories based on the number of operand addresses in the instruction.
• 4-Address Instruction
• 3-Address Instruction
• 2-Address Instruction
• 1-Address Instruction
• 0-Address Instruction
Instruction Set category
Three Address Instruction
Example:

Two Address Instruction


R1 ← A
R1 ← R1 + B
R2 ← C
R2 ← R2 + D
R1 ← R1 * R2
M[X] ← R1
Instruction Set category
One Address Instruction

Zero Address Instruction

Note:
1. One-address and zero-address instructions do not use general-purpose registers.
2. Zero address → The operation (e.g., add) does not require explicit operand addresses because it operates implicitly on a stack.
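The implicit operand handling of a zero-address machine can be illustrated with a small stack interpreter. This is a sketch under stated assumptions: the mnemonics PUSH, POP, ADD, SUB and MUL, the expression X = (A + B) * (C + D), and the memory contents are all chosen for illustration.

```python
def run_stack_program(program, memory):
    """Execute zero-address code: PUSH/POP name a memory operand, ALU ops use the stack."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op == "PUSH":
            stack.append(memory[args[0]])
        elif op == "POP":
            memory[args[0]] = stack.pop()
        elif op in ("ADD", "SUB", "MUL"):
            right = stack.pop()   # top of stack is the second operand
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory

# X = (A + B) * (C + D) with assumed data values
program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
memory = run_stack_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
print(memory["X"])  # 45
```

Note that ADD and MUL carry no address at all; only PUSH and POP name a memory location.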
Calculation of Memory traffic
• Assumptions
• 24-bit memory address (3 bytes)
• 128 instructions (7 bits rounded to 1 byte)
4 – address Instruction Design
Memory Required to store an Instruction: 5 x 3 bytes = 15 Bytes
• Because of the large instruction word size and the number of memory accesses, the 4-address instruction format is not seen in machine designs.
• The 4-address structure is, however, used internally in some implementations of computer control units. This kind of controller implementation is known as microcoded control.
3 – address Instruction Design
Memory Required to store an Instruction: 4 x 3 bytes = 12 Bytes
3-Address instruction:
• The address of the next instruction is kept in a processor state register, the PC (except for explicit branches/jumps)
• The rest of the addresses are in the instruction
• Encoding a 3-address ALU instruction requires 3 x 3 + 1 = 10 bytes.
Number of memory accesses required for a 3-address instruction:
• Four words are transferred to the CPU when the instruction itself is fetched = 4
• The two words representing the operands are then fetched into the CPU = 2
• After the addition has been performed, the result is written back to memory = 1
• Total = 07
2 – address Instruction Design
2-address Instruction:
• Format: add Op1Addr Op2Addr
• Result overwrites Operand 2
• Needs only 2 addresses in the instruction, but gives less choice in placing data
• Encoding a 2-address ALU instruction requires 2 x 3 + 1 = 7 bytes.
Number of memory accesses required for a 2-address instruction:
• Three words are transferred to the CPU when the instruction itself is fetched = 3
• The two words representing the operands are fetched into the CPU and, after the addition has been performed, the result overwrites an operand in memory = 3
• Total = 06
1 – address Instruction Design
Memory Required to store an Instruction: 2 x 3 bytes = 06 Bytes
• Result is stored in the Accumulator, hence no memory access is required to store it
0 – address Instruction Design
• PUSH: Fetch Instruction – 2 memory accesses, Fetch Operand – 1 memory access (PUSH places the operand on the stack, hence no separate store in memory)
• ADD: Fetch Instruction – 1 memory access, Fetch Operand & Execute – 0 memory accesses (ADD takes its operands from the stack and leaves the result on the stack, hence no store in memory)
Comparisons
Instruction Type | Memory to Store (bytes) | Memory to Encode (bytes) | M/As to Fetch an Instruction | M/As to Execute an Instruction | Memory Traffic
4-address | 5 x 3 = 15 | 1 + (4 x 3) = 13 | 5 | 3 | 5 + 3 = 8
3-address | 4 x 3 = 12 | 1 + (3 x 3) = 10 | 4 | 3 | 4 + 3 = 7
2-address | 3 x 3 = 09 | 1 + (2 x 3) = 07 | 3 | 3 | 3 + 3 = 6
1-address | 2 x 3 = 06 | 1 + (1 x 3) = 04 | 2 | 1 | 2 + 1 = 3
0-address | 1 x 3 = 03 | 1 + (0 x 3) = 01 | 1 (ADD..) or 2 (PUSH..POP) | 0 (ADD..) or 1 (PUSH..POP) | 1 + 0 = 1 (ADD..)
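The rows of the comparison table follow a simple pattern that can be sketched in code. The function below is a minimal illustration for an ALU instruction such as ADD, assuming the slide's parameters (3-byte addresses, 1-byte opcode, 3-byte memory words); the function name and the per-type execute counts encoded in it are read off the table rather than derived from a general rule.

```python
ADDRESS_BYTES = 3   # 24-bit memory address
OPCODE_BYTES = 1    # 7-bit opcode rounded up to 1 byte
WORD_BYTES = 3      # assumed memory word size, matching the address size

def alu_instruction_cost(n_addresses):
    """Return (store bytes, encode bytes, fetch M/As, execute M/As, traffic)."""
    store = (n_addresses + 1) * WORD_BYTES            # one word for the opcode, one per address
    encode = OPCODE_BYTES + n_addresses * ADDRESS_BYTES
    fetch = n_addresses + 1                           # words read to fetch the instruction
    if n_addresses >= 2:
        execute = 3      # read two operands, write one result
    elif n_addresses == 1:
        execute = 1      # read one operand; the result stays in the accumulator
    else:
        execute = 0      # operands and result stay on the stack
    return store, encode, fetch, execute, fetch + execute

for n in (4, 3, 2, 1, 0):
    print(f"{n}-address:", alu_instruction_cost(n))
```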
Problems – Example1
• Evaluate a = (b+c)*d – e in 3-, 2-, 1-, 0- address machines and
compute the memory traffic. Assume 24 bit memory address and one
byte opcode.
3-address | 2-address | 1-address | 0-address
add a,b,c | load a,b | load b | push b
mul a,a,d | add a,c | add c | push c
sub a,a,e | mul a,d | mul d | add
 | sub a,e | sub e | push d
 | | store a | mul
 | | | push e
 | | | sub
 | | | pop a
Memory traffic for 3-address machine: 7 * 3 = 21 (maximum)
Memory traffic for 2-address machine: 6 * 4 = 24 (maximum)
Memory traffic for 1-address machine: 3 * 5 = 15 (maximum)
Memory traffic for 0-address machine: 3 * 5 + 3 = 18 (maximum)
Three-address | Meaning | Memory to Store | Memory to Encode | M/As to Fetch | M/As to Execute | Memory Traffic
add a, b, c | a ← b + c | 4*3=12 | 1+(3*3)=10 | 4 | 3 | 4+3=7
mpy a, a, d | a ← a * d | 4*3=12 | 1+(3*3)=10 | 4 | 3 | 4+3=7
sub a, a, e | a ← a - e | 4*3=12 | 1+(3*3)=10 | 4 | 3 | 4+3=7
Total | | 36 | 30 | 12 | 9 | 21

Two-address | Meaning | Memory to Store | Memory to Encode | M/As to Fetch | M/As to Execute | Memory Traffic
load a, b | a ← b | 3*3=9 | 1+(2*3)=7 | 3 | 2 | 3+2=5
add a, c | a ← a + c | 3*3=9 | 1+(2*3)=7 | 3 | 3 | 3+3=6
mpy a, d | a ← a * d | 3*3=9 | 1+(2*3)=7 | 3 | 3 | 3+3=6
sub a, e | a ← a - e | 3*3=9 | 1+(2*3)=7 | 3 | 3 | 3+3=6
Total | | 36 | 28 | 12 | 11 | 23

One-address | Meaning | Memory to Store | Memory to Encode | M/As to Fetch | M/As to Execute | Memory Traffic
load b | Acc ← b | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 2+1=3
add c | Acc ← Acc + c | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 2+1=3
mpy d | Acc ← Acc * d | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 2+1=3
sub e | Acc ← Acc - e | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 2+1=3
store a | a ← Acc | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 2+1=3
Total | | 30 | 20 | 10 | 5 | 15

Zero-address | Memory to Store | Memory to Encode | M/As to Fetch | M/As to Execute | Memory Traffic
push b | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 3
push c | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 3
add | 1*3=3 | 1 | 1 | 0 | 1
push d | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 3
mpy | 1*3=3 | 1 | 1 | 0 | 1
push e | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 3
sub | 1*3=3 | 1 | 1 | 0 | 1
pop a | 2*3=6 | 1+(1*3)=4 | 2 | 1 | 3
Total | 39 | 23 | 13 | 5 | 18
Problems – Example 2
Assume 24 bit memory address and one byte opcode.
Practice problems
• Evaluate the expression a = b - c * d and compute the memory traffic for 4-, 3-, 2-, 1- and 0-address machines. Assume that addresses are 16 bits, data values are 16 bits, opcodes are 8 bits, and the word length is 1 byte.

• Develop a comparative table of the performance parameters (memory to store, memory to encode, M/As to fetch, M/As to execute, and total memory traffic) for 4-, 3-, 2-, 1- and 0-address machine instructions. Consider the following specifications: memory word size is 1 byte, memory/register address size is 2 bytes, opcode size is 1 byte.

i. Write an appropriate assembly language program using 3-address, 2-address, 1-address and 0-address machine instructions for the following expression (with registers and without registers). Assume that all are integer operations.
X = (A / B + C * D) / (D * E - F + C / A) + G

ii. Compute the performance factors: memory to store the program, memory to encode the whole program, memory accesses to fetch and execute, and memory traffic.
Addressing Modes

• Instruction Set is categorized Based on Addressing Modes


• Addressing Mode
• The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
• The addressing mode determines how the effective address is obtained
• It is the way the operands are specified in the instruction
• The operation to be performed is indicated by the opcode.
• Operands can be in registers, in memory, or embedded in the instruction

• Effective Address
• The address at which the actual operand is available is called the effective address
Addressing Modes
• Terminologies
• Displacement: It is an 8 bit or 16 bit immediate value given in the
instruction.
• Base: Contents of base register, BX or BP.
• Index: Content of index register SI or DI.
Classification of Addressing Modes
1. Stack (Implied/Implicit) Addressing mode
2. Immediate Addressing mode
3. (Memory) Direct Addressing mode
4. Register Direct Addressing mode
5. (Memory) Indirect Addressing mode
6. Register Indirect Addressing mode
7. Displacement Addressing modes
   1. Indexed Addressing mode
   2. Base register Addressing mode
   3. Relative Addressing Mode
8. Auto Increment Addressing Mode
9. Auto Decrement Addressing Mode
1. Stack (Implied/Implicit) Addressing mode
• The definition of the instruction itself specifies the operands implicitly.
• The operand is implied / specified implicitly in the instruction itself.
• Operations like PUSH and POP are used for the computation
• Zero-address instructions in a stack-organized computer are implied mode instructions.
• Effective Address (EA) = AC or Stack[SP]
• Instruction format: Opcode
• Examples:
• DEC (Decrement A register)
• CLC (used to reset Carry flag to 0)
• PUSH
• POP
• Advantage:
• Instruction specifies a fixed and unvarying address
• No memory references
• Disadvantage:
• Limited computational capacity
2.Immediate Addressing mode
• The simplest form of addressing
• Effective Address (EA) = Value
• Data is a part of instruction itself.
• Instruction format: Opcode | Operand
• Example:
• MOVE #100, R1
• Here the data 100 is moved to R1.
• MVI #01, A
• MVI stands for Move Immediate. This moves the value 01 to A.
• Advantage:
• This mode can be used to define and use constants or set of initial values of variables.
• No memory references
• Disadvantage:
• Limited Operand size
3. (Memory) Direct Addressing mode
• The address where the data is available is part of the instruction
• The address field contains the effective address of the operand
• Effective Address (EA) = LOC
• Example:
• MOVE A, R1
• Here the data in memory location A is moved to R1.
• Advantage:
• Large operand magnitude, simple
• Disadvantage:
• Limited address size
• A change in the location of the program requires a change in all absolute memory references.
4. Register Direct Addressing mode
• Register addressing is similar to direct addressing
• The only difference is that the address field refers to a register rather than a main memory address
• Effective Address (EA) = Ri
• Example:
• MOVE R2, R1
• Here the data in register R2 is moved to R1.
• Advantage:
• No memory reference
• Disadvantage:
• Limited number of registers
5. (Memory) Indirect Addressing mode
• The address field contains the address of the effective address of the operand
• The referenced memory word contains a full-length address of the operand
• Effective Address (EA) = (LOC) or [LOC]
• Example:
• MOVE (A), R1
• Here A holds another memory address (not data)
• The data at the address held in A is moved to R1.
• Advantage:
• Large address space
• Disadvantage:
• A change in the location of the program requires a change in all absolute memory references
6. Register Indirect Addressing mode
• Register indirect addressing is analogous to indirect addressing
• The only difference is whether the address field refers to a memory location or a register.
• Effective Address (EA) = (Ri) or [Ri]
• Example:
• MOVE (R2), R1
• Here R2 holds a memory address (not data)
• The data at the address held in R2 is moved to R1.
• Advantage:
• Large address space
• Disadvantage:
• Extra memory space
7. Displacement Addressing modes – Indexed Addressing mode
• The address field references a main memory address, and the referenced register contains a positive displacement from that address.
• The base register holds the beginning location of a memory array, while the index register holds the relative position of an element in the array.
• The index value of an array is stored in the index register.
• Effective Address (EA) = (Ri) + X
• Example:
• MOVE 20 (R2), R1
• Here R2 holds a memory address (not data).
• The address in R2 is added to the index value 20 to give the EA.
• The data at address R2+20 is moved to R1.
• Works for arrays.
• Advantage:
• Spatial locality
7. Displacement Addressing modes – Base register Addressing mode
• The address field references a main memory address, and the referenced register contains a positive displacement from that address.
• The base register holds the beginning location of a memory array, while the index register holds the relative position of an element in the array.
• The base address of the array is stored in the base register.
• Effective Address (EA) = (Ri + BX)
• Example:
• ADD AX, [BX+SI]
• ADD R1, (R2+R3)
• Meaning: R1 ← R1 + M[R2+R3]
• Here R2 and BX are base registers
• R3 and SI are index registers
• EA = sum of the contents of the base register and the index register
• Variants:
• ADD R1, (R2+3) --- Base
• ADD R1, (R2+R3) --- Base with Index
• ADD R1, 20(R2+R3) --- Base with Index and Offset
• Works for arrays.
• Advantage:
• It is a convenient means of implementing segmentation.
• Disadvantage:
• Complexity
7. Displacement Addressing modes – Relative Addressing Mode
• PC-relative addressing: the implicitly referenced register is the program counter (PC)
• The effective address is the offset parameter added to the address of the next instruction.
• Effective Address (EA) = (PC) + X
• Example:
• If Branch > 0, JUMP -200
• Here, if the condition holds, the constant value -200 is added to the address in the PC.
• Advantage:
• With program-relative addressing, the code may be position-independent
• Disadvantage:
• Complexity
8. Auto Increment Addressing mode
• The register is incremented after memory has been accessed
• The effective address is the content of the register before it is incremented.
• Effective Address (EA) = (Ri); Increment
• Example:
• ADD (R2)+, R0
• Here R2 holds the operand address
• After the operand has been accessed, the register content is automatically incremented.
• Advantage:
• Useful while transferring large chunks of contiguous data.
• Disadvantage:
• Complexity
9. Auto Decrement Addressing mode
• The register is decremented first, and then memory is accessed
• The effective address is the content of the register after it has been decremented.
• Effective Address (EA) = Decrement; (Ri)
• Example:
• ADD -(R2), R0
• Here R2 holds the operand address after the decrement
• The content of the register is decremented first, and the new content gives the operand address.
• Advantage:
• Useful while transferring large chunks of contiguous data.
• Disadvantage:
• Complexity
Addressing modes
Problems
Find the effective address and the content of AC for the given data.
Addressing Mode | Effective Address | Operation | Content of AC
Direct address | 500 | AC ← (500) | 800
Immediate operand | 201 | AC ← 500 | 500
Indirect address | 800 | AC ← ((500)) | 300
Relative address | 702 | AC ← (PC + 500) | 325
Indexed address | 600 | AC ← (XR + 500) | 900
Register | - | AC ← R1 | 400
Register indirect | 400 | AC ← (R1) | 700
Autoincrement | 400 | AC ← (R1)+ | 700
Autodecrement | 399 | AC ← -(R1) | 450
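The table can be reproduced with a short calculation. The machine state below is an assumption: the original data figure is not included in this text, so the register and memory values were chosen to be consistent with the results listed (a two-word instruction at locations 200 and 201 with address field 500, R1 = 400, XR = 100).

```python
# Assumed machine state (illustrative; chosen to agree with the table).
ADDRESS_FIELD_LOCATION = 201   # second word of the instruction
ADDRESS_FIELD = 500            # value stored in the address field
PC_AFTER_FETCH = 202           # PC points to the next instruction
R1 = 400
XR = 100
MEMORY = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}

def resolve(mode):
    """Return (effective address, value loaded into AC) for one addressing mode."""
    if mode == "direct":
        ea = ADDRESS_FIELD
    elif mode == "immediate":
        return ADDRESS_FIELD_LOCATION, ADDRESS_FIELD   # the operand is in the instruction
    elif mode == "indirect":
        ea = MEMORY[ADDRESS_FIELD]
    elif mode == "relative":
        ea = PC_AFTER_FETCH + ADDRESS_FIELD
    elif mode == "indexed":
        ea = XR + ADDRESS_FIELD
    elif mode == "register":
        return None, R1                                # the operand is in the register itself
    elif mode in ("register indirect", "autoincrement"):
        ea = R1                                        # autoincrement changes R1 only after the access
    elif mode == "autodecrement":
        ea = R1 - 1                                    # decrement first, then access
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ea, MEMORY[ea]

for mode in ("direct", "immediate", "indirect", "relative", "indexed",
             "register", "register indirect", "autoincrement", "autodecrement"):
    print(mode, resolve(mode))
```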
Questions
• An instruction is stored at location 300 with its address field at location 301. The address
field has the value 400. A processor register R1 contains the number 200. Evaluate the
effective address if the addressing mode of the instruction is (a) direct; (b) immediate (c)
relative (d) register indirect; (e) index with R1 as the index register.
• Let the address stored in the program counter be designated by the symbol X1. The
instruction stored in X1 has the address part (operand reference) X2. The operand needed
to execute the instruction is stored in the memory word with address X3. An index
register contains the value X4. What is the relationship between these various quantities if
the addressing mode of the instruction is
• (a) direct (b) indirect (c) PC relative (d) indexed?
Module 3 – Phases of Instruction Cycle
ALU
Phases of Instruction Cycle

• The IAS operates repetitively performing an instruction cycle.

• Each instruction cycle consists of two sub cycles.

• Two Phases:

– Fetch

– Execute
Phases of Instruction Cycle
• 4 phases of Instruction Cycle
Phases of Instruction Cycle
• Fetch Phase:
• PC – holds the address of Instruction
• Processor – Fetches the instruction from memory
and stores in IR
• Increment PC
• Unless told Otherwise
• Processor interprets instruction and performs
required actions
• Execute Phase:
• Carry out the actions specified by the instruction in
the IR (execution phase).
• The instruction decoder and control logic unit is
responsible for implementing the action specified
by the instruction loaded in the IR
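The fetch and execute phases can be sketched as a loop over a tiny accumulator machine. The four opcodes (LOAD, ADD, STORE, HALT), the tuple encoding of instructions, and the memory contents are assumptions made for this illustration only.

```python
def run(memory):
    """Repeat the instruction cycle until HALT; return the final accumulator value."""
    pc = 0   # program counter: address of the next instruction
    ac = 0   # accumulator
    while True:
        # Fetch phase: read the instruction addressed by PC into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Execute phase: decode IR and carry out the specified action.
        opcode, address = ir
        if opcode == "LOAD":
            ac = memory[address]
        elif opcode == "ADD":
            ac += memory[address]
        elif opcode == "STORE":
            memory[address] = ac
        elif opcode == "HALT":
            return ac
        else:
            raise ValueError(f"unknown opcode: {opcode}")

# Program at addresses 0..3, data at addresses 10..12 (assumed values).
memory = {
    0: ("LOAD", 10),
    1: ("ADD", 11),
    2: ("STORE", 12),
    3: ("HALT", None),
    10: 3, 11: 4, 12: 0,
}
print(run(memory), memory[12])  # 7 7
```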
Phases of Instruction Cycle
• An instruction can be executed by performing one
or more of the following operations in some
specified sequence.
• Transfer a word of data from one processor
register to another or to the ALU.
• Perform an arithmetic or a logic operation and
store the result in a processor register.
• Fetch the contents of a given memory location
and load them into a processor register.
• Store a word of data from a processor register
into a given memory location.
Instruction Cycle - State Diagram
• Instruction address calculation (IAC):
• Determine the address of the next instruction to be executed. Usually this involves adding a fixed number to the address of the previous instruction.
• Instruction fetch (IF):
• Read the instruction from its memory location into the processor.
• Instruction operation decoding (IOD):
• Analyze the instruction to determine the type of operation to be performed and the operand(s) to be used.
• Operand address calculation (OAC):
• If the operation involves a reference to an operand in memory or available via I/O, determine the address of the operand.
• Operand fetch (OF):
• Fetch the operand from memory or read it from I/O.
• Data operation (DO):
• Perform the operation indicated in the instruction.
• Operand store (OS):
• Write the result into memory or out to I/O.
Interrupts
Instruction cycle with Interrupts
Instruction cycle with Interrupts – State
Diagram
Instruction Execution Cycle
ALU
• Arithmetic-Logic Unit (ALU) is the part of a CPU that carries out arithmetic and
logic operations on the operands.
• ALU is divided into two units:
• Arithmetic Unit (AU)
• Logic Unit (LU).
• Some processors contain more than one AU
• For example, one for fixed-point operations and another for floating-point
operations.
• Control Unit (CU) - supplies the data required by the ALU from memory, or from
input devices, and directs the ALU to perform a specific operation based on the
instruction fetched from the memory
ALU
Operations on ALU
• Logical operations − These include operations like AND, OR, NOT, XOR, NOR, NAND, etc.
• Bit-shifting operations − Shifting the bits by a certain number of places to the left or right, which corresponds to multiplication or division by a power of two.
• Arithmetic operations − This refers to bit addition and subtraction
How ALU Works?
• ALU has direct input and output
access to
• processor controller,
• main memory (random access memory
or RAM in a personal computer)
• input/output devices
• Inputs and outputs flow along an
electronic path that is called a bus
ALU
• The ALU is that part of the computer that actually performs arithmetic and logical operations on
data
• All of the other elements of the computer system—control unit, registers, memory, I/O—are
there mainly to bring data into the ALU for it to process and then to take the results back out
• We have, in a sense, reached the core or essence of a computer when we consider the ALU
• An ALU and indeed, all electronic components in the computer, are based on the use of simple
digital logic devices that can store binary digits and perform simple Boolean logic operations
• Operands for arithmetic and logic operations are presented to the ALU in registers, and the
results of an operation are stored in registers
• These registers are temporary storage locations within the processor that are connected by signal
paths to the ALU
• The ALU may also set flags as the result of an operation
• For example, an overflow flag is set to 1 if the result of a computation exceeds the length of the
register into which it is to be stored
ALU
• The processor provides signals that control the operation of the ALU
and the movement of the data into and out of the ALU.

Flags
Module 3 – Data Path and Control Unit
-Hardwired Control
- Microprogrammed Control
Datapath and Control
• CPU can be divided into Data section & Control Section
• The Control Section issues control signals to the datapath
Recap : Data Path and Control
Data Path and Control
• To execute an instruction, a processor must perform the following 3
steps:
Data Path and Control – Single Bus

Register Transfer

ALU Operation

Reading a word from memory

Storing a word in memory


Data Path and Control – Single Bus
Data Path and Control – Multiple Bus
[Figure: multiple-bus datapath. Labels recovered from the figure: PCout, R=B, IRin, MARin, Instruction, Execute]
QUIZ
Revisit - Stages of Data Path
Stages of Data Path - Examples

The jump-and-link (JAL) instruction uses a simple datapath that branches the PC by a specified offset.
Control Unit
• Control Unit (CU) - supplies the data required by the ALU from
memory, or from input devices, and directs the ALU to perform a
specific operation based on the instruction fetched from the memory
• To execute instructions – processor – generates control signals in
proper sequence
• Two ways – to generate control signals in sequence
• Hardwired Control
• Microprogrammed Control
Hardwired Control
• Operates at high speed
• Little flexibility
Hardwired Control
Separate Decoder and Encoder
Hardwired Control
Microprogrammed Control

• Popular in CISC because complex instruction sets require complex controllers that can more easily be implemented as microprograms.
Microprogrammed Control
1. Control Word: A control word is a word whose individual bits represent various control signals.
2. Micro-routine: A sequence of control words corresponding to the control sequence of a machine instruction constitutes the micro-routine for that instruction.
3. Micro-instruction: Individual control words in this micro-routine are referred to as microinstructions.
4. Micro-program: A sequence of micro-instructions is called a micro-program, which is stored in a ROM or RAM called the Control Memory (CM).
5. Control Store: The micro-routines for all instructions in the instruction set of a computer are stored in a special memory called the Control Store.
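The relationship between these terms can be sketched with bit masks. The signal list is a hypothetical one for illustration: PCout, MARin and IRin appear as labels in the datapath slides, while Read, MDRout and End, the bit positions, and the three-step fetch routine are assumptions, not the course's actual microcode.

```python
# One bit per control signal (hypothetical ordering).
SIGNALS = ["PCout", "MARin", "Read", "MDRout", "IRin", "End"]
BIT = {name: 1 << position for position, name in enumerate(SIGNALS)}

def control_word(*active):
    """Build a control word: each set bit asserts one control signal."""
    word = 0
    for name in active:
        word |= BIT[name]
    return word

def active_signals(word):
    """List the control signals asserted by a control word."""
    return [name for name in SIGNALS if word & BIT[name]]

# A micro-routine is a sequence of control words (micro-instructions).
FETCH_ROUTINE = [
    control_word("PCout", "MARin", "Read"),  # send PC to MAR and start a memory read
    control_word("MDRout", "IRin"),          # move the fetched word into IR
    control_word("End"),                     # end of the routine
]

# The control store holds the micro-routines, indexed here by an assumed routine name.
CONTROL_STORE = {"FETCH": FETCH_ROUTINE}

for word in CONTROL_STORE["FETCH"]:
    print(f"{word:06b}", active_signals(word))
```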
Microprogrammed Control
Basic Organization of Microprogrammed
Control Unit
Microprogrammed Control
Microprogrammed Control – Branch Inst.
Hardwired Vs Microprogrammed Control
Hardwired Control Unit | Microprogrammed Control Unit
Generates the control signals needed for the processor using logic circuits | Generates the control signals with the help of micro instructions stored in control memory
Faster than the microprogrammed control unit, as the required control signals are generated with the help of hardware | Slower than the hardwired control unit, as micro instructions are used for generating signals
Difficult to modify, as the control signals that need to be generated are hard wired | Easy to modify, as the modification needs to be done only at the instruction level
More costly, as everything has to be realized in terms of logic gates | Less costly than hardwired control, as only micro instructions are used for generating control signals
Cannot handle complex instructions, as the circuit design for them becomes complex | Can handle complex instructions
Only a limited number of instructions are used due to the hardware implementation | Control signals for many instructions can be generated
Used in Reduced Instruction Set Computers (RISC) | Used in Complex Instruction Set Computers (CISC)
Module 3 – Performance Metrics :
Execution Time Calculation, MIPS, MFLOPS
Performance
• Measure Performance : How fast the computer works?

• Time – important metric to measure performance

• A computer exhibits higher performance if it executes programs faster.


Performance Metrics
• To maximize performance, we want to minimize the response time or execution time for a task. Thus, we can relate performance and execution time for a computer X:
Performance(X) = 1 / Execution time(X)
• This means that for two computers X and Y, if the performance of X is greater than the performance of Y, we have
Performance(X) > Performance(Y)
1 / Execution time(X) > 1 / Execution time(Y)
Execution time(Y) > Execution time(X)
• That is, the execution time on Y is longer than that on X, so X is faster than Y.


Performance Metrics - Example

PerformanceA = n × PerformanceB
Performance Metrics - Example

Hint : Performance (X) < Performance (Y)

PerformanceX = n × PerformanceY
Performance Metrics
• For some program running on machine X,
Performance(X) = 1 / Execution time(X)
• "X is n times faster than Y" is represented as
Performance(X) / Performance(Y) = n
• Problem: Machine A runs a program in 20 seconds. Machine B runs the
same program in 25 seconds. How many times faster is machine A?
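A worked solution to this problem, using only the two definitions on this slide (performance is the reciprocal of execution time, and n is the ratio of the two performances):

```latex
\[
n = \frac{\text{Performance}(A)}{\text{Performance}(B)}
  = \frac{\text{Execution time}(B)}{\text{Execution time}(A)}
  = \frac{25\ \text{s}}{20\ \text{s}}
  = 1.25
\]
```

So machine A is 1.25 times faster than machine B for this program.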
Computer Clock
• Almost all computers are constructed using a clock that determines when
events take place in the hardware.
• These discrete time intervals are called clock cycles (ticks, clock ticks, clock
periods, clocks, cycles).

clock period

• Designers refer to the length of a clock period both as the time for a complete clock cycle (e.g., 250 picoseconds) and as the clock rate/frequency (e.g., 4 GHz, 5 MHz), which is the inverse of the clock period.
Computer Clock
• Clock cycle time - the amount of time for one clock period to elapse
(e.g. 5 ns, 250 picoseconds….).
• Clock rate/frequency- inverse of the clock cycle time.
• For example, if a computer has a clock cycle time of 5 ns, the clock rate is:
1 / (5 x 10^-9 sec) = 200 MHz
Processor Performance Equation
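• Performance Equation:
CPU execution time for a program = CPU clock cycles for the program x Clock cycle time
• Alternatively
CPU execution time for a program = CPU clock cycles for the program / Clock rate
• Also, execution time depends on the number of instructions in a program
• Execution time equals the number of instructions executed multiplied by the average time per instruction. Therefore, the number of clock cycles required for a program can be written as
CPU clock cycles = Instructions for the program x Average clock cycles per instruction
• The average number of clock cycles each instruction takes to execute is abbreviated as CPI (clock cycles per instruction).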
Processor Performance Equation
• To Summarize,
CPU time = Instruction count x CPI x Clock cycle time
Clock rate = 1 / Clock cycle time
• CPI (cycles per instruction)
• A floating-point-intensive application might have a higher CPI
MIPS & MFLOPS
Practice Problems
• Example 1:
• CPU clock rate is 1 MHz
• Program takes 45 million cycles to execute
• What’s the CPU time?

45,000,000 * (1 / 1,000,000) = 45 seconds


Practice Problems
• Example 2:
• CPU clock rate is 500 MHz
• Program takes 45 million cycles to execute
• What’s the CPU time?

45,000,000 * (1 / 500,000,000) = 0.09 seconds
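Examples 1 and 2 both apply CPU time = CPU clock cycles / clock rate. A minimal sketch follows, with the hypothetical helper `cpu_time_s` used only for illustration.

```python
def cpu_time_s(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


cycles = 45_000_000                   # program length in clock cycles
t1 = cpu_time_s(cycles, 1_000_000)    # Example 1: 1 MHz clock
t2 = cpu_time_s(cycles, 500_000_000)  # Example 2: 500 MHz clock

print(f"Example 1: {t1:.2f} s")
print(f"Example 2: {t2:.2f} s")
```

The same cycle count runs 500 times faster on the 500 MHz clock, since CPU time scales inversely with clock rate.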


Practice Problems
• Example 3:
Suppose we have two implementations of the same instruction set
architecture (ISA).
• For some program,
• Machine A has a clock cycle time of 10 ns and a CPI of 2.0
• Machine B has a clock cycle time of 20 ns and a CPI of 1.2
• Which machine is faster for this program, and by how much?
• Assume that # of instructions in the program is 1,000,000,000.
Practice Problems
• Example 3: Solution
Given: Machine A has a clock cycle time of 10 ns and a CPI of 2.0.
Machine B has a clock cycle time of 20 ns and a CPI of 1.2.
Number of instructions = 1,000,000,000.

CPU execution time(A) = $10^9 \times 2.0 \times 10 \times 10^{-9}\ \text{s}$ = 20 seconds

CPU execution time(B) = $10^9 \times 1.2 \times 20 \times 10^{-9}\ \text{s}$ = 24 seconds

Performance(A) / Performance(B) = Execution time(B) / Execution time(A) = 24/20 = 1.2

Machine A is 1.2 times faster.
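The calculation for Example 3 can be reproduced with a short sketch. The function and variable names are assumptions for illustration. It uses CPU time = instruction count × CPI × clock cycle time.

```python
def cpu_time_s(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * cycle_time_s


IC = 1_000_000_000                   # instructions in the program
time_a = cpu_time_s(IC, 2.0, 10e-9)  # Machine A: CPI 2.0, 10 ns cycle
time_b = cpu_time_s(IC, 1.2, 20e-9)  # Machine B: CPI 1.2, 20 ns cycle
ratio = time_b / time_a              # how many times faster A is than B

print(f"A: {time_a:.1f} s, B: {time_b:.1f} s, A is {ratio:.2f}x faster")
```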
Practice Problems
• Example 4:
Our favorite program runs in 10 seconds on computer A, which has a 2
GHz clock. We are trying to help a computer designer build a computer,
B, which will run this program in 6 seconds. The designer has
determined that a substantial increase in the clock rate is possible, but
this increase will affect the rest of the CPU design, causing computer B
to require 1.2 times as many clock cycles as computer A for this
program. What clock rate should we tell the designer to target?
Practice Problems
• Example 4: Solution
Given: A: execution time = 10 s, clock rate = 2 GHz.
B: execution time = 6 s.
Clock cycles of B = 1.2 × (clock cycles of A).
Find the clock rate of B.
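One way to work this problem, as a minimal sketch with illustrative variable names: first recover the clock cycles used on A from its execution time and clock rate, scale by 1.2 for B, then divide by B's target execution time.

```python
# Computer A: known execution time and clock rate.
time_a_s = 10.0
rate_a_hz = 2e9

# CPU clock cycles = execution time * clock rate.
cycles_a = time_a_s * rate_a_hz

# Computer B needs 1.2 times as many cycles and must finish in 6 s.
cycles_b = 1.2 * cycles_a
time_b_s = 6.0

# Clock rate = clock cycles / execution time.
rate_b_hz = cycles_b / time_b_s

print(f"Target clock rate for B: {rate_b_hz / 1e9:.1f} GHz")
```

Under these assumptions, B needs a clock rate of about 4 GHz, twice that of A.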
Practice Problems
• Example 5:

Suppose we have two implementations of the same instruction set
architecture. Computer A has a clock cycle time of 250 ps and a CPI of
2.0 for some program, and computer B has a clock cycle time of 500 ps
and a CPI of 1.2 for the same program. Which computer is faster for
this program and by how much?
Practice Problems
• Example 5: Solution
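A worked sketch of this solution follows. The symbol $I$ (the instruction count) is an assumption introduced here. It is the same for both computers because they run the same program on the same instruction set architecture.

```latex
\begin{align*}
\text{CPU time}_A &= I \times 2.0 \times 250\ \text{ps} = 500\,I\ \text{ps} \\
\text{CPU time}_B &= I \times 1.2 \times 500\ \text{ps} = 600\,I\ \text{ps} \\
\frac{\text{Performance}_A}{\text{Performance}_B}
  &= \frac{\text{CPU time}_B}{\text{CPU time}_A}
   = \frac{600\,I\ \text{ps}}{500\,I\ \text{ps}} = 1.2
\end{align*}
```

The instruction count cancels, so computer A is 1.2 times faster than computer B for this program.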
Average Cycles per Instruction

• Total CPI = $\dfrac{\sum_{i=1}^{n} IC_i \times CPI_i}{\text{Instruction count } (I_c)}$
Practice Problems
• Example 6:
Practice Problems
• Example 6: Solution

B is faster.
Since the clock cycle time is not given, we can compare the two using CPU clock cycles alone.
Practice Problems
• Example 7:
Practice Problems
• Example 7: Solution
With Frequency

• CPI = $\dfrac{\text{CPU clock cycles } (c)}{\text{Instruction count } (I_c)}$
• Total CPI = $\dfrac{\sum_{i=1}^{n} IC_i \times CPI_i}{\text{Instruction count } (I_c)}$
• The CPI is the average number of cycles per instruction.
• If, for each instruction type, we know its frequency and the number of
cycles needed to execute it, we can compute the overall CPI as follows:
• CPI = $\sum_{i=1}^{n} \text{Freq}_i \times CPI_i$
Practice Problems
• Example 8: Solution
• Let us assume that a benchmark has 100 instructions:
• 25 instructions are loads/stores (each takes 2 cycles)
• 50 instructions are adds (each takes 1 cycle)
• 25 instructions are square roots (each takes 50 cycles)
• What is the CPI for this benchmark?

CPI = (0.25 × 2) + (0.50 × 1) + (0.25 × 50) = 13.5
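The benchmark above can be checked two ways: from raw instruction counts (total cycles / instruction count) and from frequencies (sum of Freq_i × CPI_i). The sketch below uses illustrative names and shows that both routes agree.

```python
# (instruction type, instruction count, cycles per instruction)
mix = [
    ("load/store", 25, 2),
    ("add", 50, 1),
    ("square root", 25, 50),
]

total_instructions = sum(count for _, count, _ in mix)
total_cycles = sum(count * cycles for _, count, cycles in mix)

# Route 1: CPI = CPU clock cycles / instruction count.
cpi_from_counts = total_cycles / total_instructions

# Route 2: CPI = sum over types of frequency * cycles for that type.
cpi_from_freq = sum(
    (count / total_instructions) * cycles for _, count, cycles in mix
)

print(f"CPI from counts:      {cpi_from_counts:.1f}")
print(f"CPI from frequencies: {cpi_from_freq:.1f}")
```

Although square roots are only a quarter of the instructions, they account for 1250 of the 1350 cycles and therefore dominate the CPI.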
Practice Problems
• Example 9:
Two different compilers are being tested for a 500 MHz machine
with three different classes of instructions: Class A, Class B, and
Class C, which require one, two, and three cycles (respectively).
Both compilers are used to produce code for a large piece
of software.
The first compiler's code uses 5 billion Class A instructions, 1
billion Class B instructions, and 1 billion Class C instructions.
The second compiler's code uses 10 billion Class A instructions,
1 billion Class B instructions, and 1 billion Class C instructions.
• Which sequence will be faster according to MIPS?
• Which sequence will be faster according to execution time?
Practice Problems
• Example 9: Solution

Clock rate: 500 MHz

Cycles per instruction: Class A = 1 cycle,
Class B = 2 cycles,
Class C = 3 cycles.
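One way to work through both questions is sketched below, with illustrative names. For each compiler it computes total clock cycles, execution time (cycles / clock rate), and the MIPS rate (instruction count / (execution time × 10^6)).

```python
CLOCK_RATE_HZ = 500e6
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class for each compiler's code.
compilers = {
    "compiler 1": {"A": 5e9, "B": 1e9, "C": 1e9},
    "compiler 2": {"A": 10e9, "B": 1e9, "C": 1e9},
}


def evaluate(counts):
    """Return (execution time in seconds, MIPS rate) for one instruction mix."""
    instructions = sum(counts.values())
    cycles = sum(n * CYCLES_PER_CLASS[cls] for cls, n in counts.items())
    time_s = cycles / CLOCK_RATE_HZ
    mips = instructions / (time_s * 1e6)
    return time_s, mips


results = {name: evaluate(counts) for name, counts in compilers.items()}
for name, (time_s, mips) in results.items():
    print(f"{name}: {time_s:.0f} s, {mips:.0f} MIPS")
```

With these inputs, the second compiler's code has the higher MIPS rate (400 versus 350), yet the first compiler's code finishes sooner (20 s versus 30 s). MIPS alone can therefore rank two programs in the opposite order from execution time.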
Problems with Arithmetic Mean
• For example, two machines timed on two benchmarks:
            Machine A           Machine B
Program 1   2 seconds (20%)     6 seconds (20%)
Program 2   12 seconds (80%)    10 seconds (80%)

Average execution time(A) = (2 + 12) / 2 = 7 seconds

Average execution time(B) = (6 + 10) / 2 = 8 seconds

Weighted average execution time(A) = 2*0.2 + 12*0.8 = 10 seconds

Weighted average execution time(B) = 6*0.2 + 10*0.8 = 9.2 seconds
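The figures above can be reproduced with a short sketch using illustrative names. The weights are the 20% and 80% shares given for the two programs.

```python
# Execution times in seconds for [Program 1, Program 2].
times = {"A": [2.0, 12.0], "B": [6.0, 10.0]}
weights = [0.2, 0.8]  # share of the workload for each program


def arithmetic_mean(values):
    """Plain average: every program counts equally."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Average in which each program counts according to its weight."""
    return sum(v * w for v, w in zip(values, weights))


plain = {m: arithmetic_mean(t) for m, t in times.items()}
weighted = {m: weighted_mean(t, weights) for m, t in times.items()}

for m in times:
    print(f"Machine {m}: plain {plain[m]:.1f} s, weighted {weighted[m]:.1f} s")
```

The plain average favors machine A (7 s versus 8 s), while the weighted average favors machine B (9.2 s versus 10 s). The ranking depends on how heavily each program is used.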
MIPS rate
• A common measure of performance for a processor is the
rate at which instructions are executed, expressed as millions
of instructions per second (MIPS), referred to as the MIPS
rate.

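MIPS rate = $\dfrac{I_c}{T \times 10^6} = \dfrac{f}{CPI \times 10^6} = \dfrac{I_c \times f}{\text{CPU clock cycles} \times 10^6}$
• Ic → Instruction count
• T → CPU time
• f → Clock rate
• CPI → Cycles per instruction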
Practice Problems
• Example 10:
Practice Problems
• Example 11:
Practice Problems
• Example 12:
Assume that a benchmark has 100 instructions with a clock rate of
300 MHz. 20% of the instructions are loads/stores (each takes 3 cycles), 40%
are adds (each takes 2 cycles), and 40% are square roots (each takes
60 cycles). What are the CPI and the MIPS rate for this benchmark?

Ans: CPI = 25.4, MIPS = 11.8
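The stated answer can be verified with a minimal sketch using illustrative variable names. It applies CPI = sum of Freq_i × CPI_i and MIPS rate = f / (CPI × 10^6).

```python
clock_rate_hz = 300e6

# (frequency, cycles per instruction) for each instruction type.
mix = [
    (0.20, 3),   # loads/stores
    (0.40, 2),   # adds
    (0.40, 60),  # square roots
]

# CPI is the frequency-weighted sum of cycles per instruction type.
cpi = sum(freq * cycles for freq, cycles in mix)

# MIPS rate = clock rate / (CPI * 10^6).
mips = clock_rate_hz / (cpi * 1e6)

print(f"CPI = {cpi:.1f}, MIPS = {mips:.1f}")
```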


MFLOPS

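• Floating-point performance is expressed as millions of floating-point
operations per second (MFLOPS), defined as follows:
$\text{MFLOPS rate} = \dfrac{\text{Number of executed floating-point operations in a program}}{\text{Execution time} \times 10^6}$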
