MOD 3
Dr. C.R.Dhivyaa
Assistant Professor
School of Computer Science and Engineering
Vellore Institute of Technology, Vellore
Computer Instructions: Instruction sets
• Instruction
• A statement that determines the operation the CPU performs.
• These instructions are referred to as “machine instructions” or “computer instructions”.
[Figure: 8-bit buffer holding 0 0 0 1 1 0 1 0]
• Rotate left through carry
[Figure: buffer 0 0 0 1 1 0 1 0 followed by carry bit 0]
• Rotate right through carry
[Figure: carry bit 0 followed by buffer 0 0 0 1 1 0 1 0]
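The two rotate-through-carry operations can be sketched in Python as below. This is only an illustration: the function names and the 8-bit default width are assumptions, and the buffer value is the bit pattern shown in the slides.

```python
def rotate_left_through_carry(value, carry, width=8):
    """Rotate `value` left by one bit; the carry flag acts as an extra bit."""
    mask = (1 << width) - 1
    new_carry = (value >> (width - 1)) & 1       # MSB moves into the carry
    new_value = ((value << 1) & mask) | carry    # old carry enters at the LSB
    return new_value, new_carry


def rotate_right_through_carry(value, carry, width=8):
    """Rotate `value` right by one bit; the carry flag acts as an extra bit."""
    mask = (1 << width) - 1
    new_carry = value & 1                        # LSB moves into the carry
    new_value = (value >> 1) | (carry << (width - 1))  # old carry enters at the MSB
    return new_value & mask, new_carry


buffer, carry = 0b00011010, 0
left = rotate_left_through_carry(buffer, carry)
right = rotate_right_through_carry(buffer, carry)
print(format(left[0], "08b"), left[1])
print(format(right[0], "08b"), right[1])
```

Because the carry takes part in the rotation, an 8-bit buffer behaves like a 9-bit ring: nine rotations in the same direction restore the original buffer and carry.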
Note:
1. One-address and zero-address instructions do not use registers.
2. Zero-address: the operation (e.g., add) does not require explicit operand addresses because it operates implicitly on a stack.
Calculation of Memory traffic
• Assumptions
• 24-bit memory address (3 bytes)
• 128 instructions (7 bits rounded to 1 byte)
Memory Required to store an Instruction: 5 x 3 bytes = 15 Bytes
4 – address Instruction Design
is fetched. = 3
• Then the two words representing the operands themselves need to be fetched into the CPU; after the addition has been performed, the result overwrites an operand. = 3
• Total = 6
• add Op1Addr Op2Addr
Memory Required to store an Instruction: 2 x 3 bytes = 06 Bytes
36 30 12 9 21
Two-address
Instruction   Meaning   Memory to store   Memory to encode   M/As to fetch   M/As to execute   Memory traffic
load a, b     a←b       3*3=9             1+(2*3)=7          3               2                 3+2=5
add a, c      a←a+c     3*3=9             1+(2*3)=7          3               3                 3+3=6
mpy a, d      a←a*d     3*3=9             1+(2*3)=7          3               3                 3+3=6
sub a, e      a←a-e     3*3=9             1+(2*3)=7          3               3                 3+3=6
Total                   36                28                 12              11                23
One-address
Instruction   Meaning       Memory to store   Memory to encode   M/As to fetch   M/As to execute   Memory traffic
load b        Acc←b         2*3=6             1+(1*3)=4          2               1                 2+1=3
add c         Acc←Acc+c     2*3=6             1+(1*3)=4          2               1                 2+1=3
mpy d         Acc←Acc*d     2*3=6             1+(1*3)=4          2               1                 2+1=3
sub e         Acc←Acc-e     2*3=6             1+(1*3)=4          2               1                 2+1=3
store a       a←Acc         2*3=6             1+(1*3)=4          2               1                 2+1=3
Total                       30                20                 10              5                 15
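The totals in the two-address and one-address tables can be checked with a short Python sketch. It assumes the stated 3-byte address and 1-byte opcode, and it assumes (as the table rows suggest) that "memory to store" counts every field as a full 3-byte word while "memory to encode" counts the opcode as 1 byte. The function names and the per-instruction execute counts passed in are illustrative.

```python
ADDRESS_BYTES = 3   # 24-bit memory address
OPCODE_BYTES = 1    # 7-bit opcode rounded up to 1 byte


def instruction_cost(address_fields, execute_accesses):
    """Cost of one instruction with the given number of address fields."""
    fields = 1 + address_fields                  # opcode word + address words
    return {
        "store": fields * ADDRESS_BYTES,         # every field stored as a 3-byte word
        "encode": OPCODE_BYTES + address_fields * ADDRESS_BYTES,
        "fetch": fields,                         # memory accesses to fetch the instruction
        "execute": execute_accesses,             # operand reads plus result writes
        "traffic": fields + execute_accesses,
    }


def program_totals(program):
    """Sum the per-instruction costs over (address_fields, execute_accesses) pairs."""
    totals = {"store": 0, "encode": 0, "fetch": 0, "execute": 0, "traffic": 0}
    for address_fields, execute_accesses in program:
        for key, value in instruction_cost(address_fields, execute_accesses).items():
            totals[key] += value
    return totals


# load a,b needs 2 execute accesses; add, mpy and sub need 3 each.
two_address = program_totals([(2, 2), (2, 3), (2, 3), (2, 3)])
# load, add, mpy, sub and store each touch memory once.
one_address = program_totals([(1, 1)] * 5)
print(two_address)
print(one_address)
```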
• Develop a comparative table for the performance parameters (memory to store, memory to encode, M/As to fetch, M/As to execute, and total memory traffic) for 4-, 3-, 2-, 1-, and 0-address machine instructions. Consider the following specifications: memory word size is 1 byte, memory/register address size is 2 bytes, opcode size is 1 byte.
ii. Compute the performance factors: memory to store a program, memory to encode a whole program, memory accesses to fetch and execute, and memory traffic.
Addressing Modes
• Effective Address
• The address at which the actual operand is available is called the effective address.
• Terminologies
• Displacement: An 8-bit or 16-bit immediate value given in the instruction.
• Base: Contents of the base register, BX or BP.
• Index: Contents of the index register, SI or DI.
Classification of Addressing Modes
1. Stack (Implied/Implicit) Addressing mode
2. Immediate Addressing mode
3. (Memory) Direct Addressing mode
4. Register Direct Addressing mode
5. (Memory) Indirect Addressing mode
6. Register Indirect Addressing mode
7. Displacement Addressing modes
   1. Indexed Addressing mode
   2. Base register Addressing mode
   3. Relative Addressing Mode
9. Auto Increment Addressing Mode
10. Auto Decrement Addressing Mode
1. Stack (Implied/Implicit) Addressing mode
• The definition of the instruction itself specifies the operands implicitly.
• The operand is implied, that is, specified implicitly in the instruction itself.
• Operations like PUSH and POP are used for the computation.
• Zero-address instructions in a stack-organized computer are implied-mode instructions.
• Effective Address (EA) = AC or Stack[SP]
• Instruction format: Opcode
• Examples:
• DEC (decrement the A register)
• CLC (reset the carry flag to 0)
• PUSH
• POP
• Advantage:
• Instruction specifies a fixed and unvarying address
• No memory references
• Disadvantage:
• Limited computational capacity
2. Immediate Addressing mode
• The simplest form of addressing
• Effective Address (EA) = Value
• Data is a part of the instruction itself.
• Instruction format: Opcode, Operand
• Example:
• MOVE #100, R1
• Here the data 100 is moved to R1.
• MVI #01, A
• MVI stands for Move Immediate; the instruction moves the value 01 into A.
• Advantage:
• This mode can be used to define and use constants or set of initial values of variables.
• No memory references
• Disadvantage:
• Limited Operand size
3. (Memory) Direct Addressing mode
• The address where data is available is part of the instruction
• The address field contains the effective address of the operand
• Effective Address (EA) = LOC
• Example:
• MOVE A, R1
• Here the data in memory location A is moved to R1.
• Advantage:
• Large operand magnitude, simple
• Disadvantage:
• Limited address size
• A change in the location of the program requires a change in all absolute memory references.
4. Register Direct Addressing mode
• Register addressing is similar to direct addressing
• The only difference is that the address field refers to a register rather than a main memory address
• Effective Address (EA) = Ri
• Example:
• MOVE R2, R1
• Here the data in register R2 is moved to R1.
• Advantage:
• No memory reference
• Disadvantage:
• Limited number of registers
5. (Memory) Indirect Addressing mode
• The address field contains the address of the effective address of the operand
• The referenced location contains a full-length address of the operand
• Effective Address (EA) = (LOC) or [LOC]
• Example:
• MOVE (A), R1
• Here A holds another memory address (not data).
• The data at the address held in A is moved to R1.
• Advantage:
• Large address space
• Disadvantage:
• A change in the location of the program requires a change in all absolute memory references.
6. Register Indirect Addressing mode
• Register indirect addressing is analogous to indirect addressing.
• The only difference is whether the address field refers to a memory location or a register.
• Effective Address (EA) = (Ri) or [Ri]
• Example:
• MOVE (R2), R1
• Here R2 holds a memory address (not data).
• The data at the address held in R2 is moved to R1.
• Advantage:
• Large address space
• Disadvantage:
• Extra memory space
7. Displacement Addressing modes - Indexed Addressing mode
• The address field references a main memory address, and the referenced register contains a positive displacement from that address.
• The base register holds the beginning location of a memory array, while the index register holds the relative position of an element in the array.
• Effective Address (EA) = (Ri) + X
[Figure: index value of an array stored in the index register]
• Example:
• MOVE 20 (R2), R1
• Here R2 holds a memory address (not data).
• The address in R2 is added to the index value 20, which gives the EA.
• The data at address R2+20 is moved to R1.
• Advantage:
• Spatial locality
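The way each addressing mode locates its operand can be sketched with a toy memory and register file in Python. The memory contents, the register value, and the `operand` helper are made-up illustrations, not part of any real instruction set.

```python
# Hypothetical machine state used only for this illustration.
memory = {100: 500, 500: 42, 520: 77}
registers = {"R2": 500}


def operand(mode, field):
    """Return the operand selected by `field` under the given addressing mode."""
    if mode == "immediate":            # operand is the value in the instruction
        return field
    if mode == "direct":               # EA = LOC
        return memory[field]
    if mode == "indirect":             # EA = (LOC)
        return memory[memory[field]]
    if mode == "register":             # operand is held in register Ri
        return registers[field]
    if mode == "register_indirect":    # EA = (Ri)
        return memory[registers[field]]
    if mode == "indexed":              # EA = (Ri) + X
        displacement, register = field
        return memory[registers[register] + displacement]
    raise ValueError(f"unknown addressing mode: {mode}")


print(operand("immediate", 100))           # the constant itself
print(operand("direct", 100))              # contents of location 100
print(operand("indirect", 100))            # contents of the location that 100 points to
print(operand("register_indirect", "R2"))  # contents of the location R2 points to
print(operand("indexed", (20, "R2")))      # contents of location R2 + 20
```

The number of dictionary lookups into `memory` in each branch corresponds to the memory references each mode needs: none for immediate and register, one for direct, register indirect and indexed, two for memory indirect.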
Register - AC ← R1 400
• Two Phases:
– Fetch
– Execute
Phases of Instruction Cycle
• 4 phases of Instruction Cycle
• Fetch Phase:
• PC holds the address of the instruction
• Processor fetches the instruction from memory and stores it in the IR
• Increment PC (unless told otherwise)
• Processor interprets the instruction and performs the required actions
• Execute Phase:
• Carry out the actions specified by the instruction in the IR.
• The instruction decoder and control logic unit are responsible for implementing the action specified by the instruction loaded in the IR.
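The fetch and execute phases can be sketched as a loop in Python. This is a minimal sketch that assumes a hypothetical accumulator machine with the opcodes load, add, mpy, sub and store, plus an assumed halt opcode to stop the loop. The program computes a = (b + c) * d - e.

```python
def run(program, data):
    """Run a one-address (accumulator) program and return the final memory."""
    memory = dict(data)
    pc, acc = 0, 0
    while True:
        ir = program[pc]          # fetch: PC addresses the instruction, copy it to IR
        pc += 1                   # increment PC unless told otherwise
        opcode, address = ir      # decode the instruction held in IR
        if opcode == "halt":      # assumed opcode, used only to end the loop
            break
        if opcode == "load":      # execute: carry out the decoded action
            acc = memory[address]
        elif opcode == "add":
            acc += memory[address]
        elif opcode == "mpy":
            acc *= memory[address]
        elif opcode == "sub":
            acc -= memory[address]
        elif opcode == "store":
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


program = [("load", "b"), ("add", "c"), ("mpy", "d"),
           ("sub", "e"), ("store", "a"), ("halt", None)]
result = run(program, {"b": 2, "c": 3, "d": 4, "e": 5})
print(result["a"])  # (2 + 3) * 4 - 5 = 15
```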
Phases of Instruction Cycle
• An instruction can be executed by performing one
or more of the following operations in some
specified sequence.
• Transfer a word of data from one processor
register to another or to the ALU.
• Perform an arithmetic or a logic operation and
store the result in a processor register.
• Fetch the contents of a given memory location
and load them into a processor register.
• Store a word of data from a processor register
into a given memory location.
Instruction Cycle - State Diagram
• Instruction address calculation (IAC):
• Determine the address of the next instruction to be executed. Usually this involves adding a fixed number to the address of the previous instruction.
• Instruction fetch (IF):
• Read the instruction from its memory location into the processor.
• Instruction operation decoding (IOD):
• Analyze the instruction to determine the type of operation to be performed and the operand(s) to be used.
• Operand address calculation (OAC):
• If the operation involves a reference to an operand in memory or available via I/O, determine the address of the operand.
• Operand fetch (OF):
• Fetch the operand from memory or read it from I/O.
• Data operation (DO):
• Perform the operation indicated in the instruction.
• Operand store (OS):
• Write the result into memory or out to I/O.
Interrupts
Instruction cycle with Interrupts
Instruction cycle with Interrupts – State Diagram
Instruction Execution Cycle
ALU
• Arithmetic-Logic Unit (ALU) is the part of a CPU that carries out arithmetic and
logic operations on the operands.
• ALU is divided into two units:
• Arithmetic Unit (AU)
• Logic Unit (LU).
• Some processors contain more than one AU
• For example, one for fixed-point operations and another for floating-point
operations.
• Control Unit (CU) - supplies the data required by the ALU from memory, or from
input devices, and directs the ALU to perform a specific operation based on the
instruction fetched from the memory
ALU
Operations on ALU
• logical operations − These include operations like AND, OR, NOT, XOR, NOR,
NAND, etc.
Flags
Module 3 – Data Path and Control Unit
-Hardwired Control
- Microprogrammed Control
Datapath and Control
• CPU can be divided into Data section & Control Section
• Control Section issues control signals to the datapath
Recap : Data Path and Control
Data Path and Control
• To execute an instruction, a processor must perform the following 3
steps:
Data Path and Control – Single Bus
Register Transfer
ALU Operation
[Figure: single-bus datapath control-signal sequence; surviving labels: R=B, IRin, MARin, Instruction Execute]
QUIZ
Revisit - Stages of Data Path
Stages of Data Path - Examples
• This means that for two computers X and Y, if the performance of X is greater than the performance of Y, we have
PerformanceX = n × PerformanceY
Performance Metrics - Example
PerformanceX = n × PerformanceY
Performance Metrics
• For some program running on machine X,
$\text{Performance}_X = \dfrac{1}{\text{Execution time}_X}$
• "X is n times faster than Y" is represented as
$\dfrac{\text{Performance}_X}{\text{Performance}_Y} = n$
• Problem: Machine A runs a program in 20 seconds. Machine B runs the
same program in 25 seconds. How many times faster is machine A?
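Since performance is the reciprocal of execution time, the ratio of performances equals the inverse ratio of execution times. A small Python sketch of this problem follows; the function name is illustrative.

```python
def times_faster(time_x, time_y):
    """Return n such that X is n times faster than Y.

    n = Performance(X) / Performance(Y) = time_y / time_x
    """
    return time_y / time_x


n = times_faster(20, 25)   # machine A takes 20 s, machine B takes 25 s
print(n)                   # A is 1.25 times faster than B
```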
Computer Clock
• Almost all computers are constructed using a clock that determines when
events take place in the hardware.
• These discrete time intervals are called clock cycles (ticks, clock ticks, clock
periods, clocks, cycles).
• Designers refer to the length of a clock period both as the time for a complete clock cycle (e.g., 250 picoseconds) and as the clock rate/frequency (e.g., 4 GHz, 5 MHz), which is the inverse of the clock period.
Computer Clock
• Clock cycle time - the amount of time for one clock period to elapse (e.g., 5 ns, 250 picoseconds).
• Clock rate/frequency - inverse of the clock cycle time.
• For example, if a computer has a clock cycle time of 5 ns, the clock rate is:
$\dfrac{1}{5 \times 10^{-9}\ \text{s}} = 200\ \text{MHz}$
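The conversion between clock cycle time and clock rate can be checked with a short Python sketch. The function name is illustrative, and the two values are the 5 ns and 250 ps examples from the slides.

```python
def clock_rate_hz(cycle_time_seconds):
    """Clock rate is the inverse of the clock cycle time."""
    return 1.0 / cycle_time_seconds


rate_5ns = clock_rate_hz(5e-9)       # 5 ns cycle time
rate_250ps = clock_rate_hz(250e-12)  # 250 ps cycle time
print(f"{rate_5ns / 1e6:.0f} MHz")
print(f"{rate_250ps / 1e9:.0f} GHz")
```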
Processor Performance Equation
• Performance Equation:
• Alternatively
• Clock cycles per instruction, the average number of clock cycles each instruction takes to execute, is often abbreviated as CPI.
Processor Performance Equation
• To Summarize,
$\text{Clock rate} = \dfrac{1}{\text{Clock cycle time}}$
Machine A is faster
Practice Problems
• Example 4:
Our favorite program runs in 10 seconds on computer A, which has a 2
GHz clock. We are trying to help a computer designer build a computer,
B, which will run this program in 6 seconds. The designer has
determined that a substantial increase in the clock rate is possible, but
this increase will affect the rest of the CPU design, causing computer B
to require 1.2 times as many clock cycles as computer A for this
program. What clock rate should we tell the designer to target?
Practice Problems
• Example 4: Solution
• A: execution time = 10 s, clock rate = 2 GHz
• B: execution time = 6 s
• Clock cycles of B = 1.2 × (clock cycles of A)
• Find: clock rate of B
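One way to work Example 4 is to use the relation CPU time = clock cycles / clock rate twice: first to get the cycle count of A, then to solve for the clock rate of B. The Python sketch below follows that route; the variable names are illustrative.

```python
time_a = 10.0            # seconds on computer A
rate_a = 2e9             # 2 GHz clock on computer A
time_b = 6.0             # target seconds on computer B

cycles_a = time_a * rate_a        # CPU time = cycles / rate, so cycles = time * rate
cycles_b = 1.2 * cycles_a         # B needs 1.2 times as many clock cycles
rate_b = cycles_b / time_b        # rate = cycles / time

print(f"{rate_b / 1e9:.1f} GHz")  # target clock rate for B
```

Under these numbers, B must run at twice the clock rate of A to finish in 6 seconds.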
Practice Problems
• Example 5:
• Total CPI = $\dfrac{\sum_{i=1}^{n} IC_i \times CPI_i}{\text{Instruction count } (I_c)}$
Practice Problems
• Example 6:
Practice Problems
• Example 6: Solution
B is faster
Since Clock cycle time is not given, we can estimate using CPU clock cycles itself
Practice Problems
• Example 6: Solution
Practice Problems
• Example 7:
Practice Problems
• Example 7: Solution
With Frequency
$\text{MIPS rate} = \dfrac{I_c \times f}{\text{CPU clock cycles} \times 10^{6}}$
• Ic → Instruction count
• T → CPU time
• f → Clock rate
• CPI → Cycles Per Instruction
Practice Problems
• Example 10:
Practice Problems
• Example 11:
Practice Problems
• Example 12:
Assume that a benchmark has 100 instructions and runs at a clock rate of 300 MHz. 20% of the instructions are loads/stores (each takes 3 cycles), 40% are adds (each takes 2 cycles), and 40% are square roots (each takes 60 cycles). What are the CPI and the MIPS rate for this benchmark?
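Example 12 can be worked with the weighted-CPI formula and the standard definition MIPS rate = f / (CPI × 10^6). The Python sketch below is one way to set it up; the dictionary layout and variable names are illustrative.

```python
clock_rate = 300e6                      # 300 MHz

# instruction class -> (instruction count, cycles per instruction)
mix = {
    "load/store": (20, 3),
    "add": (40, 2),
    "square root": (40, 60),
}

instruction_count = sum(count for count, _ in mix.values())
total_cycles = sum(count * cycles for count, cycles in mix.values())

cpi = total_cycles / instruction_count  # weighted average CPI
mips = clock_rate / (cpi * 1e6)         # MIPS rate = f / (CPI * 10^6)

print(f"CPI = {cpi:.1f}")
print(f"MIPS = {mips:.2f}")
```

The square-root instructions dominate: they are 40% of the instruction count but account for 2400 of the 2540 total cycles.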