
COMPUTER ORGANIZATION

MODULE 1: BASIC STRUCTURE OF COMPUTERS

BASIC CONCEPTS
• Computer Architecture (CA) is concerned with the structure and behaviour of the computer.
• CA includes the information formats, the instruction set and techniques for addressing memory.
• In general, CA covers 3 aspects of computer design, namely: 1) Computer Hardware, 2)
Instruction Set Architecture and 3) Computer Organization.
1. Computer Hardware
➢ It consists of electronic circuits, displays, magnetic and optical storage media and communication
facilities.
2. Instruction Set Architecture
➢ It is the programmer-visible machine interface, such as the instruction set, registers, memory organization
and exception handling.
➢ The two main approaches are 1) CISC and 2) RISC.
(CISC → Complex Instruction Set Computer, RISC → Reduced Instruction Set Computer)
3. Computer Organization
➢ It includes the high-level aspects of a design, such as
→ memory-system
→ bus-structure &
→ design of the internal CPU.
➢ It refers to the operational units and their interconnections that realize the architectural
specifications.
➢ It describes the function and design of the various units of a digital computer that store and process
information.

1.1 COMPUTER TYPES


A computer can be defined as a fast electronic calculating machine that accepts digitized input
information (data), processes it according to a list of internally stored instructions and produces the
resulting output information.
The list of instructions is called a program, and the internal storage is called computer memory.
The different types of computers are:
1. Personal computers: This is the most common type, found in homes, schools and business
offices. These are desktop computers with processing and storage units along with various
input and output devices.
2. Notebook computers: These are compact and portable versions of the PC.
3. Workstations: These have high-resolution graphics input/output (I/O) capability, but with the
same dimensions as a desktop computer. They are used in engineering applications, especially
for interactive design work.
4. Enterprise systems: These are used for business data processing in medium to large
corporations that require much more computing power and storage capacity than workstations.
Servers connected to the Internet have become a dominant worldwide source of all types of
information.
5. Supercomputers: These are used for the large-scale numerical calculations required in
applications such as weather forecasting.


FUNCTIONAL UNITS
• A computer consists of 5 functionally independent main parts:
1) Input
2) Memory
3) ALU
4) Output &
5) Control units.

The input device accepts coded information, for example a source program written in a high-level
language. This information is either stored in the memory or used immediately by the processor to perform
the desired operations. The program stored in the memory determines the processing steps. A source
program is translated into an object program, i.e. into machine language, before it is executed.

Finally, the results are sent to the outside world through the output device. All of these actions are
coordinated by the control unit.

Input unit: -
The source program, or more generally any coded information or data, is fed to the computer
through input devices; the keyboard is the most common type. Whenever a key is pressed, the corresponding
letter or digit is translated into its equivalent binary code and transmitted over a cable either to the memory
or to the processor.

Joysticks, trackballs, mice and scanners are other input devices.

Memory unit: -
Its function is to store programs and data. It is basically of two types:

1. Primary memory
2. Secondary memory

1. Primary memory: - This is the memory exclusively associated with the processor. It operates at electronic
speeds, and programs must be stored in it while they are being executed. The memory contains a
large number of semiconductor storage cells, each capable of storing one bit of information. These cells are
processed in groups of fixed size called words.

To provide easy access to a word in memory, a distinct address is associated with each word location.
Addresses are numbers that identify memory location.

The number of bits in each word is called the word length of the computer. Programs must reside in the
memory during execution. Instructions and data can be written into the memory or read out under the
control of the processor.

Memory in which any location can be reached in a short and fixed amount of time after specifying its
address is called random-access memory (RAM).

The time required to access one word is called the memory access time. Memory that can only be read by
the user, and whose contents cannot be altered, is called read-only memory (ROM). It typically holds
software that must always be available, such as the startup portion of the operating system.

Caches are small, fast RAM units that are coupled with the processor and are often contained on
the same IC chip to achieve high performance. Although primary storage is essential, it tends to be
expensive.

2. Secondary memory: - This is used where large amounts of data and programs have to be stored, particularly
information that is accessed infrequently.

Examples: - Magnetic disks and tapes, optical disks (e.g. CD-ROMs), floppy disks etc.

Arithmetic logic unit (ALU):-


Most computer operations, such as addition, subtraction, multiplication and division, are executed in
the ALU of the processor. The operands are brought into the ALU from memory and stored in high-speed
storage elements called registers. Then, according to the instructions, the operation is performed in the
required sequence.

The control unit and the ALU are many times faster than other devices connected to a computer system. This
enables a single processor to control a number of external devices such as keyboards, displays, magnetic
and optical disks, sensors and other mechanical controllers.

Output unit:-
The output unit is the counterpart of the input unit. Its basic function is to send the processed results to the
outside world.

Examples:- Printer, speakers, monitor etc.

Control unit:-
It effectively is the nerve center that sends signals to other units and senses their states. The actual
timing signals that govern the transfer of data between input unit, processor, memory and output unit are
generated by the control unit.


BASIC OPERATIONAL CONCEPTS


• An instruction consists of 2 parts: 1) Operation code (Opcode) and 2) Operands.
OPCODE | OPERANDS
• The data/operands are stored in memory.
• The individual instructions are brought from the memory to the processor.
• Then, the processor performs the specified operation.
• Let us see a typical instruction:
ADD LOCA, R0
• This instruction is an addition operation. The following are the steps to execute the instruction:
Step 1: Fetch the instruction from main-memory into the processor.
Step 2: Fetch the operand at location LOCA from main-memory into the processor.
Step 3: Add the memory operand (i.e. fetched contents of LOCA) to the contents of register R0.
Step 4: Store the result (sum) in R0.
• The same operation can be realized using 2 instructions:
Load LOCA, R1
Add R1, R0
• The following are the steps to execute these instructions:
Step 1: Fetch the instruction from main-memory into the processor.
Step 2: Fetch the operand at location LOCA from main-memory into the register R1.
Step 3: Add the contents of register R1 to the contents of register R0.
Step 4: Store the result (sum) in R0.
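
The two instruction sequences above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any real processor: the dictionaries `memory` and `regs`, the function names, and the sample values 25 and 10 are all assumptions made for this example.

```python
# Illustrative model only: memory and registers are plain dictionaries.
# The sample values (25 in LOCA, 10 in R0) are hypothetical.

def add_mem_to_reg(memory, regs, loc, reg):
    """ADD loc, reg : reg <- [loc] + [reg]."""
    operand = memory[loc]              # fetch the operand from memory
    regs[reg] = operand + regs[reg]    # add, then store the sum in reg


def load(memory, regs, loc, reg):
    """Load loc, reg : reg <- [loc]."""
    regs[reg] = memory[loc]


def add_reg_to_reg(regs, src, dst):
    """Add src, dst : dst <- [src] + [dst]."""
    regs[dst] = regs[src] + regs[dst]


def run_single_instruction():
    memory = {"LOCA": 25}
    regs = {"R0": 10, "R1": 0}
    add_mem_to_reg(memory, regs, "LOCA", "R0")
    return regs["R0"], memory["LOCA"]


def run_two_instructions():
    memory = {"LOCA": 25}
    regs = {"R0": 10, "R1": 0}
    load(memory, regs, "LOCA", "R1")
    add_reg_to_reg(regs, "R1", "R0")
    return regs["R0"], memory["LOCA"]


if __name__ == "__main__":
    print(run_single_instruction())
    print(run_two_instructions())
```

Both sequences leave the same sum in R0 and leave the contents of LOCA unchanged; the two-instruction form additionally overwrites R1.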


MAIN PARTS OF PROCESSOR


• The processor contains an ALU, control-circuitry and many registers.
• The processor contains 'n' general-purpose registers, R0 through Rn-1.
• The IR holds the instruction that is currently being executed.
• The control-unit generates the timing-signals that determine when a given action is to take
place.
• The PC contains the memory-address of the next-instruction to be fetched & executed.
• During the execution of an instruction, the contents of PC are updated to point to the next
instruction.
• The MAR holds the address of the memory-location to be accessed.
• The MDR contains the data to be written into or read out of the addressed location.
• MAR and MDR facilitate the communication with memory.
(IR → Instruction Register, PC → Program Counter)
(MAR → Memory Address Register, MDR → Memory Data Register)

STEPS TO EXECUTE AN INSTRUCTION


1) The address of first instruction (to be executed) gets loaded into PC.
2) The contents of PC (i.e. address) are transferred to the MAR & control-unit issues Read signal to
memory.
3) After certain amount of elapsed time, the first instruction is read out of memory and placed into
MDR.
4) Next, the contents of MDR are transferred to IR. At this point, the instruction can be decoded &
executed.
5) To fetch an operand, its address is placed into MAR & control-unit issues Read signal. As a
result, the operand is transferred from memory into MDR, and then it is transferred from MDR to ALU.
6) Likewise, the required number of operands are fetched into the processor.
7) Finally, ALU performs the desired operation.
8) If the result of this operation is to be stored in the memory, then the result is sent to the MDR.
9) The address of the location where the result is to be stored is sent to the MAR and a Write cycle
is initiated.
10) At some point during execution, contents of PC are incremented to point to next instruction in the
program.
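
The register transfers listed above can be traced with a small toy model. This is a sketch under stated assumptions, not an actual processor design: the class name `ToyProcessor`, the tuple encoding of the instruction, the memory addresses 0 and 100, the data values, and the choice to increment the PC by 1 (i.e. one instruction per address) are all hypothetical.

```python
# Toy model of the PC / MAR / MDR / IR transfers. The memory layout, the
# tuple encoding of the instruction and all values are hypothetical.

class ToyProcessor:
    def __init__(self, memory, start_address):
        self.memory = memory
        self.pc = start_address   # step 1: address of first instruction in PC
        self.mar = None
        self.mdr = None
        self.ir = None
        self.regs = {"R0": 0}
        self.trace = []           # records each memory read for inspection

    def read(self, address):
        """Place the address in MAR, issue Read, receive the word in MDR."""
        self.mar = address
        self.mdr = self.memory[self.mar]
        self.trace.append(f"MAR<-{address}; Read; MDR<-{self.mdr!r}")
        return self.mdr

    def step(self):
        """Fetch and execute one ('ADD', address, register) instruction."""
        self.ir = self.read(self.pc)       # steps 2-4: PC -> MAR, MDR -> IR
        self.pc += 1                       # step 10: point to next instruction
        opcode, address, reg = self.ir     # decode
        if opcode != "ADD":
            raise ValueError(f"unsupported opcode: {opcode}")
        operand = self.read(address)       # step 5: operand via MAR and MDR
        self.regs[reg] = operand + self.regs[reg]   # step 7: ALU operation


if __name__ == "__main__":
    memory = {0: ("ADD", 100, "R0"), 100: 25}
    cpu = ToyProcessor(memory, start_address=0)
    cpu.regs["R0"] = 10
    cpu.step()
    print(cpu.regs["R0"], cpu.pc)
    for line in cpu.trace:
        print(line)
```

The trace shows two memory reads for this instruction: one to fetch the instruction itself and one to fetch the operand.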


BUS STRUCTURE
• A bus is a group of lines that serves as a connecting path for several devices.
• A bus may be lines or wires.
• The lines carry data or address or control signal.
• There are 2 types of Bus structures: 1) Single Bus Structure and 2) Multiple Bus Structure.
1) Single Bus Structure
➢ Because the bus can be used for only one transfer at a time, only 2 units can actively use the bus at
any given time.
➢ Bus control lines are used to arbitrate multiple requests for use of the bus.
➢ Advantages:
1) Low cost &
2) Flexibility for attaching peripheral devices.
2) Multiple Bus Structure
➢ Systems that contain multiple buses achieve more concurrency in operations.
➢ Two or more transfers can be carried out at the same time.
➢ Advantage: Better performance.
➢ Disadvantage: Increased cost.

• The devices connected to a bus vary widely in their speed of operation.
• To synchronize their operational-speed, buffer-registers can be used.
• Buffer Registers
→ are included with the devices to hold the information during transfers.
→ prevent a high-speed processor from being locked to a slow I/O device during data transfers.

PERFORMANCE
• The most important measure of performance of a computer is how quickly it can execute
programs.
• The speed of a computer is affected by the design of
1) Instruction-set.
2) Hardware & the technology in which the hardware is implemented.
3) Software including the operating system.
• Because programs are usually written in a HLL, performance is also affected by the compiler
that translates programs into machine language. (HLL → High Level Language)
• For best performance, it is necessary to design the compiler, machine instruction set and
hardware in a co-ordinated way.


• Let us examine the flow of program instructions and data between the memory & the processor.
• At the start of execution, all program instructions are stored in the main-memory.
• As execution proceeds, instructions are fetched into the processor, and a copy is placed in the
cache.
• Later, if the same instruction is needed a second time, it is read directly from the cache.
• A program will be executed faster if the movement of instructions/data between the main-memory
and the processor is minimized, which is achieved by using the cache.

PROCESSOR CLOCK
• Processor circuits are controlled by a timing signal called a Clock.
• The clock defines regular time intervals called Clock Cycles.
• To execute a machine instruction, the processor divides the action to be performed into a
sequence of basic steps such that each step can be completed in one clock cycle.
• Let P = Length of one clock cycle and R = Clock rate.
• The relation between P and R is given by
$$R = \frac{1}{P}$$
• R is measured in cycles per second.
• Cycles per second is also called Hertz (Hz).

BASIC PERFORMANCE EQUATION


• Let T = Processor time required to execute a program,
N = Actual number of instruction executions,
S = Average number of basic steps needed to execute one machine instruction,
R = Clock rate in cycles per second.
• The program execution time is given by
$$T = \frac{N \times S}{R} \qquad (1)$$
• Equation (1) is referred to as the basic performance equation.
➢ To achieve high performance, the computer designer must reduce the value of T, which means
reducing N and S, and increasing R.
➢ The value of N is reduced if the source program is compiled into fewer machine instructions.
➢ The value of S is reduced if instructions have a smaller number of basic steps to perform.
➢ The value of R can be increased by using a higher frequency clock.
• Care has to be taken while modifying these values, since a change in one parameter may affect the
others.
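
The basic performance equation is easy to check numerically. The following is a minimal sketch; the function names are assumptions made for this example, and the figures used are the ones given in Problems 4 and 5 of this module.

```python
def execution_time(n_instructions, avg_steps, clock_rate_hz):
    """Basic performance equation: T = (N * S) / R, in seconds."""
    return (n_instructions * avg_steps) / clock_rate_hz


def execution_time_from_mix(mix, clock_rate_hz):
    """Same equation when the instruction mix is known.

    mix is a list of (instruction_count, cycles_per_instruction) pairs,
    so the total cycle count N * S is the sum of count * cycles.
    """
    total_cycles = sum(count * cycles for count, cycles in mix)
    return total_cycles / clock_rate_hz


if __name__ == "__main__":
    # N = 1000, S = 20, R = 800 MHz
    print(execution_time(1000, 20, 800e6))
    # 250, 400 and 350 instructions needing 4, 5 and 3 cycles, R = 1 GHz
    print(execution_time_from_mix([(250, 4), (400, 5), (350, 3)], 1e9))
```

Halving S or doubling R halves T in this model, which is why the text warns that the three parameters cannot be tuned independently in practice.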


Problem 1:
List the steps needed to execute the machine instruction:
Add LOCA, R0
in terms of transfers between the components of the processor and some simple control commands. Assume
that the address of the memory-location containing this instruction is initially in register PC.
Solution:
1. Transfer the contents of register PC to register MAR.
2. Issue a Read command to memory, and then wait until it has transferred the requested word into
register MDR.
3. Transfer the instruction from MDR into IR and decode it.
4. Transfer the address LOCA from IR to MAR.
5. Issue a Read command and wait until MDR is loaded.
6. Transfer contents of MDR to the ALU.
7. Transfer contents of R0 to the ALU.
8. Perform addition of the two operands in the ALU and transfer result into R0.
9. Transfer contents of PC to ALU.
10. Add 1 to operand in ALU and transfer incremented address to PC.

Problem 2:
List the steps needed to execute the machine instruction:
Add R1, R2, R3
in terms of transfers between the components of the processor and some simple control commands. Assume
that the address of the memory-location containing this instruction is initially in register PC.
Solution:
1. Transfer the contents of register PC to register MAR.
2. Issue a Read command to memory, and then wait until it has transferred the requested word into
register MDR.
3. Transfer the instruction from MDR into IR and decode it.
4. Transfer contents of R1 and R2 to the ALU.
5. Perform addition of the two operands in the ALU and transfer the answer into R3.
6. Transfer contents of PC to ALU.
7. Add 1 to operand in ALU and transfer incremented address to PC.

Problem 3:
(a) Give a short sequence of machine instructions for the task “Add the contents of memory-location A
to those of location B, and place the answer in location C”. Instructions:
Load LOC, Ri and
Store Ri, LOC
are the only instructions available to transfer data between memory and the general purpose registers.
Add instructions are described in Section 1.3. Do not change contents of either location A or B.
(b) Suppose that Move and Add instructions are available with the formats:
Move Location1, Location2 and
Add Location1, Location2
These instructions move or add a copy of the operand at the first location to the second location,
overwriting the original operand at the second location. Either or both of the operands can be in the memory
or the general-purpose registers. Is it possible to use fewer instructions of these types to accomplish the
task in part (a)? If yes, give the sequence.

Solution:
(a)
Load A, R0
Load B, R1
Add R0, R1
Store R1, C
(b) Yes;
Move B, C
Add A, C

Problem 4:
A program contains 1000 instructions. Of these, 25% require 4 clock cycles, 40%
require 5 clock cycles and the remaining instructions require 3 clock cycles for execution. Find the total time
required to execute the program on a 1 GHz machine.
Solution:
N = 1000
25% of N = 250 instructions require 4 clock cycles.
40% of N = 400 instructions require 5 clock cycles.
35% of N = 350 instructions require 3 clock cycles.
$$T = \frac{N \times S}{R} = \frac{250 \times 4 + 400 \times 5 + 350 \times 3}{1 \times 10^{9}} = \frac{1000 + 2000 + 1050}{1 \times 10^{9}} = 4.05\ \mu s$$

Problem 5:
For the following processor, obtain the performance.
Clock rate = 800 MHz
No. of instructions executed = 1000
Average no of steps needed / machine instruction = 20
Solution:
$$T = \frac{N \times S}{R} = \frac{1000 \times 20}{800 \times 10^{6}} = 25\ \mu s$$

Problem 6
(a) Suppose that execution time for a program is proportional to instruction fetch time. Assume that
fetching an instruction from the cache takes 1 time unit, but fetching it from the main-memory takes 10
time units. Also, assume that a requested instruction is found in the cache with probability 0.96. Finally,
assume that if an instruction is not found in the cache it must first be fetched from the main- memory into
the cache and then fetched from the cache to be executed. Compute the ratio of program execution time
without the cache to program execution time with the cache. This ratio is called the speedup resulting from
the presence of the cache.
(b) If the size of the cache is doubled, assume that the probability of not finding a requested
instruction there is cut in half. Repeat part (a) for a doubled cache size.
Solution:
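(a) Let the cache access time be 1 and the main-memory access time be 10. Every instruction that is executed
must be fetched from the cache, and an additional fetch from the main-memory must be performed for 4%
of these cache accesses.
Therefore,
$$\text{Speedup} = \frac{1.0 \times 10}{(1.0 \times 1) + (0.04 \times 10)} = \frac{10}{1.4} \approx 7.14$$
(b) With the doubled cache, the probability of a miss is 0.02, so
$$\text{Speedup} = \frac{1.0 \times 10}{(1.0 \times 1) + (0.02 \times 10)} = \frac{10}{1.2} \approx 8.33$$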

MACHINE INSTRUCTIONS & PROGRAMS

Numbers, Arithmetic Operations, and Characters:


Computers are built using logic circuits that operate on information represented by two-valued electrical
signals, such as 0 V and +5 V; these two values are labeled 0 and 1. The amount of information represented by
such a signal is a bit of information, where bit stands for binary digit. The most natural way to represent a
number in a computer system is by a string of bits, called a binary number. A text character can also be
represented by a string of bits called a character code.

NUMBER REPRESENTATION:

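Consider an n-bit vector
$$B = b_{n-1} \ldots b_1 b_0$$
where $b_i$ = 0 or 1 for $0 \le i \le n-1$. This vector can represent unsigned integer values V in the
range 0 to $2^n - 1$, where
$$V(B) = b_{n-1} \times 2^{n-1} + \cdots + b_1 \times 2^1 + b_0 \times 2^0$$

We need to represent both positive and negative numbers. Three systems are used
for representing such numbers: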

•Sign-and-magnitude
•1's-complement
•2's-complement

In all three systems, the leftmost bit is 0 for positive numbers and 1 for negative numbers. Fig 2.1
illustrates all three representations using 4-bit numbers. Positive values have identical representations in all
systems, but negative values have different representations. In the sign-and-magnitude system, negative
values are represented by changing the most significant bit ($b_3$ in Figure 2.1) from 0 to 1 in the B vector of
the corresponding positive value. For example, +5 is represented by 0101, and -5 is represented by 1101. In
1's-complement representation, negative values are obtained by complementing each bit of the
corresponding positive number. Thus, the representation for -3 is obtained by complementing each bit in
the vector 0011 to yield 1100. Clearly, the same operation, bit complementing, is done in converting a
negative number to the corresponding positive value. Converting either way is referred to as forming the
1's-complement of a given number. Finally, in the 2's-complement system, forming the 2's-complement of
an n-bit number is done by subtracting that number from $2^n$. For example, the 2's-complement of 0011 (+3)
is 10000 - 0011 = 1101, which represents -3.
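
The three representations can be generated with a few lines of Python. This is an illustrative sketch; the function names and the default width of 4 bits are assumptions chosen to match the 4-bit examples in the text.

```python
def sign_and_magnitude(value, n=4):
    """n-bit sign-and-magnitude pattern of value, as a bit string."""
    limit = 2 ** (n - 1) - 1
    if not -limit <= value <= limit:
        raise ValueError("value out of range")
    sign = 1 if value < 0 else 0
    return format((sign << (n - 1)) | abs(value), f"0{n}b")


def ones_complement(value, n=4):
    """n-bit 1's-complement pattern: complement every bit of |value|."""
    limit = 2 ** (n - 1) - 1
    if not -limit <= value <= limit:
        raise ValueError("value out of range")
    mask = (1 << n) - 1
    pattern = value if value >= 0 else (~abs(value)) & mask
    return format(pattern, f"0{n}b")


def twos_complement(value, n=4):
    """n-bit 2's-complement pattern: subtract |value| from 2**n."""
    if not -(2 ** (n - 1)) <= value <= 2 ** (n - 1) - 1:
        raise ValueError("value out of range")
    pattern = value if value >= 0 else 2 ** n - abs(value)
    return format(pattern, f"0{n}b")


if __name__ == "__main__":
    for v in (5, -5, -3):
        print(v, sign_and_magnitude(v), ones_complement(v), twos_complement(v))
```

Only the 2's-complement system can represent -8 in 4 bits, since the other two systems spend one pattern on a second representation of zero.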


Addition of Positive numbers:-


Consider adding two 1-bit numbers. The results are shown in Figure 2.2. Note that the sum of 1 and 1
requires the 2-bit vector 10 to represent the value 2. We say that the sum is 0 and the carry-out is 1. In
order to add multiple-bit numbers, we use a method analogous to that used for manual computation with
decimal numbers. We add bit pairs starting from the low-order (right) end of the bit vectors, propagating
carries toward the high-order (left) end.

   0      1      0      1
 + 0    + 0    + 1    + 1
 ___    ___    ___    ___
   0      1      1     10
                       ↑
                   Carry-out
Figure 2.2 Addition of 1-bit numbers.

Addition and Subtraction of Signed Integers

We introduced three systems for representing positive and negative numbers, or, simply, signed
numbers. These systems differ only in the way they represent negative values. Their relative merits from
the standpoint of ease of performing arithmetic operations can be summarized as follows. The sign-and-
magnitude system is the simplest representation, but it is also the most awkward for addition and
subtraction operations. The 1’s-complement method is somewhat better. The 2’s-complement system is the
most efficient method for performing addition and subtraction operations.
To understand 2’s-complement arithmetic, consider addition modulo N (abbreviated as mod N). A
helpful graphical device for the description of addition of unsigned integers mod N is a circle with the
values 0 through N − 1 marked along its perimeter. Consider the case N = 16, shown in part (b) of the figure.
The decimal values 0 through 15 are represented by their 4-bit binary values 0000 through 1111 around the
outside of the circle. In terms of decimal values, the operation (7 + 5) mod 16 yields the value 12. To
perform this operation graphically, locate 7 (0111) on the outside of the circle and then move 5 units in the
clockwise direction to arrive at the answer 12 (1100).

Similarly, (9 + 14) mod 16 = 7; this is modeled on the circle by locating 9 (1001) and
moving 14 units in the clockwise direction past the zero position to arrive at the answer
7 (0111). This graphical technique works for the computation of (a + b) mod 16 for any
unsigned integers a and b; that is, to perform addition, locate a and move b units in the
clockwise direction to arrive at (a + b) mod 16.

Now consider a different interpretation of the mod 16 circle. We will reinterpret the
binary vectors outside the circle to represent the signed integers from −8 through +7 in the
2’s-complement representation as shown inside the circle.

Let us apply the mod 16 addition technique to the example of adding +7 to −3.

The 2’s-complement representations for these numbers are 0111 and 1101, respectively. To add
these numbers, locate 0111 on the circle in Figure 1.5b. Then move 1101 (13) steps in the
clockwise direction to arrive at 0100, which yields the correct answer of +4. Note that the
2’s-complement representation of −3 is interpreted as an unsigned value for the number of
steps to move.

If we perform this addition by adding bit pairs from right to left, we obtain
    0 1 1 1
  + 1 1 0 1
  ---------
  1 0 1 0 0
  ↑
  Carry-out
If we ignore the carry-out from the fourth bit position in this addition, we obtain the correct
answer. In fact, this is always the case. Ignoring this carry-out is a natural result of using
mod N arithmetic. As we move around the circle in Figure 1.5b, the value next to 1111
would normally be 10000. Instead, we go back to the value 0000.

The rules governing addition and subtraction of n-bit signed numbers using the 2’s-complement
representation system may be stated as follows:

• To add two numbers, add their n-bit representations, ignoring the carry-out bit from
the most significant bit (MSB) position. The sum will be the algebraically correct value in
2’s-complement representation if the actual result is in the range $-2^{n-1}$ through $+2^{n-1} - 1$.

• To subtract two numbers X and Y, that is, to perform X − Y, form the 2’s-complement
of Y, then add it to X using the add rule. Again, the result will be the algebraically correct
value in 2’s-complement representation if the actual result is in the range $-2^{n-1}$ through
$+2^{n-1} - 1$.
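
The add and subtract rules can be sketched directly on n-bit patterns. The function names and the default width of 4 bits are assumptions made for this example; the patterns used are the ones from the +7 and −3 example in the text.

```python
def to_signed(pattern, n=4):
    """Interpret an n-bit pattern as a 2's-complement integer."""
    return pattern - (1 << n) if pattern & (1 << (n - 1)) else pattern


def add_nbit(x, y, n=4):
    """Add two n-bit patterns; return (n-bit sum, carry-out).

    The carry-out is returned only for inspection. Under the add rule it
    is ignored, which is the same as reducing the sum mod 2**n.
    """
    total = x + y
    return total & ((1 << n) - 1), total >> n


def subtract_nbit(x, y, n=4):
    """X - Y: form the 2's-complement of Y, then apply the add rule."""
    mask = (1 << n) - 1
    neg_y = ((~y) + 1) & mask
    return add_nbit(x, neg_y, n)


if __name__ == "__main__":
    s, carry = add_nbit(0b0111, 0b1101)          # (+7) + (-3)
    print(format(s, "04b"), carry, to_signed(s))
    d, carry = subtract_nbit(0b0010, 0b0101)     # (+2) - (+5)
    print(format(d, "04b"), carry, to_signed(d))
```

The results are algebraically correct only when the true answer lies in the representable range given in the rules above.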

Overflow in Integer Arithmetic:

Using 2’s-complement representation, n bits can represent values in the range $-2^{n-1}$ to $+2^{n-1} - 1$.

For example, the range of numbers that can be represented by 4 bits is −8 through +7. When the actual
result of an arithmetic operation is outside the representable range, an arithmetic overflow has occurred.

When adding unsigned numbers, a carry-out of 1 from the most significant bit position
indicates that an overflow has occurred. However, this is not always true when adding signed
numbers. For example, using 2’s-complement representation for 4-bit signed numbers, if
we add +7 and +4, the sum vector is 1011, which is the representation for −5, an incorrect
result. In this case, the carry-out bit from the MSB position is 0. If we add −4 and −6, we get
0110 = +6, also an incorrect result. In this case, the carry-out bit is 1.

Hence, the value of the carry-out bit from the sign-bit position is not an indicator of overflow.
Clearly, overflow may occur only if both summands have the same sign. The addition of
numbers with different signs cannot cause overflow because the result is always within the
representable range.

These observations lead to the following way to detect overflow when adding two
numbers in 2’s-complement representation. Examine the signs of the two summands and
the sign of the result. When both summands have the same sign, an overflow has occurred
when the sign of the sum is not the same as the signs of the summands.
When subtracting two numbers, the testing method needed for detecting overflow has
to be modified somewhat; but it is still quite straightforward.
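
The sign-comparison rule for detecting overflow in addition can be sketched as follows. The function name and return layout are assumptions made for this example; the test values are the 4-bit cases discussed above.

```python
def add_with_overflow(x, y, n=4):
    """Add two signed integers as n-bit 2's-complement values.

    Returns (result, carry_out, overflow), where result is the signed
    value of the n-bit sum vector.
    """
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)
    a, b = x & mask, y & mask          # n-bit patterns of the summands
    total = a + b
    s = total & mask                   # n-bit sum vector
    carry_out = total >> n
    # Overflow: both summands have the same sign, and the sign of the
    # sum differs from it.
    same_sign = (a & sign_bit) == (b & sign_bit)
    overflow = same_sign and (s & sign_bit) != (a & sign_bit)
    result = s - (1 << n) if s & sign_bit else s
    return result, carry_out, overflow


if __name__ == "__main__":
    print(add_with_overflow(7, 4))     # overflow with carry-out 0
    print(add_with_overflow(-4, -6))   # overflow with carry-out 1
    print(add_with_overflow(7, -3))    # no overflow, carry-out 1
```

The three cases show that the carry-out takes both values with and without overflow, so it cannot serve as the overflow indicator for signed addition.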

Floating-Point Numbers and Operations:

The descriptions provided here are based on the 2008 version of IEEE (Institute of Electrical and
Electronics Engineers) Standard 754, labeled 754-2008 [4].
A binary floating-point number can be represented by
➢ A sign for the number
➢ Some significant bits
➢ A signed scale factor exponent for an implied base of 2

The basic IEEE format is a 32-bit representation, shown in Figure 9.26a. The leftmost bit represents the
sign, S, for the number. The next 8 bits, E′, represent the signed exponent of the scale factor (with an
implied base of 2), and the remaining 23 bits, M, are the fractional part of the significant bits. The full 24-
bit string, B, of significant bits, called the mantissa, always has a leading 1, with the binary point
immediately to its right. Therefore, the mantissa
$$B = 1.M = 1.b_{-1} b_{-2} \ldots b_{-23}$$
has the value
$$V(B) = 1 + b_{-1} \times 2^{-1} + b_{-2} \times 2^{-2} + \cdots + b_{-23} \times 2^{-23}$$


By convention, when the binary point is placed to the right of the first significant bit, the number is said
to be normalized. Note that the base, 2, of the scale factor and the leading 1 of the mantissa are both fixed.
They do not need to appear explicitly in the representation. Instead of the actual signed exponent, E, the
value stored in the exponent field is an unsigned integer E′ = E + 127. This is called the excess-127 format.
Thus, E′ is in the range 0 ≤ E′ ≤ 255. The end values of this range, 0 and 255, are used to represent special
values, as described later. Therefore, the range of E′ for normal values is 1 ≤ E′ ≤ 254. This means that the
actual exponent, E, is in the range −126 ≤ E ≤ 127. The use of the excess-127 representation for exponents
simplifies comparison of the relative sizes of two floating-point numbers. (See Problem 9.23.)

The 32-bit standard representation in Figure 9.26a is called a single-precision representation because it
occupies a single 32-bit word. The scale factor has a range of $2^{-126}$ to $2^{+127}$, which is approximately
equal to $10^{\pm 38}$. The 24-bit mantissa provides approximately the same precision as a 7-digit decimal value.
An example of a single-precision floating-point number is shown in Figure 9.26b.

To provide more precision and range for floating-point numbers, the IEEE standard also specifies a
double-precision format, as shown in Figure 9.26c. The double-precision format has increased exponent and
mantissa ranges. The 11-bit excess-1023 exponent E′ has the range 1 ≤ E′ ≤ 2046 for normal values, with 0
and 2047 used to indicate special values, as before. Thus, the actual exponent E is in the range
−1022 ≤ E ≤ 1023, providing scale factors of $2^{-1022}$ to $2^{1023}$ (approximately $10^{\pm 308}$). The 53-bit
mantissa provides a precision equivalent to about 16 decimal digits.

A computer must provide at least single-precision representation to conform to the IEEE standard.
Double-precision representation is optional. The standard also specifies certain optional extended versions
of both of these formats. The extended versions provide increased precision and increased exponent range
for the representation of intermediate values in a sequence of calculations. The use of extended formats
helps to reduce the size of the accumulated round-off error in a sequence of calculations leading to a
desired result. For example, the dot product of two vectors of numbers involves accumulating a sum of
products. The input vector components are given in a standard precision, either single or double, and the
final answer (the dot product) is truncated to the same precision. All intermediate calculations should be
done using extended precision to limit accumulation of errors. Extended formats also enhance the accuracy
of evaluation of elementary functions such as sine, cosine, and so on. This is because they are usually
evaluated by adding up a number of terms in a series representation. In addition to requiring the four basic
arithmetic operations, the standard requires three additional operations to be provided: remainder, square
root, and conversion between binary and decimal representations.
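
The single-precision field layout described above (1 sign bit, an 8-bit excess-127 exponent, a 23-bit fraction) can be inspected with Python's standard `struct` module. This is a minimal sketch; the function names are assumptions made for this example, and -6.5 is just a convenient sample value.

```python
import struct


def decode_single(x):
    """Split a float, rounded to IEEE single precision, into its fields.

    Returns (sign, stored_exponent, fraction, actual_exponent), where
    stored_exponent is the excess-127 field and fraction is the 23-bit
    field M as an integer.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    stored_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, stored_exponent, fraction, stored_exponent - 127


def value_of_normal(sign, stored_exponent, fraction):
    """Value of a normal number: (-1)^S x 1.M x 2^(E' - 127)."""
    if not 1 <= stored_exponent <= 254:
        raise ValueError("not a normal number")
    mantissa = 1 + fraction / 2 ** 23          # restore the implied leading 1
    return (-1) ** sign * mantissa * 2.0 ** (stored_exponent - 127)


if __name__ == "__main__":
    fields = decode_single(-6.5)               # -6.5 = -1.101 (binary) x 2^2
    print(fields)
    print(value_of_normal(*fields[:3]))
```

For -6.5 the stored exponent is 129, which corresponds to the actual exponent 2 after subtracting the excess of 127.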

We note two basic aspects of operating with floating-point numbers. First, if a number is not
normalized, it can be put in normalized form by shifting the binary point and adjusting the exponent.
Figure 9.27 shows an unnormalized value, 0.0010110 ... × $2^9$, and its normalized version, 1.0110 ... × $2^6$.
Since the scale factor is in the form $2^i$, shifting the mantissa right or left by one bit position is compensated
by an increase or a decrease of 1 in the exponent, respectively. Second, as computations proceed, a number
that does not fall in the representable range of normal numbers might be generated. In single precision, this
means that its normalized representation requires an exponent less than −126 or greater than +127. In the
first case, we say that underflow has occurred, and in the second case, we say that overflow has occurred.

Special Values:

The end values 0 and 255 of the excess-127 exponent E′ are used to represent special values. When E′ = 0
and the mantissa fraction M is zero, the value 0 is represented. When E′ = 255 and M = 0, the value ∞ is
represented, where ∞ is the result of dividing a normal number by zero. The sign bit is still used in these
representations, so there are representations for ±0 and ±∞. When E′ = 0 and M ≠ 0, denormal numbers are
represented. Their value is ±0.M × $2^{-126}$. Therefore, they are smaller than the smallest normal number.
There is no implied one to the left of the binary point, and M is any nonzero 23-bit fraction. The purpose of
introducing denormal numbers is to allow for gradual underflow, providing an extension of the range of
normal representable numbers. This is useful in dealing with very small numbers, which may be needed in
certain situations. When E′ = 255 and M ≠ 0, the value represented is called Not a Number (NaN). A NaN
represents the result of performing an invalid operation such as 0/0 or √−1.
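
The rules for special values amount to a small decision table on the exponent and fraction fields. The sketch below classifies a 32-bit single-precision pattern; the function names and the category labels are assumptions made for this example.

```python
import struct


def classify_single(bits):
    """Classify a 32-bit single-precision pattern by its E' and M fields."""
    stored_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if stored_exponent == 0:
        return "zero" if fraction == 0 else "denormal"
    if stored_exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normal"


def bits_of(x):
    """32-bit single-precision pattern of a Python float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits


if __name__ == "__main__":
    for pattern in (0x00000000, 0x80000000, 0x00000001,
                    0x7F800000, 0x7FC00000, bits_of(1.0)):
        print(f"{pattern:08X}", classify_single(pattern))
```

The sign bit plays no part in the classification, which is why both +0 and −0 (and both +∞ and −∞) exist as distinct patterns.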

Exceptions :


In conforming to the IEEE Standard, a processor must set exception flags if any of the following
conditions arise when performing operations: underflow, overflow, divide by zero, inexact, invalid. We
have already mentioned the first three. Inexact is the name for a result that requires rounding in order to be
represented in one of the normal formats. An invalid exception occurs if operations such as 0/0 or √−1 are
attempted. When an exception occurs, the result is set to one of the special values. If interrupts are enabled
for any of the exception flags, system or user-defined routines are entered when the associated exception
occurs. Alternatively, the application program can test for the occurrence of exceptions, as necessary, and
decide how to proceed.

MEMORY-LOCATIONS & ADDRESSES


• Memory consists of many millions of storage cells (flip-flops).
• Each cell can store a bit of information i.e. 0 or 1 (Figure 2.1).
• Each group of n bits is referred to as a word of information, and n is called the word length.
• The word length can vary from 8 to 64 bits.
• A unit of 8 bits is called a byte.
• Accessing the memory to store or retrieve a single item of information (word/byte) requires
distinct addresses for each item location. (It is customary to use numbers from 0 through 2^k − 1 as the
addresses of successive locations in the memory.)
• If 2^k = no. of addressable locations,
then the 2^k addresses constitute the address-space of the computer.
For example, a 24-bit address generates an address-space of 2^24 locations (16M locations, i.e. 16 MB in a byte-addressable memory).


BYTE-ADDRESSABILITY
• In byte-addressable memory, successive addresses refer to successive byte locations in the
memory.
• Byte locations have addresses 0, 1, 2. . . . .
• If the word-length is 32 bits, successive words are located at addresses 0, 4, 8. . with each word
having 4 bytes.

BIG-ENDIAN & LITTLE-ENDIAN ASSIGNMENTS


• There are two ways in which byte-addresses are arranged (Figure 2.3).
1) Big-Endian: Lower byte-addresses are used for the more significant bytes of the word.
2) Little-Endian: Lower byte-addresses are used for the less significant bytes of the word
• In both cases, byte-addresses 0, 4, 8, . . . are taken as the addresses of successive words in the
memory.


Consider a 32-bit integer (in hex): 0x12345678, which consists of 4 bytes: 12, 34, 56, and 78.
➢ Hence this integer will occupy 4 bytes in memory.
➢ Assume we store it starting at memory address 1000.
➢ On little-endian, memory will look like
Address Value
1000 78
1001 56
1002 34
1003 12

➢ On big-endian, memory will look like
Address Value
1000 12
1001 34
1002 56
1003 78
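
The two layouts can be reproduced with a short Python sketch. It is an illustration only: the helper name memory_view and the starting address 1000 are taken from the example, not from any real machine.

```python
value = 0x12345678

# int.to_bytes lays the integer out in the requested byte order.
little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")

def memory_view(data, start=1000):
    """Return (address, byte) pairs, with each byte shown as two hex digits."""
    return [(start + i, format(b, "02X")) for i, b in enumerate(data)]

little_layout = memory_view(little)
big_layout = memory_view(big)
```

Reading the bytes back with int.from_bytes and the same byte order recovers the original value, which is why both assignments are equally valid as long as they are used consistently.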

WORD ALIGNMENT
• Words are said to be Aligned in memory if they begin at a byte-address that is a multiple of the
number of bytes in a word.
• For example,
➢ If the word length is 16 bits (2 bytes), aligned words begin at byte-addresses 0, 2, 4 . . . . .
➢ If the word length is 64 bits (8 bytes), aligned words begin at byte-addresses 0, 8, 16 . . . . .
• Words are said to have Unaligned Addresses, if they begin at an arbitrary byte-address.
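
As a small sketch of the alignment rule (the function name is_aligned is hypothetical), an address is aligned when it is a multiple of the number of bytes in a word:

```python
def is_aligned(address, word_length_bits):
    """True if address is a multiple of the word size in bytes."""
    word_bytes = word_length_bits // 8
    return address % word_bytes == 0
```

With 64-bit words, address 16 is aligned but address 12 is not, even though 12 would be aligned for 32-bit words.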

ACCESSING NUMBERS, CHARACTERS & CHARACTER STRINGS


• A number usually occupies one word. It can be accessed in the memory by specifying its word
address. Similarly, individual characters can be accessed by their byte-address.
• There are two ways to indicate the length of the string:
1) A special control character with the meaning "end of string" can be used as the last character in the
string.
2) A separate memory word location or register can contain a number indicating the length of the
string in bytes.
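
The two conventions can be compared with a short Python sketch. It assumes a byte-addressable memory modelled as a bytearray and uses the value 0 as the "end of string" control character; both choices, and the function names, are illustrative assumptions.

```python
def read_terminated(memory, start, terminator=0):
    """Read characters until the end-of-string control character is found."""
    chars = []
    addr = start
    while memory[addr] != terminator:
        chars.append(chr(memory[addr]))
        addr += 1
    return "".join(chars)

def read_counted(memory, start, length):
    """Read a string whose length in bytes is kept in a separate location."""
    return "".join(chr(b) for b in memory[start:start + length])

memory = bytearray(b"HELLO\x00WORLD")
first = read_terminated(memory, 0)   # stops at the 0 byte after "HELLO"
second = read_counted(memory, 6, 5)  # length 5 is supplied separately
```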

MEMORY OPERATIONS
• Two memory operations are:
1) Load (Read/Fetch) &
2) Store (Write).
• The Load operation transfers a copy of the contents of a specific memory-location to the
processor. The memory contents remain unchanged.
• Steps for Load operation:
1) Processor sends the address of the desired location to the memory.
2) Processor issues 'read' signal to memory to fetch the data.
3) Memory reads the data stored at that address.
4) Memory sends the read data to the processor.
• The Store operation transfers the information from the register to the specified memory-location.
This will destroy the original contents of that memory-location.
• Steps for Store operation are:
1) Processor sends the address of the memory-location where it wants to store data.
2) Processor issues 'write' signal to memory to store the data.
3) Content of register(MDR) is written into the specified memory-location.

INSTRUCTIONS & INSTRUCTION SEQUENCING


• A computer must have instructions capable of performing 4 types of operations:
1) Data transfers between the memory and the registers (MOV, PUSH, POP, XCHG).
2) Arithmetic and logic operations on data (ADD, SUB, MUL, DIV, AND, OR, NOT).
3) Program sequencing and control (CALL, RET, LOOP, INT).
4) I/O transfers (IN, OUT).

REGISTER TRANSFER NOTATION (RTN)

• The possible locations in which transfer of information occurs are: 1) Memory-location, 2)
Processor register & 3) Registers in I/O device.

Location      | Hardware Binary Address | Example        | Description
Memory        | LOC, PLACE, NUM         | R1 ← [LOC]     | Contents of memory-location LOC are transferred into register R1.
Processor     | R0, R1, R2              | R3 ← [R1]+[R2] | Add the contents of registers R1 & R2 and place their sum into R3.
I/O Registers | DATAIN, DATAOUT         | R1 ← [DATAIN]  | Contents of I/O register DATAIN are transferred into register R1.

ASSEMBLY LANGUAGE NOTATION

• To represent machine instructions and programs, assembly language format is used.

Assembly Language Format | Description
Move LOC, R1             | Transfer data from memory-location LOC to register R1. The contents of LOC are unchanged by the execution of this instruction, but the old contents of register R1 are overwritten.
Add R1, R2, R3           | Add the contents of registers R1 and R2, and place their sum into register R3.


BASIC INSTRUCTION TYPES

For each instruction type: Syntax, Example, Description, and Instructions for Operation C<-[A]+[B].

1) Three Address
→ Syntax: Opcode Source1,Source2,Destination
→ Example: Add A,B,C
→ Description: Add the contents of memory-locations A & B. Then, place the result into location C.
→ Instructions for operation C<-[A]+[B]: Add A,B,C

2) Two Address
→ Syntax: Opcode Source, Destination
→ Example: Add A,B
→ Description: Add the contents of memory-locations A & B. Then, place the result into location B, replacing the original contents of this location. Operand B is both a source and a destination.
→ Instructions for operation C<-[A]+[B]:
Move B, C
Add A, C

3) One Address
→ Syntax: Opcode Source/Destination
→ Examples:
Load A ;Copy contents of memory-location A into accumulator.
Add B ;Add contents of memory-location B to contents of accumulator register & place sum back into accumulator.
Store C ;Copy the contents of the accumulator into location C.
→ Instructions for operation C<-[A]+[B]:
Load A
Add B
Store C

4) Zero Address
→ Syntax: Opcode [no Source/Destination]
→ Example: Push
→ Description: Locations of all operands are defined implicitly. The operands are stored in a pushdown stack.
→ Instructions for operation C<-[A]+[B]: Not possible


 Access to data in the registers is much faster than to data stored in memory-locations.
• Let Ri represent a general-purpose register. The instructions:

Load A,Ri
Store Ri,A
Add A,Ri

are generalizations of the Load, Store and Add Instructions for the single-accumulator case, in which
register Ri performs the function of the accumulator.

• In processors where arithmetic operations are allowed only on operands that are in registers, the
task C<-[A]+[B] can be performed by the instruction sequence:
Move A,Ri
Move B,Rj
Add Ri,Rj
Move Rj,C

INSTRUCTION EXECUTION & STRAIGHT LINE SEQUENCING

• The program is executed as follows:


1) Initially, the address of the first instruction is loaded into PC (Figure 2.8).
2) Then, the processor control circuits use the information in the PC to fetch and execute
instructions, one at a time, in the order of increasing addresses. This is called Straight-Line sequencing.
3) During the execution of each instruction, PC is incremented by 4 to point to next instruction.

• There are 2 phases for Instruction Execution:


1) Fetch Phase: The instruction is fetched from the memory-location and placed in the IR.
2) Execute Phase: The contents of IR are examined to determine which operation is to be performed.
The specified-operation is then performed by the processor.
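
The fetch and execute phases can be sketched as a small Python loop. This is an illustration under stated assumptions, not a model of any real processor: instructions are represented as (operation, source, destination) tuples, each instruction occupies 4 bytes, names of the form R0, R1, ... denote registers, and the helper names run and is_register are hypothetical.

```python
def is_register(name):
    """Names such as R0 or R12 are treated as processor registers."""
    return name.startswith("R") and name[1:].isdigit()

def run(program, memory, start=0):
    """Fetch and execute instructions in order of increasing addresses."""
    regs = {}
    pc = start
    while pc in program:
        ir = program[pc]      # fetch phase: the instruction is placed in IR
        pc += 4               # PC now points to the next instruction
        op, src, dst = ir     # execute phase: examine the contents of IR
        if op == "Move":
            value = regs[src] if is_register(src) else memory[src]
            if is_register(dst):
                regs[dst] = value
            else:
                memory[dst] = value
        elif op == "Add":
            regs[dst] = regs[src] + regs[dst]
        else:
            raise ValueError("unknown operation: " + op)
    return regs

# C <- [A] + [B] using registers only for the arithmetic.
program = {
    0: ("Move", "A", "R0"),
    4: ("Move", "B", "R1"),
    8: ("Add", "R0", "R1"),
    12: ("Move", "R1", "C"),
}
memory = {"A": 5, "B": 7, "C": 0}
registers = run(program, memory)
```

Execution stops when the PC reaches an address that holds no instruction, which stands in for the end of the program in this sketch.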


Program Explanation
• Consider the program for adding a list of n numbers (Figure 2.9).
• The addresses of the memory-locations containing the n numbers are symbolically given as
NUM1, NUM2, ..., NUMn.
• A separate Add instruction is used to add each number to the contents of register R0.
• After all the numbers have been added, the result is placed in memory-location SUM.

BRANCHING
• Consider the task of adding a list of 'n' numbers (Figure 2.10).
• The number of entries in the list, 'n', is stored in memory-location N.
• Register R1 is used as a counter to determine the number of times the loop is executed.
• The contents of location N are loaded into register R1 at the beginning of the program.
• The Loop is a straight-line sequence of instructions executed as many times as needed. The loop
starts at location LOOP and ends at the instruction Branch>0.
• During each pass,
→ address of the next list entry is determined and
→ that entry is fetched and added to R0.
• The instruction Decrement R1 reduces the contents of R1 by 1 each time through the loop.
• Then, the Branch instruction loads a new value into the program counter. As a result, the processor
fetches and executes the instruction at this new address, called the Branch Target.

A Conditional Branch Instruction causes a branch only if a specified condition is satisfied. If the
condition is not satisfied, the PC is incremented in the normal way, and the next instruction in sequential
address order is fetched and executed.
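
The effect of a conditional branch on the PC can be sketched in Python as follows. The addresses 100 and 108, the 4-byte instruction size, and the function names are assumptions made for the illustration.

```python
def next_pc(pc, branch_taken, target, instruction_size=4):
    """PC after a conditional branch located at address pc."""
    return target if branch_taken else pc + instruction_size

def count_passes(n, loop_address=100, branch_address=108):
    """Run a counted loop that ends with Branch>0 and count its passes."""
    r1 = n                      # counter loaded from location N
    passes = 0
    while True:
        passes += 1             # the loop body executes once per pass
        r1 -= 1                 # Decrement R1
        pc = next_pc(branch_address, r1 > 0, loop_address)  # Branch>0 LOOP
        if pc != loop_address:  # condition failed: fall through
            break
    return passes, pc
```

With n = 3 the branch is taken twice and falls through on the third pass, leaving the PC at the instruction that follows the branch.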

MODULE 2:

GENERATING MEMORY ADDRESSES:-


The purpose of the instruction block at LOOP is to add a different number from the list during each
pass through the loop. Hence, the Add instruction in the block must refer to a different address during each
pass. How are the addresses to be specified? The memory operand address cannot be given directly in a
single Add instruction in the loop. Otherwise, it would need to be modified on each pass through the loop.

The instruction set of a computer typically provides a number of methods for specifying operand addresses, called addressing
modes. While the details differ from one computer to another, the underlying concepts are the same.

ADDRESSING MODES:-
In general, a program operates on data that reside in the computer’s memory. These data can be
organized in a variety of ways. If we want to keep track of students’ names, we can write them in a list.
Programmers use organizations called data structures to represent the data used in computations. These
include lists, linked lists, arrays, queues, and so on.

Programs are normally written in a high-level language, which enables the programmer to use
constants, local and global variables, pointers, and arrays. The different ways in which the location of an
operand is specified in an instruction are referred to as addressing modes.


IMPLEMENTATION OF VARIABLE AND CONSTANTS

• Variable is represented by allocating a memory-location to hold its value.


• Thus, the value can be changed as needed using appropriate instructions.
• There are 2 accessing modes to access the variables:
1) Register Mode
2) Absolute Mode

Register Mode
• The operand is the contents of a register.
• The name (or address) of the register is given in the instruction.
• Registers are used as temporary storage locations; the operand is read directly from the named register.
• For example, the instruction
Move R1, R2 ;Copy content of register R1 into register R2.

Absolute (Direct) Mode


• The operand is in a memory-location.
• The address of memory-location is given explicitly in the instruction.
• The absolute mode can represent global variables in the program.
• For example, the instruction
Move LOC, R2 ;Copy content of memory-location LOC into register R2.

Immediate Mode
• The operand is given explicitly in the instruction.
• For example, the instruction
Move #200, R0 ;Place the value 200 in register R0.
• Clearly, the immediate mode is only used to specify the value of a source-operand.

INDIRECTION AND POINTERS

• Instruction does not give the operand or its address explicitly.


• Instead, the instruction provides information from which the new address of the operand can be
determined.
• This address is called Effective Address (EA) of the operand.


Indirect Mode
• The EA of the operand is the contents of a register(or memory-location).
• The register (or memory-location) that contains the address of an operand is called a Pointer.
• We denote indirection by placing, in parentheses, either
→ the name of the register or
→ the memory address given in the instruction.
E.g: Add (R1),R0 ;The operand is in memory. Register R1 gives the effective-address (B) of the
operand. The data is read from location B and added to contents of register R0.

• To execute the Add instruction in fig 2.11 (a), the processor uses the value which is in register
R1, as the EA of the operand.
• It requests a read operation from the memory to read the contents of location B. The value read is
the desired operand, which the processor adds to the contents of register R0.
• Indirect addressing through a memory-location is also possible as shown in fig 2.11(b). In this
case, the processor first reads the contents of memory-location A, then requests a second read operation
using the value B as an address to obtain the operand.


Program Explanation
• In above program, Register R2 is used as a pointer to the numbers in the list, and the operands
are accessed indirectly through R2.
• The initialization-section of the program loads the counter-value n from memory-location N into
R1 and uses the immediate addressing-mode to place the address value NUM1, which is the address of the
first number in the list, into R2. Then it clears R0 to 0.
• The first two instructions in the loop implement the unspecified instruction block starting at
LOOP.
• The first time through the loop, the instruction Add (R2), R0 fetches the operand at location
NUM1 and adds it to R0.
• The second Add instruction adds 4 to the contents of the pointer R2, so that it will contain the
address value NUM2 when the above instruction is executed in the second pass through the loop.
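
The pointer-based loop described above can be mirrored in Python. This is a sketch only: memory is modelled as a dictionary from byte addresses to word values, the addresses 204 and 208 are hypothetical, and, like the assembly loop, it assumes n is at least 1.

```python
def sum_list(memory, n_address, num1_address):
    """Add n numbers stored in successive 4-byte words starting at NUM1."""
    r1 = memory[n_address]    # Move N,R1       counter
    r2 = num1_address         # Move #NUM1,R2   pointer to the first number
    r0 = 0                    # Clear R0        running sum
    while True:
        r0 += memory[r2]      # Add (R2),R0     operand accessed indirectly
        r2 += 4               # Add #4,R2       point to the next number
        r1 -= 1               # Decrement R1
        if not r1 > 0:        # Branch>0 LOOP
            break
    return r0

memory = {204: 3, 208: 10, 212: 20, 216: 30}
total = sum_list(memory, 204, 208)
```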

INDEXING AND ARRAYS


• A different kind of flexibility for accessing operands is useful in dealing with lists and arrays.

Index mode
• The operation is indicated as X(Ri)
where X=the constant value which defines an offset(also called a displacement).
Ri=the name of the index register which contains address of a new location.
• The effective-address of the operand is given by EA=X+[Ri]
• The contents of the index-register are not changed in the process of generating the effective-
address.
• The constant X may be given either
→ as an explicit number or
→ as a symbolic-name representing a numerical value.

• The figure illustrates two ways of using the Index mode. In fig(a), the index register, R1, contains the
address of a memory-location, and the value X defines an offset (also called a displacement) from this
address to the location where the operand is found.
• To find EA of operand: Eg: Add 20(R1), R2
With [R1] = 1000, EA = 1000 + 20 = 1020
• An alternative use is illustrated in fig(b). Here, the constant X corresponds to a memory address,
and the contents of the index register define the offset to the operand. In either case, the effective-address is
the sum of two values; one is given explicitly in the instruction, and the other is stored in a register.


Base with Index Mode


• Another version of the Index mode uses 2 registers which can be denoted as (Ri, Rj)
• Here, a second register may be used to contain the offset X.
• The second register is usually called the base register.
• The effective-address of the operand is given by EA=[Ri]+[Rj]
• This form of indexed addressing provides more flexibility in accessing operands because both
components of the effective-address can be changed.
Base with Index & Offset Mode
• Another version of the Index mode uses 2 registers plus a constant, which can be denoted as
X(Ri, Rj)
• The effective-address of the operand is given by EA=X+[Ri]+[Rj]
• This added flexibility is useful in accessing multiple components inside each item in a record,
where the beginning of an item is specified by the (Ri, Rj) part of the addressing-mode. In other words, this
mode implements a 3-dimensional array.
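
The three index variants differ only in which terms are summed to form the effective-address. A minimal Python sketch, with hypothetical function names and a dictionary standing in for the register file:

```python
def ea_index(x, regs, ri):
    """X(Ri): EA = X + [Ri]"""
    return x + regs[ri]

def ea_base_index(regs, ri, rj):
    """(Ri,Rj): EA = [Ri] + [Rj]"""
    return regs[ri] + regs[rj]

def ea_base_index_offset(x, regs, ri, rj):
    """X(Ri,Rj): EA = X + [Ri] + [Rj]"""
    return x + regs[ri] + regs[rj]

regs = {"R1": 1200, "R2": 4600}
```

None of these functions modifies regs, matching the rule that the index register is unchanged by address generation.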

RELATIVE MODE
• This is similar to index-mode with one difference:
The effective-address is determined using the PC in place of the general purpose register Ri.
• The operation is indicated as X(PC).
• X(PC) denotes an effective-address of the operand which is X locations above or below the
current contents of PC.
• Since the addressed-location is identified "relative" to the PC, the name Relative mode is
associated with this type of addressing.
• This mode is used commonly in conditional branch instructions.
• An instruction such as
Branch > 0 LOOP ;Causes program execution to go to the branch target location identified by
name LOOP if branch condition is satisfied.


ADDITIONAL ADDRESSING MODES


1) Auto Increment Mode
➢ Effective-address of operand is contents of a register specified in the instruction (Fig: 2.16).
➢ After accessing the operand, the contents of this register are automatically incremented to point to
the next item in a list.
➢ Implicitly, the increment amount is 1 when the items are bytes. In a byte-addressable memory, the
increment is 2 for 16-bit operands and 4 for 32-bit operands.
➢ This mode is denoted as
(Ri)+ ;where Ri=pointer-register.
2) Auto Decrement Mode
➢ The contents of a register specified in the instruction are first automatically decremented and are
then used as the effective-address of the operand.
➢ This mode is denoted as
-(Ri) ;where Ri=pointer-register.
➢ These 2 modes can be used together to implement an important data structure called a stack.
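
The order of "use" and "update" is what distinguishes the two modes. A short Python sketch follows; the function names and the step parameter are assumptions, with step set to the operand size in bytes.

```python
def auto_increment(regs, ri, step=1):
    """(Ri)+ : use [Ri] as the EA, then increment Ri."""
    ea = regs[ri]
    regs[ri] += step
    return ea

def auto_decrement(regs, ri, step=1):
    """-(Ri) : decrement Ri first, then use it as the EA."""
    regs[ri] -= step
    return regs[ri]

regs = {"R5": 1000}
push_address = auto_decrement(regs, "R5", step=4)  # where a pushed word goes
pop_address = auto_increment(regs, "R5", step=4)   # where it is popped from
```

Because the decrement happens before the access and the increment after it, a push followed by a pop touches the same location and restores the pointer, which is the stack behaviour mentioned above.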

ASSEMBLY LANGUAGE
• We generally use symbolic-names to write a program.
• A complete set of symbolic-names and rules for their use constitute an Assembly Language.
• The set of rules for using the mnemonics in the specification of complete instructions and
programs is called the Syntax of the language.
• Programs written in an assembly language can be automatically translated into a sequence of
machine instructions by a program called an Assembler.
• The user program in its original alphanumeric text format is called a Source Program, and the
assembled machine language program is called an Object Program.
For example:
MOVE R0,SUM ;The term MOVE represents OP code for operation performed by instruction.
ADD #5,R3 ;Adds number 5 to contents of register R3 & puts the result back into register R3.

ASSEMBLER DIRECTIVES
• Directives are commands to the assembler concerning the program being
assembled.
• These commands are not translated into machine opcode in the object-program.
• EQU informs the assembler about the value of an identifier (Figure: 2.18).
Ex: SUM EQU 200 ;Informs assembler that the name SUM should be replaced by the value 200.
• ORIGIN tells the assembler about the starting-address of memory-area to place the data block.
Ex: ORIGIN 204 ;Instructs assembler to initiate data-block at memory-locations starting from 204.
• DATAWORD directive tells the assembler to load a value into the location.
Ex: N DATAWORD 100 ;Informs the assembler to load data 100 into the memory-location N(204).

• RESERVE directive is used to reserve a block of memory.
Ex: NUM1 RESERVE 400 ;declares a memory-block of 400 bytes is to be reserved for data.
• END directive tells the assembler that this is the end of the source-program text.
• RETURN directive identifies the point at which execution of the program should be terminated.
• Any statement that results in instructions or data being placed in a memory-location may be given a
label. The label (say N or NUM1) is assigned a value equal to the address of that location.

GENERAL FORMAT OF A STATEMENT


• Most assembly languages require statements in a source program to be written in the form:

Label    Operation    Operand(s)    Comment

1) Label is an optional name associated with the memory-address where the machine language
instruction produced from the statement will be loaded.
2) Operation Field contains the OP-code mnemonic of the desired instruction or assembler directive.
3) Operand Field contains addressing information for accessing one or more operands, depending
on the type of instruction.
4) Comment Field is used for documentation purposes to make program easier to understand.


ASSEMBLY AND EXECUTION OF PROGRAMS

• Programs written in an assembly language are automatically translated into a sequence of
machine instructions by the Assembler.

• Assembler Program
→ replaces all symbols denoting operations & addressing-modes with binary-codes used in machine
instructions.
→ replaces all names and labels with their actual values.
→ assigns addresses to instructions & data blocks, starting at address given in ORIGIN directive
→ inserts constants that may be given in DATAWORD directives.
→ reserves memory-space as requested by RESERVE directives.

• Two Pass Assembler has 2 passes:


1) First Pass: Work out all the addresses of labels.
➢ As the assembler scans through a source-program, it keeps track of all names and the numerical values
that correspond to them in a symbol-table.
2) Second Pass: Generate machine code, substituting values for the labels.
➢ When a name appears a second time in the source-program, it is replaced with its value from the
table.
• The assembler stores the object-program on a magnetic-disk. The object-program must be loaded
into the memory of the computer before it is executed. For this, a Loader Program is used.
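
The two passes can be sketched in Python for a heavily simplified source format. Everything here is an assumption made for illustration: each statement is a (label, operation, operand) tuple, every instruction or DATAWORD occupies 4 bytes, only plain names in the operand field are substituted (no # or parentheses), and the "object program" is just a list of (address, operation, operands) entries rather than binary code.

```python
def first_pass(lines):
    """Build the symbol table: every name and its numerical value."""
    symbols = {}
    address = 0
    for label, op, operand in lines:
        if op == "EQU":
            symbols[label] = int(operand)
            continue
        if op == "ORIGIN":
            address = int(operand)
            continue
        if label:
            symbols[label] = address
        address += int(operand) if op == "RESERVE" else 4
    return symbols

def second_pass(lines, symbols):
    """Emit entries with names replaced by their values from the table."""
    image = []
    address = 0
    for label, op, operand in lines:
        if op == "EQU":
            continue
        if op == "ORIGIN":
            address = int(operand)
            continue
        if op == "RESERVE":
            address += int(operand)
            continue
        fields = [str(symbols.get(f, f)) for f in operand.split(",")]
        image.append((address, op, ",".join(fields)))
        address += 4
    return image

source = [
    ("SUM", "EQU", "200"),
    ("", "ORIGIN", "204"),
    ("N", "DATAWORD", "100"),
    ("NUM1", "RESERVE", "400"),
    ("", "ORIGIN", "100"),
    ("START", "MOVE", "N,R1"),
    ("", "MOVE", "R0,SUM"),
]
symbols = first_pass(source)
image = second_pass(source, symbols)
```

Running the first pass to completion before generating any output is what lets a statement refer to a label that is defined later in the source.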

• Debugger Program is used to help the user find the programming errors.
• Debugger program enables the user
→ to stop execution of the object-program at some points of interest &
→ to examine the contents of various processor-registers and memory-location.

BASIC INPUT/OUTPUT OPERATIONS


• Consider the problem of moving a character-code from the keyboard to the processor (Figure:
2.19). For this transfer, buffer-register DATAIN & a status control flag SIN are used.
• When a key is pressed, the corresponding ASCII code is stored in a DATAIN register associated
with the keyboard.
➢ SIN=1 → When a character is typed in the keyboard. This informs the processor that a valid
character is in DATAIN.
➢ SIN=0 → When the character is transferred to the processor.
• An analogous process takes place when characters are transferred from the processor to the
display. For this transfer, buffer-register DATAOUT & a status control flag SOUT are used.
➢ SOUT=1 → When the display is ready to receive a character.
➢ SOUT=0 → When the character is being transferred to DATAOUT.
• The buffer registers DATAIN and DATAOUT and the status flags SIN and SOUT are part of
circuitry commonly known as a device interface.
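
The handshake between the keyboard interface and the processor can be simulated in Python. This is a sketch under assumptions: the Keyboard class, its tick method (which stands in for the user pressing keys), and read_characters are hypothetical names, and the caller must not request more characters than were typed or the wait loop never ends.

```python
class Keyboard:
    """Device interface with a DATAIN buffer register and a SIN status flag."""

    def __init__(self, typed):
        self._pending = list(typed)
        self._datain = 0
        self.sin = 0

    def tick(self):
        """Simulate a key press if one is pending and the buffer is free."""
        if self.sin == 0 and self._pending:
            self._datain = ord(self._pending.pop(0))
            self.sin = 1            # a valid character is now in DATAIN

    def read_datain(self):
        self.sin = 0                # reading DATAIN clears SIN
        return self._datain

def read_characters(keyboard, count):
    """Processor side: wait for SIN=1, then transfer the character."""
    received = []
    while len(received) < count:
        keyboard.tick()
        if keyboard.sin == 1:       # wait loop tests the status flag
            received.append(chr(keyboard.read_datain()))
    return "".join(received)

text = read_characters(Keyboard("OK"), 2)
```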

MEMORY-MAPPED I/O

• Some address values are used to refer to peripheral device buffer-registers such as DATAIN &
DATAOUT.
• No special instructions are needed to access the contents of the registers; data can be transferred
between these registers and the processor using instructions such as Move, Load or Store.
• For example, contents of the keyboard character buffer DATAIN can be transferred to register
R1 in the processor by the instruction
MoveByte DATAIN,R1

• The MoveByte operation code signifies that the operand size is a byte.
• The Testbit instruction tests the state of one bit in the destination, where the bit position to be
tested is indicated by the first operand.

STACKS
• A stack is a special type of data structure where elements are inserted from one end and elements
are deleted from the same end. This end is called the top of the stack (Figure: 2.14).
• The various operations performed on stack:
1) Insert: An element is inserted from top end. Insertion operation is called push operation.
2) Delete: An element is deleted from top end. Deletion operation is called pop operation.
• A processor-register is used to keep track of the address of the element of the stack that is at the
top at any given time. This register is called the Stack Pointer (SP).
• If we assume a byte-addressable memory with a 32-bit word length,
1) The push operation can be implemented as
Subtract #4, SP
Move NEWITEM, (SP)
2) The pop operation can be implemented as
Move (SP), ITEM
Add #4, SP


• A routine for safe pop and push operations checks that the stack is not empty before a pop and not full before a push.
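
Since the original figure is not reproduced here, the idea can be sketched in Python instead. The class name WordStack and the region bounds 2000 and 1500 are assumptions; the stack grows toward lower addresses in 4-byte steps, as in the push and pop sequences given earlier.

```python
class WordStack:
    """Stack of 32-bit words growing toward lower addresses."""

    def __init__(self, bottom=2000, limit=1500):
        self.memory = {}
        self.bottom = bottom     # address of the first (bottom) element
        self.limit = limit       # lowest address the stack may occupy
        self.sp = bottom + 4     # empty stack: SP is just past the bottom

    def push(self, item):
        if self.sp <= self.limit:        # safe push: check for a full stack
            raise OverflowError("stack full")
        self.sp -= 4                     # Subtract #4,SP
        self.memory[self.sp] = item      # Move NEWITEM,(SP)

    def pop(self):
        if self.sp > self.bottom:        # safe pop: check for an empty stack
            raise IndexError("stack empty")
        item = self.memory[self.sp]      # Move (SP),ITEM
        self.sp += 4                     # Add #4,SP
        return item
```

The checks are made before SP is changed, so a failed operation leaves the stack exactly as it was.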


QUEUE
• Data are stored in and retrieved from a queue on a FIFO basis.
• Difference between stack and queue?
1) One end of the stack is fixed while the other end rises and falls as data are pushed and popped.
In a queue, both ends move as data are added at the back and removed from the front.
2) In stack, a single pointer is needed to keep track of top of the stack at any given time.
In queue, two pointers are needed to keep track of both the front and the end, for removal and insertion
respectively.
3) Without further control, a queue would continuously move through the memory of a computer in
the direction of higher addresses. One way to limit the queue to a fixed region in memory is to use a
circular buffer.
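
A circular buffer keeps the queue inside a fixed region by wrapping both pointers back to the start when they reach the end. The following Python sketch is one common way to do it; the class name and the use of a separate item count to tell a full queue from an empty one are design assumptions.

```python
class CircularQueue:
    """FIFO queue confined to a fixed-size region."""

    def __init__(self, capacity):
        self.buffer = [None] * capacity
        self.front = 0      # index of the next item to remove
        self.back = 0       # index where the next item is inserted
        self.count = 0      # number of items currently stored

    def enqueue(self, item):
        if self.count == len(self.buffer):
            raise OverflowError("queue full")
        self.buffer[self.back] = item
        self.back = (self.back + 1) % len(self.buffer)   # wrap around
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue empty")
        item = self.buffer[self.front]
        self.front = (self.front + 1) % len(self.buffer)  # wrap around
        self.count -= 1
        return item
```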

SUBROUTINES
• A subtask consisting of a set of instructions which is executed many times is called a
Subroutine.
• A Call instruction causes a branch to the subroutine (Figure: 2.16).
• At the end of the subroutine, a Return instruction is executed.
• The program resumes execution at the instruction immediately following the subroutine call.
• The way in which a computer makes it possible to call and return from subroutines is referred to
as its Subroutine Linkage method.
• The simplest subroutine linkage method is to save the return-address in a specific location, which
may be a register dedicated to this function. Such a register is called the Link Register.
• When the subroutine completes its task, the Return instruction returns to the calling-program by
branching indirectly through the link-register.
• The Call Instruction is a special branch instruction that performs the following operations:
→ Store the contents of PC into link-register.
→ Branch to the target-address specified by the instruction.
• The Return Instruction is a special branch instruction that performs the operation:
→ Branch to the address contained in the link-register.


SUBROUTINE NESTING AND THE PROCESSOR STACK


• Subroutine Nesting means one subroutine calls another subroutine.
• In this case, the return-address of the second call is also stored in the link-register, destroying its
previous contents.
• Hence, it is essential to save the contents of the link-register in some other location before calling
another subroutine. Otherwise, the return-address of the first subroutine will be lost.
• Subroutine nesting can be carried out to any depth. Eventually, the last subroutine called
completes its computations and returns to the subroutine that called it.
• The return-address needed for this first return is the last one generated in the nested call
sequence. That is, return-addresses are generated and used in a LIFO order.
• This suggests that the return-addresses associated with subroutine calls should be pushed onto a
stack. A particular register is designated as the SP(Stack Pointer) to be used in this operation.
• SP is used to point to the processor-stack.
• Call instruction pushes the contents of the PC onto the processor-stack.
Return instruction pops the return-address from the processor-stack into the PC.
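
Nested calls and returns through the processor-stack can be traced with a small Python sketch. The class name and addresses are hypothetical, and the model assumes 4-byte instructions with pc holding the address of the Call instruction itself, so the saved return-address is pc + 4.

```python
class Processor:
    """Tracks only the PC and a stack of return-addresses."""

    def __init__(self):
        self.pc = 0
        self.stack = []

    def call(self, target, instruction_size=4):
        # Push the address of the instruction that follows the Call.
        self.stack.append(self.pc + instruction_size)
        self.pc = target

    def ret(self):
        # Pop the most recently saved return-address into the PC.
        self.pc = self.stack.pop()

p = Processor()
p.pc = 200
p.call(1000)          # main program calls SUB1
p.pc = 1040           # SUB1 reaches a Call instruction of its own
p.call(2000)          # SUB1 calls SUB2
p.ret()               # SUB2 returns into SUB1
after_inner_return = p.pc
p.ret()               # SUB1 returns into the main program
after_outer_return = p.pc
```

The last address pushed is the first one popped, which is the LIFO order described above.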

PARAMETER PASSING
• The exchange of information between a calling-program and a subroutine is referred to as
Parameter Passing (Figure: 2.25).
• The parameters may be placed in registers or in memory-location, where they can be accessed by
the subroutine.
• Alternatively, parameters may be placed on the processor-stack used for saving the return-
address.
• Following is a program for adding a list of numbers using subroutine with the parameters passed
through registers.


ADDITIONAL INSTRUCTIONS:

LOGIC INSTRUCTIONS
• Logic operations such as AND, OR, and NOT are applied to individual bits.
• These are the basic building blocks of digital-circuits.
• It is also useful to be able to perform logic operations in software, which is done using
instructions that apply these operations to all bits of a word or byte independently and in parallel.
• For example, the instruction
Not dst
complements all bits contained in the destination operand, changing 0s to 1s and 1s to 0s.

SHIFT AND ROTATE INSTRUCTIONS


• There are many applications that require the bits of an operand to be shifted right or left some
specified number of bit positions.
• The details of how the shifts are performed depend on whether the operand is a signed number or
some more general binary-coded information.
• For general operands, we use a logical shift.
For a number, we use an arithmetic shift, which preserves the sign of the number.

LOGICAL SHIFTS
• Two logical shift instructions are
1) Shifting left (LShiftL) &
2) Shifting right (LShiftR).
• These instructions shift an operand over a number of bit positions specified in a count operand
contained in the instruction.


ROTATE OPERATIONS
• In shift operations, the bits shifted out of the operand are lost, except for the last bit shifted out
which is retained in the Carry-flag C.
• To preserve all bits, a set of rotate instructions can be used.
• They move the bits that are shifted out of one end of the operand back into the other end.
• Two versions of both the left and right rotate instructions are usually provided. In one version,
the bits of the operand are simply rotated.
In the other version, the rotation includes the C flag.
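
The difference between shifting, rotating, and rotating through the carry can be seen on a small operand. The Python sketch below uses an 8-bit width for readability; the function names are hypothetical, and the shift functions assume a count between 1 and the operand width.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def lshift_l(value, count):
    """Logical shift left; returns (result, C) with C = last bit shifted out."""
    shifted = value << count
    carry = (shifted >> WIDTH) & 1
    return shifted & MASK, carry

def lshift_r(value, count):
    """Logical shift right; returns (result, C) with C = last bit shifted out."""
    carry = (value >> (count - 1)) & 1
    return value >> count, carry

def rotate_l(value, count):
    """Rotate left without the carry: bits leaving the left re-enter on the right."""
    count %= WIDTH
    return ((value << count) | (value >> (WIDTH - count))) & MASK

def rotate_l_carry(value, carry, count):
    """Rotate left through the C flag, one bit position at a time."""
    for _ in range(count):
        out = (value >> (WIDTH - 1)) & 1
        value = ((value << 1) | carry) & MASK
        carry = out
    return value, carry
```

Shifting 10110011 left by two positions loses the two leading bits, while rotating it keeps them at the right-hand end.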


Problem 1:
Write a program that can evaluate the expression A*B+C*D in a single-accumulator processor. Assume
that the processor has Load, Store, Multiply, and Add instructions and that all values fit in the accumulator.
Solution:
A program for the expression is:
Load A
Multiply B
Store RESULT
Load C
Multiply D
Add RESULT
Store RESULT

Problem 2:
Registers R1 and R2 of a computer contain the decimal values 1200 and 4600. What is the effective-
address of the memory operand in each of the following instructions?
(a) Load 20(R1), R5
(b) Move #3000,R5
(c) Store R5,30(R1,R2)
(d) Add -(R2),R5
(e) Subtract (R1)+,R5
Solution:
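(a) EA = [R1]+Offset = 1200+20 = 1220
(b) Immediate mode: the operand 3000 is part of the instruction, so there is no memory operand address.
(c) EA = [R1]+[R2]+Offset = 1200+4600+30 = 5830
(d) EA = [R2]-1 = 4599 (assuming a decrement of 1)
(e) EA = [R1] = 1200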

Problem 3:
Registers R1 and R2 of a computer contain the decimal values 2900 and 3300. What is the effective-
address of the memory operand in each of the following instructions?
(a) Load R1,55(R2)
(b) Move #2000,R7
(c) Store 95(R1,R2),R5
(d) Add (R1)+,R5
(e) Subtract -(R2),R5
Solution:
a) Load R1,55(R2) → This is indexed addressing mode. So EA = 55+[R2] = 55+3300 = 3355.
b) Move #2000,R7 → This is immediate addressing mode. The operand 2000 is part of the instruction, so
there is no memory operand address.
c) Store 95(R1,R2),R5 → This is a variation of indexed addressing mode, in which contents of 2
registers are added with the offset or index to generate EA. So, EA = 95+[R1]+[R2] = 95+2900+3300 = 6295.
d) Add (R1)+,R5 → This is Autoincrement mode. Contents of R1 are the EA, so 2900 is the EA.
e) Subtract -(R2),R5 → This is Autodecrement mode. Here, R2 is decremented by 4 (assuming a
32-bit processor) to generate the EA, so EA = 3300-4 = 3296.

Problem 4:
Given a binary pattern in some memory-location, is it possible to tell whether this pattern represents a
machine instruction or a number?
Solution:
No; any binary pattern can be interpreted as a number or as an instruction.


Problem 5:
Both of the following statements cause the value 300 to be stored in location 1000, but at different times.
ORIGIN 1000
DATAWORD 300
and
Move #300,1000
Explain the difference.
Solution:
The assembler directives ORIGIN and DATAWORD cause the object program memory image
constructed by the assembler to indicate that 300 is to be placed at memory word location 1000 at the time
the program is loaded into memory prior to execution.
The Move instruction places 300 into memory word location 1000 when the instruction is executed as
part of a program.

Problem 6:
Register R5 is used in a program to point to the top of a stack. Write a sequence of instructions using the
Index, Autoincrement, and Autodecrement addressing modes to perform each of the following tasks:
(a) Pop the top two items off the stack, add them, and then push the result onto the stack.
(b) Copy the fifth item from the top into register R3.
(c) Remove the top ten items from the stack.
Solution:
(a) Move (R5)+,R0
Add (R5)+,R0
Move R0,-(R5)
(b) Move 16(R5),R3
(c) Add #40,R5
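
The three instruction sequences can be traced with a short simulation. This is only a sketch under assumptions that the problem does not state explicitly: 4-byte stack items (which is what the offsets 16 and 40 imply), a stack that grows toward lower addresses, and made-up starting contents.

```python
WORD = 4  # assumed item size in bytes

def run_stack_demo():
    """Trace the Problem 6 sequences on a made-up stack of 12 items."""
    # Items 1..12; the top of the stack (item 1) is at the lowest address, 1000.
    mem = {1000 + WORD * i: i + 1 for i in range(12)}
    r = {"R5": 1000, "R0": 0, "R3": 0}

    # (a) Move (R5)+,R0
    r["R0"] = mem[r["R5"]]
    r["R5"] += WORD
    #     Add (R5)+,R0
    r["R0"] += mem[r["R5"]]
    r["R5"] += WORD
    #     Move R0,-(R5)
    r["R5"] -= WORD
    mem[r["R5"]] = r["R0"]

    # (b) Move 16(R5),R3  -> fifth item from the top
    r["R3"] = mem[r["R5"] + 4 * WORD]

    # (c) Add #40,R5      -> discard the top ten items
    r["R5"] += 10 * WORD
    return r, mem

if __name__ == "__main__":
    regs, memory = run_stack_demo()
    print(regs)
```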

Problem 7:
Consider the following possibilities for saving the return address of a subroutine:
(a) In the processor register.
(b) In a memory-location associated with the call, so that a different location is used when the
subroutine is called from different places
(c) On a stack.
Which of these possibilities supports subroutine nesting and which supports subroutine recursion(that is,
a subroutine that calls itself)?
Solution:
(a) Neither nesting nor recursion is supported.
(b) Nesting is supported, because different Call instructions will save the return address at different
memory-locations. Recursion is not supported.
(c) Both nesting and recursion are supported.

MODULE 2: INPUT/OUTPUT ORGANIZATION

ACCESSING I/O-DEVICES
• A single bus-structure can be used for connecting I/O-devices to a computer (Figure 7.1).
• Each I/O device is assigned a unique set of addresses.
• Bus consists of 3 sets of lines to carry address, data & control signals.
• When processor places an address on address-lines, the intended-device responds to the command.
• The processor requests either a read or write-operation.
• The requested-data are transferred over the data-lines.

• There are 2 ways to deal with I/O-devices: 1) Memory-mapped I/O & 2) I/O-mapped I/O.
1) Memory-Mapped I/O
➢ Memory and I/O-devices share a common address-space.
➢ Any data-transfer instruction (like Move, Load) can be used to exchange information.
➢ For example,
Move DATAIN, R0; This instruction copies the contents of location DATAIN to register R0.
Here, DATAIN → address of the input-buffer of the keyboard.
2) I/O-Mapped I/O
➢ Memory and I/O address-spaces are different.
➢ Special instructions named IN and OUT are used for data-transfer.
➢ Advantage of separate I/O space: I/O-devices deal with fewer address-lines.
I/O Interface for an Input Device
1) Address Decoder: enables the device to recognize its address when this address
appears on the address-lines (Figure 7.2).
2) Status Register: contains information relevant to operation of I/O-device.
3) Data Register: holds data being transferred to or from processor. There are 2 types:
i) DATAIN → Input-buffer associated with keyboard.
ii) DATAOUT → Output data buffer of a display/printer.

MECHANISMS USED FOR INTERFACING I/O-DEVICES


1) Program Controlled I/O
• Processor repeatedly checks a status-flag to achieve the required synchronization between processor & I/O
device. (We say that the processor polls the device).
• Main drawback:
The processor wastes time in checking status of device before actual data-transfer takes place.
2) Interrupt I/O
• I/O-device initiates the action instead of the processor.
• I/O-device sends an INTR signal over bus whenever it is ready for a data-transfer operation.
• Like this, required synchronization is done between processor & I/O device.
3) Direct Memory Access (DMA)
• Device-interface transfers data directly to/from the memory without continuous involvement by the
processor.
• DMA is a technique used for high speed I/O-devices.
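
Program-controlled I/O can be illustrated with a toy polling loop. The KeyboardInterface class below is a hypothetical stand-in for a device interface with a SIN status flag and a DATAIN register; the rule that a character arrives every third tick is an arbitrary assumption chosen only to make the wasted status checks visible.

```python
class KeyboardInterface:
    """Hypothetical input device with a status flag (SIN) and a data register (DATAIN)."""

    def __init__(self, chars, ready_every=3):
        self._chars = list(chars)
        self._ticks = 0
        self._ready_every = ready_every
        self.SIN = 0
        self.DATAIN = None

    def tick(self):
        """Simulate time passing; a new character arrives every `ready_every` ticks."""
        self._ticks += 1
        if self.SIN == 0 and self._chars and self._ticks % self._ready_every == 0:
            self.DATAIN = self._chars.pop(0)
            self.SIN = 1

    def read(self):
        """Reading DATAIN clears the status flag."""
        self.SIN = 0
        return self.DATAIN


def read_by_polling(dev, n):
    """Poll the status flag until n characters have been read.

    Returns the text and the number of status checks that found the device not ready.
    The caller must make sure the device really has n characters to deliver.
    """
    received, wasted_checks = [], 0
    while len(received) < n:
        dev.tick()
        if dev.SIN == 1:
            received.append(dev.read())
        else:
            wasted_checks += 1   # processor time spent only on checking status
    return "".join(received), wasted_checks


if __name__ == "__main__":
    print(read_by_polling(KeyboardInterface("hi"), 2))
```

The count of wasted checks is the drawback named above: the processor does no useful work while it waits.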
INTERRUPTS
• There are many situations where other tasks can be performed while waiting for an I/O device to
become ready.
• A hardware signal called an Interrupt will alert the processor when an I/O device becomes ready.
• Interrupt-signal is sent on the interrupt-request line.
• The processor can be performing its own task without the need to continuously check the I/O-device.
• The routine executed in response to an interrupt-request is called ISR.
• The processor must inform the device that its request has been recognized by sending INTA signal.
(INTR → Interrupt Request, INTA → Interrupt Acknowledge, ISR → Interrupt Service Routine)
• For example, consider COMPUTE and PRINT routines (Figure 3.6).

• The processor first completes the execution of instruction i.


• Then, processor loads the PC with the address of the first instruction of the ISR.
• After the execution of ISR, the processor has to come back to instruction i+1.
• Therefore, when an interrupt occurs, the current content of PC is put in a temporary storage location.
• A return at the end of ISR reloads the PC from that temporary storage location.
• This causes the execution to resume at instruction i+1.
• When processor is handling interrupts, it must inform device that its request has been recognized.
• This may be accomplished by INTA signal.
• The task of saving and restoring the information can be done automatically by the processor.
• The processor saves only the contents of PC & Status register.
• Saving registers also increases the Interrupt Latency.
• Interrupt Latency is a delay between
→ time an interrupt-request is received and
→ start of the execution of the ISR.
• Generally, a long interrupt latency is unacceptable.
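
The save-and-restore of the PC described above can be sketched as follows. The function name run and the trace labels are illustrative assumptions; a real processor would also save the status register.

```python
def run(program_len, interrupt_after, isr):
    """Execute instructions i0..i(program_len-1).

    An interrupt request is assumed to arrive while instruction
    `interrupt_after` is executing; it is serviced once that
    instruction completes.
    """
    trace = []
    pc = 0
    while pc < program_len:
        current = pc
        trace.append(f"i{current}")   # complete execution of instruction i
        pc += 1                       # PC now points at instruction i+1
        if current == interrupt_after:
            saved_pc = pc             # PC goes to temporary storage
            trace.extend(isr)         # processor executes the ISR
            pc = saved_pc             # return from ISR reloads the PC
    return trace


if __name__ == "__main__":
    print(run(3, 1, ["isr0", "isr1"]))
```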

Difference between Subroutine & ISR


Subroutine | ISR
A subroutine performs a function required by the program from which it is called. | An ISR may not have anything in common with the program being executed at the time INTR is received.
A subroutine is just a linkage of 2 or more functions related to each other. | An interrupt is a mechanism for coordinating I/O transfers.
INTERRUPT HARDWARE
• Most computers have several I/O devices that can request an interrupt.
• A single interrupt-request (IR) line may be used to serve n devices (Figure 4.6).
• All devices are connected to IR line via switches to ground.
• To request an interrupt, a device closes its associated switch.
• Thus, if all IR signals are inactive, the voltage on the IR line will be equal to Vdd.
• When a device requests an interrupt, the voltage on the line drops to 0.
• This causes the INTR received by the processor to go to 1.
• The value of INTR is the logical OR of the requests from individual devices.
INTR = INTR1 + INTR2 + ... + INTRn
• Special gates known as open-collector or open-drain gates are used to drive the INTR line.
• The output of an open-collector gate is equivalent to a switch to ground that is
→ open when the gate's input is in the "0" state and
→ closed when the gate's input is in the "1" state.
• Resistor R is called a Pull-up Resistor because
it pulls the line voltage up to the high-voltage state when the switches are open.

ENABLING & DISABLING INTERRUPTS


• All computers fundamentally should be able to enable and disable interrupts as desired.
• If the INTR signal is still active when the ISR starts, the same request can interrupt the processor again and again, which leads to an infinite loop.
• There are 3 mechanisms to solve the problem of infinite loop:
1) Processor should ignore the interrupts until the execution of the first instruction of the ISR has completed.
2) Processor should automatically disable interrupts before starting the execution of the ISR.
3) Processor has a special INTR line for which the interrupt-handling circuit responds only to the
leading edge of the signal. Such a line is called edge-triggered.
• Sequence of events involved in handling an interrupt-request:
1) The device raises an interrupt-request.
2) The processor interrupts the program currently being executed.
3) Interrupts are disabled by changing the control bits in the processor status register (PS).
4) The device is informed that its request has been recognized.
In response, the device deactivates the interrupt-request signal.
5) The action requested by the interrupt is performed by the interrupt-service routine.
6) Interrupts are enabled and execution of the interrupted program is resumed.

HANDLING MULTIPLE DEVICES
• While handling multiple devices, the issues concerned are:
1) How can the processor recognize the device requesting an interrupt?
2) How can the processor obtain the starting address of the appropriate ISR?
3) Should a device be allowed to interrupt the processor while another interrupt is being
serviced?
4) How should 2 or more simultaneous interrupt-requests be handled?

POLLING
• Information needed to determine whether device is requesting interrupt is available in status-register
• Following condition-codes are used:
➢ DIRQ → Interrupt-request for display.
➢ KIRQ → Interrupt-request for keyboard.
➢ KEN → Keyboard enable.
➢ DEN → Display enable.
➢ SIN, SOUT → status flags.
• For an input device, SIN status flag is used.
SIN = 1 → when a character is entered at the keyboard.
SIN = 0 → when the character is read by processor.
IRQ = 1 → when a device raises an interrupt-request (Figure 4.3).
• Simplest way to identify interrupting-device is to have ISR poll all devices connected to bus.
• The first device encountered with its IRQ bit set is serviced.
• After servicing first device, next requests may be serviced.
• Advantage: Simple & easy to implement.
Disadvantage: More time spent polling IRQ bits of all devices.

VECTORED INTERRUPTS
• A device requesting an interrupt identifies itself by sending a special-code to processor over bus.
• Then, the processor starts executing the ISR.
• The special-code indicates starting-address of ISR.
• The special-code length ranges from 4 to 8 bits.
• The location pointed to by the interrupting-device is used to store the starting address of the ISR.
• The starting address of the ISR is called the interrupt vector.
• Processor
→ loads interrupt-vector into PC &
→ executes appropriate ISR.
• When processor is ready to receive interrupt-vector code, it activates INTA line.
• Then, I/O-device responds by sending its interrupt-vector code & turning off the INTR signal.
• The interrupt vector also includes a new value for the Processor Status Register.
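
A vectored interrupt can be pictured as a table lookup. In the sketch below, the vector codes, memory locations, ISR addresses and function names are all made-up values for illustration; the new Processor Status value mentioned above is left out to keep the example short.

```python
# Hypothetical memory locations that hold interrupt vectors (ISR starting addresses).
VECTOR_MEMORY = {0x10: 0x2000, 0x14: 0x3000}

def keyboard_isr():
    return "keyboard serviced"

def display_isr():
    return "display serviced"

# Hypothetical map from ISR starting address to the routine stored there.
ISR_AT = {0x2000: keyboard_isr, 0x3000: display_isr}

def handle_vectored_interrupt(vector_code):
    """The device sends `vector_code`; the processor loads the vector into PC and runs the ISR."""
    pc = VECTOR_MEMORY[vector_code]   # interrupt vector -> PC
    result = ISR_AT[pc]()             # execute the appropriate ISR
    return pc, result

if __name__ == "__main__":
    print(handle_vectored_interrupt(0x14))
```

No polling of status registers is needed here, because the device identifies itself through the code it sends.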

CONTROLLING DEVICE REQUESTS


• Following condition-codes are used:
➢ KEN → Keyboard Interrupt Enable.
➢ DEN → Display Interrupt Enable.
➢ KIRQ/DIRQ → Keyboard/Display unit requesting an interrupt.
• There are 2 independent methods for controlling interrupt-requests. (IE → interrupt-enable).
1) At Device-end
IE bit in a control-register determines whether device is allowed to generate an interrupt-request.
2) At Processor-end, interrupt-request is determined by
→ IE bit in the PS register or
→ Priority structure

INTERRUPT NESTING
• A multiple-priority scheme is implemented by using separate INTR & INTA lines for each device.
• Each INTR line is assigned a different priority-level (Figure 4.7).
• Priority-level of processor is the priority of program that is currently being executed.
• Processor accepts interrupts only from devices that have higher-priority than its own.
• At the time of execution of ISR for some device, priority of processor is raised to that of the device.
• Thus, interrupts from devices at the same level of priority or lower are disabled.
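
The priority rule for nesting can be sketched as a recursive function. The dictionary cpu and the function run_isr are hypothetical names; requests that are masked are simply logged here, whereas real hardware would keep them pending until the priority drops.

```python
def run_isr(device_priority, cpu, pending_during_isr=()):
    """Try to service a request at `device_priority`.

    `cpu` holds the processor's current priority and a log of events.
    `pending_during_isr` lists priorities of requests that arrive while this ISR runs.
    """
    if device_priority <= cpu["priority"]:
        # Same or lower priority than the processor: request is not accepted now.
        cpu["log"].append(f"ignored level {device_priority}")
        return False

    saved = cpu["priority"]
    cpu["priority"] = device_priority          # raise processor priority to the device's level
    cpu["log"].append(f"enter level {device_priority}")
    for p in pending_during_isr:               # nested requests during this ISR
        run_isr(p, cpu)
    cpu["log"].append(f"exit level {device_priority}")
    cpu["priority"] = saved                    # restore the previous priority on return
    return True


if __name__ == "__main__":
    cpu = {"priority": 0, "log": []}
    run_isr(2, cpu, pending_during_isr=(1, 3))
    print(cpu["log"])
```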
Privileged Instruction
• Processor's priority is encoded in a few bits of PS word. (PS → Processor-Status).
• Encoded-bits can be changed by Privileged Instructions that write into PS.
• Privileged-instructions can be executed only while processor is running in Supervisor Mode.
• Processor is in supervisor-mode only when executing operating-system routines.
Privileged Exception
• User program cannot
→ accidentally or intentionally change the priority of the processor &
→ disrupt the system-operation.
• An attempt to execute a privileged-instruction while in user-mode leads to a Privileged Exception.

SIMULTANEOUS REQUESTS
• The processor must have some mechanisms to decide which request to service when simultaneous
requests arrive.
• INTR line is common to all devices (Figure 4.8a).
• INTA line is connected in a daisy-chain fashion.
• INTA signal propagates serially through devices.
• When several devices raise an interrupt-request, INTR line is activated.
• Processor responds by setting INTA line to 1. This signal is received by device 1.
• Device-1 passes signal on to device 2 only if it does not require any service.
• If device-1 has a pending-request for interrupt, the device-1
→ blocks INTA signal &
→ proceeds to put its identifying-code on data-lines.
• Device that is electrically closest to processor has highest priority.
• Advantage: It requires fewer wires than the individual connections.
Arrangement of Priority Groups
• Here, the devices are organized in groups & each group is connected at a different priority level.
• Within a group, devices are connected in a daisy chain. (Figure 4.8b).
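
The daisy-chain rule can be captured in a few lines. The function name daisy_chain_grant and the list representation are assumptions for illustration; index 0 stands for the device electrically closest to the processor.

```python
def daisy_chain_grant(pending):
    """Return the index of the device that keeps the INTA signal, or None.

    pending[i] is True if device i has a pending interrupt-request.
    INTA propagates from device 0 onward; the first requesting device blocks it.
    """
    for index, wants_service in enumerate(pending):
        if wants_service:
            return index      # this device blocks INTA and puts its code on the data-lines
        # otherwise the device passes INTA on to the next device
    return None


if __name__ == "__main__":
    print(daisy_chain_grant([False, True, True]))
```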

DIRECT MEMORY ACCESS (DMA)
• The transfer of a block of data directly between an external device & main-memory without continuous
involvement by processor is called DMA.
• DMA controller
→ is a control circuit that performs DMA transfers (Figure 8.13).
→ is a part of the I/O device interface.
→ performs the functions that would normally be carried out by processor.
• While a DMA transfer is taking place, the processor can be used to execute another program.

• DMA interface has three registers (Figure 8.12):


1) First register is used for storing starting-address.
2) Second register is used for storing word-count.
3) Third register contains status- & control-flags.

• The R/W bit determines direction of transfer.


If R/W=1, controller performs a read-operation (i.e. it transfers data from memory to I/O),
Otherwise, controller performs a write-operation (i.e. it transfers data from I/O to memory).
• If Done=1, the controller
→ has completed transferring a block of data and
→ is ready to receive another command.
(IE → Interrupt Enable).
• If IE=1, controller raises an interrupt after it has completed transferring a block of data.
• If IRQ=1, controller requests an interrupt.
• Requests by DMA devices for using the bus are always given higher priority than processor requests.
• There are 2 ways in which the DMA operation can be carried out:
1) Processor originates most memory-access cycles.
➢ DMA controller is said to "steal" memory cycles from processor.
➢ Hence, this technique is usually called Cycle Stealing.
2) DMA controller is given exclusive access to main-memory to transfer a block of data without
any interruption. This is known as Block Mode (or burst mode).
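
The registers and flags of a DMA interface can be modelled with a small class. This is a sketch only: the class name, the word-addressed list used as memory, and the single transfer method (which behaves like block mode) are assumptions, not a description of any particular controller.

```python
class DMAController:
    """Toy DMA controller with starting-address, word-count and status/control registers."""

    def __init__(self):
        self.start_address = 0
        self.word_count = 0
        self.rw = 0      # 1: read (memory -> I/O), 0: write (I/O -> memory)
        self.ie = 0      # interrupt enable
        self.done = 0
        self.irq = 0

    def transfer(self, memory, device_buffer):
        """Move `word_count` words without involving the processor (block mode)."""
        for i in range(self.word_count):
            address = self.start_address + i
            if self.rw == 1:
                device_buffer.append(memory[address])   # memory -> I/O device
            else:
                memory[address] = device_buffer[i]      # I/O device -> memory
        self.done = 1
        if self.ie:
            self.irq = 1   # raise an interrupt after the block is transferred


if __name__ == "__main__":
    memory = [0] * 8
    dma = DMAController()
    dma.start_address, dma.word_count, dma.rw, dma.ie = 2, 3, 0, 1
    dma.transfer(memory, [7, 8, 9])
    print(memory, dma.done, dma.irq)
```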

MODULE 4: MEMORY SYSTEM

BASIC CONCEPTS

• Maximum size of memory that can be used in any computer is determined by the addressing scheme.
• If MAR is k-bits long, then
→ memory may contain up to 2^k addressable-locations.
• If MDR is n-bits long, then
→ n-bits of data are transferred between the memory and processor.
• The data-transfer takes place over the processor-bus (Figure 8.1).
• The processor-bus has
1) Address-Line
2) Data-line &
3) Control-Line (R/W', MFC – Memory Function Completed).
• The Control-Line is used for coordinating data-transfer.
• The processor reads the data from the memory by
→ loading the address of the required memory-location into MAR and
→ setting the R/W' line to 1.
• The memory responds by
→ placing the data from the addressed-location onto the data-lines and
→ confirming this action by asserting MFC signal.
• Upon receipt of MFC signal, the processor loads the data from the data-lines into MDR.
• The processor writes the data into the memory-location by
→ loading the address of this location into MAR,
→ loading the data into MDR &
→ setting the R/W' line to 0.
• Memory Access Time: It is the time that elapses between
→ initiation of an operation &
→ completion of that operation.
• Memory Cycle Time: It is the minimum time delay required between the initiation
of two successive memory-operations.
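
The MAR/MDR handshake can be sketched with two small classes. The class and method names are hypothetical, the memory is modelled as a list of 2^k words of n bits each, and MFC is returned immediately, so timing (access time, cycle time) is not modelled.

```python
class Memory:
    """Hypothetical memory with a k-bit address and n-bit words."""

    def __init__(self, k, n):
        self.size = 2 ** k                 # up to 2^k addressable locations
        self.word_mask = (1 << n) - 1      # n bits move per transfer
        self.cells = [0] * self.size

    def access(self, address, rw, data=0):
        """One memory operation: rw=1 is Read, rw=0 is Write. Returns (data_lines, mfc)."""
        if rw == 1:
            return self.cells[address], 1
        self.cells[address] = data & self.word_mask
        return self.cells[address], 1


class Processor:
    def __init__(self, memory):
        self.memory = memory
        self.MAR = 0
        self.MDR = 0

    def read(self, address):
        self.MAR = address                                   # address into MAR
        data, mfc = self.memory.access(self.MAR, rw=1)       # R/W' = 1
        if mfc:                                              # wait for MFC
            self.MDR = data                                  # data-lines into MDR
        return self.MDR

    def write(self, address, value):
        self.MAR = address                                   # address into MAR
        self.MDR = value                                     # data into MDR
        self.memory.access(self.MAR, rw=0, data=self.MDR)    # R/W' = 0


if __name__ == "__main__":
    mem = Memory(k=4, n=8)
    cpu = Processor(mem)
    cpu.write(5, 0xAB)
    print(mem.size, hex(cpu.read(5)))
```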

RAM (Random Access Memory)


• In RAM, any location can be accessed for a Read/Write-operation in a fixed amount of time that is
independent of the location's address.
Cache Memory
➢ It is a small, fast memory that is inserted between
→ larger slower main-memory and
→ processor.
➢ It holds the currently active segments of a program and their data.
Virtual Memory
➢ The address generated by the processor is referred to as a virtual/logical address.
➢ The virtual-address-space is mapped onto the physical-memory where data are actually stored.
➢ The mapping-function is implemented by MMU. (MMU = memory management unit).
➢ Only the active portion of the address-space is mapped into locations in the physical-memory.
➢ The remaining virtual-addresses are mapped onto the bulk storage devices such as magnetic disk.
➢ As the active portion of the virtual-address-space changes during program execution, the MMU
→ changes the mapping-function &
→ transfers the data between disk and memory.
➢ During every memory-cycle, MMU determines whether the addressed-page is in the memory.
If the page is in the memory, then the proper word is accessed and execution proceeds.
Otherwise, a page containing desired word is transferred from disk to memory.

• Memory can be classified as follows:


1) RAM which can be further classified as follows:
i) Static RAM
ii) Dynamic RAM (DRAM) which can be further classified as synchronous & asynchronous DRAM.
2) ROM which can be further classified as follows:
i) PROM
ii) EPROM
iii) EEPROM &
iv) Flash Memory which can be further classified as Flash Cards & Flash Drives.

SEMICONDUCTOR RAM MEMORIES


INTERNAL ORGANIZATION OF MEMORY-CHIPS
• Memory-cells are organized in the form of an array (Figure 8.2).
• Each cell is capable of storing 1-bit of information.
• Each row of cells forms a memory-word.
• All cells of a row are connected to a common line called the Word-Line.
• The cells in each column are connected to a Sense/Write circuit by two bit-lines.
• The Sense/Write circuits are connected to data-input or output lines of the chip.
• During a write-operation, the Sense/Write circuits
→ receive input information &
→ store input information in the cells of the selected word.
• The data-input and data-output of each Sense/Write circuit are connected to a single bidirectional data-line.
• Data-line can be connected to a data-bus of the computer.
• Following 2 control lines are also used:
1) R/W' → Specifies the required operation.
2) CS' → Chip Select input selects a given chip in the multi-chip memory-system.

STATIC RAM (OR MEMORY)


• Memories that consist of circuits capable of retaining their state as long as power is applied are
known as static memories.

• Two inverters are cross connected to form a latch (Figure 8.4).


• The latch is connected to two bit-lines by transistors T1 and T2.
• The transistors act as switches that can be opened/closed under the control of the word-line.
• When the word-line is at ground level, the transistors are turned off and the latch retains its
state.
Read Operation
• To read the state of the cell, the word-line is activated to close switches T1 and T2.
• If the cell is in state 1, the signal on bit-line b is high and the signal on the bit-line b' is low.
• Thus, b and b' are complements of each other.
• Sense/Write circuit
→ monitors the state of b & b' and
→ sets the output accordingly.
Write Operation
• The state of the cell is set by
→ placing the appropriate value on bit-line b and its complement on b' and
→ then activating the word-line. This forces the cell into the corresponding state.
• The required signal on the bit-lines is generated by Sense/Write circuit.

CMOS Cell
• Transistor pairs (T3, T5) and (T4, T6) form the inverters in the latch (Figure 8.5).
• In state 1, the voltage at point X is kept high by having T3 and T6 ON while T4 and T5 are OFF.
• Thus, if T1 and T2 are turned ON (closed), bit-lines b and b' will have high and low signals
respectively.
• Advantages:
1) It has low power consumption because the current flows in the cell only when the cell is
active.
2) Static RAMs can be accessed quickly. Their access time is a few nanoseconds.
• Disadvantage: SRAMs are said to be volatile memories
because their contents are lost when power is interrupted.

ASYNCHRONOUS DRAM
• Less expensive RAMs can be implemented if simple cells are used.
• Such cells cannot retain their state indefinitely. Hence they are called Dynamic RAM
(DRAM).
• The information is stored in a dynamic memory-cell in the form of a charge on a capacitor.
• This charge can be maintained only for tens of milliseconds.
• The contents must be periodically refreshed by restoring this capacitor charge to its full value.

• In order to store information in the cell, the transistor T is turned ON (Figure 8.6).
• The appropriate voltage is applied to the bit-line which charges the capacitor.
• After the transistor is turned off, the capacitor begins to discharge.
• Hence, the information stored in the cell can be retrieved correctly only if it is read before the
charge on the capacitor drops below a threshold value.
• During a read-operation,
→ transistor is turned ON
→ a sense amplifier detects whether the charge on the capacitor is above the threshold
value.
➢ If (charge on capacitor) > (threshold value) → Bit-line will have logic value 1.
➢ If (charge on capacitor) < (threshold value) → Bit-line will be set to logic value 0.
ASYNCHRONOUS DRAM DESCRIPTION
• The 4096 cells in each row are divided into 512 groups of 8 (Figure 5.7).
• A 21-bit address is needed to access a byte in the memory. The 21 bits are divided as follows:
1) 12 address bits are needed to select a row.
i.e. A20-9 → specifies row-address of a byte.
2) 9 bits are needed to specify a group of 8 bits in the selected row.
i.e. A8-0 → specifies column-address of a byte.

• During Read/Write-operation,
→ row-address is applied first.
→ row-address is loaded into row-latch in response to a signal pulse on RAS' input of chip.
(RAS = Row-address Strobe, CAS = Column-address Strobe)
• When a Read-operation is initiated, all cells on the selected row are read and refreshed.
• Shortly after the row-address is loaded, the column-address is
→ applied to the address pins &
→ loaded into the column-address latch under the control of the CAS' signal.
• The information in the latch is decoded.
• The appropriate group of 8 Sense/Write circuits is selected.
R/W'=1 (read-operation) → Output values of selected circuits are transferred to data-lines
D0-D7.
R/W'=0 (write-operation) → Information on D0-D7 is transferred to the selected
circuits.
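
The split of the 21-bit byte address into row and column parts can be sketched as below. The sketch assumes that the high-order 12 bits select the row and the low-order 9 bits select the group of 8 bits within the row; the function name split_address is hypothetical.

```python
ROW_BITS = 12   # selects one of 4096 rows
COL_BITS = 9    # selects one of 512 groups of 8 bits in the row

def split_address(address):
    """Split a 21-bit byte address into (row_address, column_address)."""
    if not 0 <= address < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address does not fit in 21 bits")
    row = address >> COL_BITS                    # high-order 12 bits, latched first (RAS')
    column = address & ((1 << COL_BITS) - 1)     # low-order 9 bits, latched next (CAS')
    return row, column

if __name__ == "__main__":
    print(split_address(513))
```

Because the two parts are applied one after the other, the chip needs only 12 address pins instead of 21.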
• RAS' & CAS' are active-low so that they cause latching of address when they change
from high to low.
• To ensure that the contents of DRAMs are maintained, each row of cells is accessed
periodically.
• A special memory-circuit provides the necessary control signals RAS' & CAS' that govern
the timing.
• The processor must take into account the delay in the response of the memory.
Fast Page Mode
➢ Transferring the bytes in sequential order is achieved by applying the consecutive
sequence of column-addresses under the control of successive CAS' signals.
➢ This scheme allows transferring a block of data at a faster rate.
➢ This block-transfer capability is called fast page mode.

READ ONLY MEMORY (ROM)

• Both SRAM and DRAM chips are volatile, i.e. they lose the stored information if power is
turned off.
• Many applications require non-volatile memory, which retains the stored information if
power is turned off.
• For example, OS software has to be loaded from disk to memory at start-up, i.e. the loading
program requires non-volatile memory.
• Non-volatile memory is also used in embedded systems.
• Since the normal operation involves only reading of stored data, a memory of this type is
called ROM.
• At logic value 0 → Transistor (T) is connected to the ground point (P).
The transistor switch is closed & the voltage on the bit-line drops to nearly zero (Figure 8.11).
• At logic value 1 → Transistor switch is open. The bit-line remains at high voltage.

• To read the state of the cell, the word-line is activated.


• A Sense circuit at the end of the bit-line generates the proper output value.

TYPES OF ROM
• Different types of non-volatile memory are
1) PROM
2) EPROM
3) EEPROM &
4) Flash Memory (Flash Cards & Flash Drives)

PROM (PROGRAMMABLE ROM)


• PROM allows the data to be loaded by the user.
• Programmability is achieved by inserting a fuse at point P in a ROM cell.
• Before PROM is programmed, the memory contains all 0s.
• User can insert 1s at required locations by burning-out fuses using high current-pulses.
• This process is irreversible.
• Advantages:
1) It provides flexibility.
2) It is faster.
3) It is less expensive because it can be programmed directly by the user.

EPROM (ERASABLE REPROGRAMMABLE ROM)


• EPROM allows
→ stored data to be erased and
→ new data to be loaded.
• In the cell, a connection to ground is always made at P and a special transistor is used.
• The transistor has the ability to function as
→ a normal transistor or
→ a disabled transistor that is always turned off.
• Transistor can be programmed to behave as a permanently open switch, by injecting charge
into it.
• Erasure requires dissipating the charges trapped in the transistors of memory-cells.
This can be done by exposing the chip to ultra-violet light.
• Advantages:
1) It provides flexibility during the development-phase of digital-system.
2) It is capable of retaining the stored information for a long time.
• Disadvantages:
1) The chip must be physically removed from the circuit for reprogramming.
2) The entire contents need to be erased by UV light.

EEPROM (ELECTRICALLY ERASABLE ROM)


• Advantages:
1) It can be both programmed and erased electrically.
2) It allows the erasing of all cell contents selectively.
• Disadvantage: It requires different voltages for erasing, writing and reading the stored data.

FLASH MEMORY
• In EEPROM, it is possible to read & write the contents of a single cell.
• In Flash device, it is possible to read contents of a single cell & write entire contents of a
block.
• Prior to writing, the previous contents of the block are erased.
Eg. In MP3 player, the flash memory stores the data that represents sound.
• Single flash chips cannot provide sufficient storage capacity for embedded-system.
• Advantages:
1) Flash drives have greater density which leads to higher capacity & low cost per bit.
2) It requires single power supply voltage & consumes less power.
• There are 2 methods for implementing larger memory: 1) Flash Cards & 2) Flash Drives
1) Flash Cards
➢ One way of constructing a larger module is to mount flash-chips on a small card.
➢ Such flash-cards have a standard interface.
➢ The card is simply plugged into a conveniently accessible slot.
➢ Memory-size of the card can be 8, 32 or 64MB.
➢ Eg: A minute of music can be stored in 1MB of memory. Hence a 64MB flash card
can store an hour of music.
2) Flash Drives
➢ Larger flash memory can be developed to replace the hard disk-drive.
➢ The flash drives are designed to fully emulate the hard disk.
➢ The flash drives are solid state electronic devices that have no movable parts.
Advantages:
1) They have shorter seek & access time which results in faster response.
2) They have low power consumption. Therefore, they are attractive for battery
driven applications.
3) They are insensitive to vibration.
Disadvantages:
1) The capacity of flash drive (<1GB) is less than hard disk (>1GB).
2) It leads to higher cost per bit.
3) Flash memory will weaken after it has been written a number of times
(typically at least 1 million times).

SPEED, SIZE & COST

• The main-memory can be built with DRAM (Figure 8.14).
• Thus, SRAMs are used in smaller units where speed is of essence.
• The Cache-memory is of 2 types:
1) Primary/Processor Cache (Level1 or L1 cache)
➢ It is always located on the processor-chip.
2) Secondary Cache (Level2 or L2 cache)
➢ It is placed between the primary-cache and the rest of the memory.
• The memory is implemented using the dynamic components (SIMM, RIMM, DIMM).
• The access time for main-memory is about 10 times longer than the access time for L1 cache.

CACHE MEMORIES
• The effectiveness of cache mechanism is based on the property of Locality of Reference.
Locality of Reference
• Many instructions in the localized areas of program are executed repeatedly during some
time period.
• Remainder of the program is accessed relatively infrequently (Figure 8.15).
• There are 2 types:
1) Temporal
➢ The recently executed instructions are likely to be executed again very soon.
2) Spatial
➢ Instructions in close proximity to recently executed instruction are also likely to be
executed soon.
• If active segment of program is placed in cache-memory, then total execution time can be
reduced.
• Block refers to the set of contiguous address locations of some size.
• The term cache-line is also used to refer to a cache-block.

• The Cache-memory stores a reasonable number of blocks at a given time.


• This number of blocks is small compared to the total number of blocks available in main-
memory.
• Correspondence between main-memory-block & cache-memory-block is specified by the
mapping-function.
• Cache control hardware decides which block should be removed to create space for the new
block.
• The collection of rules for making this decision is called the Replacement Algorithm.
• The cache control-circuit determines whether the requested-word currently exists in the
cache.
• The write-operation is done in 2 ways: 1) Write-through protocol & 2) Write-back protocol.
Write-Through Protocol
➢ Here the cache-location and the main-memory-locations are updated simultaneously.
Write-Back Protocol
➢ This technique is to
→ update only the cache-location &
→ mark the cache-location with associated flag bit called Dirty/Modified Bit.
➢ The word in memory will be updated later, when the marked-block is removed from
cache.
During Read-operation
• If the requested-word does not currently exist in the cache, then a read-miss will occur.
• To overcome the read miss, Load-through/Early restart protocol is used.
Load-Through Protocol
➢ The block of words that contains the requested-word is copied from the memory into
cache.
➢ The requested-word is forwarded to the processor as soon as it is read from the memory,
without waiting for the entire block to be loaded into the cache.
During Write-operation
• If the requested-word does not exist in the cache, then a write-miss will occur.
1) If Write Through Protocol is used, the information is written directly into main-
memory.
2) If Write Back Protocol is used,
→ then block containing the addressed word is first brought into the cache &
→ then the desired word in the cache is over-written with the new information.
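
The difference between the two write protocols can be seen in a toy model. The sketch below makes simplifying assumptions that are not in the notes: each block holds a single word, the cache has unlimited capacity, and eviction is triggered explicitly through a hypothetical evict method.

```python
class Cache:
    """Toy cache with one-word blocks; policy is 'write-through' or 'write-back'."""

    def __init__(self, memory, policy):
        self.memory = memory     # dict: address -> value (main memory)
        self.policy = policy
        self.lines = {}          # address -> [value, dirty_bit]

    def write(self, address, value):
        if self.policy == "write-through":
            if address in self.lines:
                self.lines[address][0] = value       # update the cache copy on a hit
            self.memory[address] = value             # main memory is always updated
        else:  # write-back
            if address not in self.lines:
                # write-miss: bring the block into the cache first
                self.lines[address] = [self.memory.get(address, 0), False]
            self.lines[address] = [value, True]      # update cache only, set dirty bit

    def evict(self, address):
        """Remove a block; a dirty block is written back to main memory."""
        value, dirty = self.lines.pop(address)
        if dirty:
            self.memory[address] = value


if __name__ == "__main__":
    main_memory = {0: 1}
    cache = Cache(main_memory, "write-back")
    cache.write(0, 9)
    print(main_memory[0])   # main memory still holds the old value
    cache.evict(0)
    print(main_memory[0])   # updated when the dirty block is removed
```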

VIRTUAL MEMORY
• It refers to a technique that automatically moves program/data blocks into the main-memory
when they are required for execution (Figure 8.24).
• The address generated by the processor is referred to as a virtual/logical address.
• The virtual-address is translated into physical-address by MMU (Memory Management
Unit).
• During every memory-cycle, MMU determines whether the addressed-word is in the memory.
If the word is in memory, then the word is accessed and execution proceeds.
Otherwise, a page containing desired word is transferred from disk to memory.
• The transfer of data between disk and memory is performed using the DMA scheme.
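
The translation step performed by the MMU can be sketched with a page table held in a dictionary. The page size of 4096, the function name translate and the list of free frames are assumptions for illustration; page replacement is not modelled, so the sketch fails if no free frame is left.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def translate(virtual_address, page_table, free_frames, log):
    """Translate a virtual address to a physical address.

    page_table maps virtual page number -> physical frame number.
    On a miss, the page is 'loaded from disk' into the next free frame.
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        frame = free_frames.pop(0)          # raises IndexError if memory is full
        page_table[page] = frame
        log.append(f"page fault: page {page} loaded into frame {frame}")
    return page_table[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    table, frames, events = {0: 5}, [7], []
    print(translate(10, table, frames, events))
    print(translate(PAGE_SIZE + 3, table, frames, events))
    print(events)
```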

SECONDARY-STORAGE
• The semiconductor memories alone cannot provide all the storage capacity needed.
• The secondary-storage devices meet the larger storage requirements.
• Some of the secondary-storage devices are:
1) Magnetic Disk
2) Optical Disk &
3) Magnetic Tapes

MAGNETIC DISK
• Magnetic Disk system consists of one or more disks mounted on a common spindle.
• A thin magnetic film is deposited on each disk (Figure 8.27).
• Disk is placed in a rotary-drive so that magnetized surfaces move in close proximity to R/W
heads.
• Each R/W head consists of 1) Magnetic Yoke & 2) Magnetizing-Coil.
• Digital information is stored on magnetic film by applying current pulse to the magnetizing-
coil.
• Only changes in the magnetic field under the head can be sensed during the Read-operation.
• Therefore, if the binary states 0 & 1 are represented by two opposite states of magnetization,
then a voltage is induced in the head only at 0-1 and at 1-0 transitions in the bit stream.
• A consecutive run of 0's or 1's is resolved by using the clock.
• Manchester Encoding technique is used to combine the clocking information with data.
• R/W heads are maintained at small distance from disk-surfaces in order to achieve high bit
densities.
• When the disks are moving at their steady state, air pressure develops b/w disk-surfaces
& head. This air pressure forces the head away from the surface.
• The flexible spring connection between head and its arm mounting permits the head to fly at
the desired distance away from the surface.
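The Manchester Encoding mentioned above can be illustrated with a short sketch. The polarity used here (1 is low-then-high, 0 is high-then-low) is an assumption; the notes do not fix a convention, and the opposite polarity is equally valid. What matters is that every bit cell contains a transition, so a run of identical bits still carries clocking information.

```python
def manchester_encode(bits):
    """Encode each data bit as two half-bit levels with a mid-cell transition."""
    levels = []
    for bit in bits:
        # Assumed convention: 1 -> (low, high), 0 -> (high, low).
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover the data bits from pairs of half-bit levels."""
    pairs = zip(levels[0::2], levels[1::2])
    return [1 if pair == (0, 1) else 0 for pair in pairs]
```

Because a level change occurs in the middle of every bit cell, the read circuitry can recover the clock from the signal itself.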
Winchester Technology
• The disks and Read/Write heads are placed in a sealed, air–filtered enclosure. This approach is
called the Winchester Technology.
• The read/write heads can operate closer to magnetic track surfaces because
the dust particles which are a problem in unsealed assemblies are absent.
Advantages
• It has a larger capacity for a given physical size.
• The data density is high because
the storage medium is not exposed to contaminating elements.
• The read/write heads of a disk system are movable.
• The disk system has 3 parts: 1) Disk Platter (Usually called Disk)
2) Disk-drive (spins the disk & moves Read/write heads)
3) Disk Controller (controls the operation of the system.)
ORGANIZATION & ACCESSING OF DATA ON A DISK
• Each surface is divided into concentric Tracks (Figure 8.28).
• Each track is divided into Sectors.
• The set of corresponding tracks on all surfaces of a stack of disks forms a Logical Cylinder.
• The data are accessed by specifying the surface number, track number and the sector number.
• The Read/Write-operations start at sector boundaries.
• Data bits are stored serially on each track.
• Each sector usually contains 512 bytes.
• Sector Header --> contains identification information.
It helps to find the desired sector on the selected track.
• ECC (Error Correcting Code) --> is used to detect and correct errors.
• An unformatted disk has no information on its tracks.
• The formatting process divides the disk physically into tracks and sectors.
• The formatting process may discover some defective sectors or even whole defective tracks.
• Disk Controller keeps a record of various defects.
• The disk is divided into logical partitions:
1) Primary partition
2) Secondary partition
• Each track has same number of sectors. So, all tracks have same storage capacity.
• Thus, the stored information is packed more densely on inner track than on outer track.
Access Time
• There are 2 components involved in the time-delay:
1) Seek time: Time required to move the read/write head to the proper track.
2) Latency/Rotational Delay: The amount of time that elapses after head is
positioned over the correct track until the starting position of the addressed sector
passes under the R/W head.
• Seek time + Latency = Disk access time
Typical Disk
 One inch disk: weight = 1 ounce, size comparable to a match book, capacity = 1 GB.
 Inch disk has the following parameters:
Recording surfaces = 20
Tracks = 15000 tracks/surface
Sectors = 400 sectors/track
Each sector stores 512 bytes of data.
Capacity of formatted disk = 20 x 15000 x 400 x 512 ≈ 60 x 10^9 bytes = 60 GB
Seek time = 3 ms
Platter rotation = 10000 rev/min
Latency = 3 ms
Internal transfer rate = 34 MB/s
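The figures above can be checked with a few lines of arithmetic. This sketch assumes that the quoted latency is the average rotational delay, i.e. the time for half a revolution; the function names are hypothetical.

```python
def formatted_capacity_bytes(surfaces, tracks_per_surface, sectors_per_track, bytes_per_sector):
    """Total formatted capacity of the disk in bytes."""
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector


def average_latency_ms(rev_per_min):
    """Average rotational delay: time for half a revolution, in milliseconds."""
    ms_per_rev = 60_000 / rev_per_min
    return ms_per_rev / 2


capacity = formatted_capacity_bytes(20, 15000, 400, 512)  # 61,440,000,000 bytes
latency = average_latency_ms(10000)                       # 3.0 ms
access_time = 3 + latency                                 # seek time + latency = 6.0 ms
```

The exact product is 61.44 x 10^9 bytes, which the notes round to about 60 GB. At 10000 rev/min one revolution takes 6 ms, so half a revolution gives the 3 ms latency.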

DATA BUFFER/CACHE
• A disk-drive that incorporates the required SCSI circuit is referred to as SCSI Drive.
• The SCSI bus can transfer data at higher rate than the rate at which data are read from disk tracks.
• A data buffer can be used to deal with the possible difference in transfer rate b/w disk and
SCSI bus.
• The buffer is a semiconductor memory.
• The buffer can also provide cache mechanism for the disk.
i.e. when a read request arrives at the disk, the controller first checks if the data is
available in the cache/buffer.
If data is available in cache, then the data can be accessed & placed on SCSI bus.
Otherwise, the data will be retrieved from the disk.
DISK CONTROLLER
• The disk controller acts as interface between disk-drive and system-bus (Figure 8.13).
• The disk controller uses DMA scheme to transfer data between disk and memory.
• When the OS initiates the transfer by issuing R/W request, the following information is loaded
into the controller's registers:
1) Memory Address: Address of first memory-location of the block of words involved in the
transfer.
2) Disk Address: Location of the sector containing the beginning of the desired block of
words.
3) Word Count: Number of words in the block to be transferred.
• The disk-address issued by the OS is a logical address.
• The corresponding physical-address on the disk may be different.
• The controller's major functions are:
1) Seek - Causes disk-drive to move the R/W head from its current position to desired
track.
2) Read - Initiates a Read-operation, starting at address specified in the disk-address
register. Data read serially from the disk are assembled into words and placed
into the data buffer for transfer to the main-memory.
3) Write - Transfers data to the disk.
4) Error Checking - Computes the error correcting code (ECC) value for the data
read from a given sector and compares it with the corresponding ECC value read
from the disk.
In case of a mismatch, it corrects the error if possible;
Otherwise, it raises an interrupt to inform the OS that an error has occurred.
Floppy Disks

The disks discussed above are known as hard or rigid disk units. Floppy disks are smaller, simpler,
and cheaper disk units that consist of a flexible, removable, plastic diskette coated with magnetic
material. The diskette is enclosed in a plastic jacket, which has an opening where the read/write
head can be positioned. A hole in the center of the diskette allows a spindle mechanism in the disk
drive to position and rotate the diskette.

The main feature of floppy disks is their low cost and shipping convenience. However, they have much
smaller storage capacities, longer access times, and higher failure rates than hard disks. In recent
years, they have largely been replaced by CDs, DVDs, and flash cards as portable storage media.

RAID Disk Arrays

Processor speeds have increased dramatically. At the same time, access times to disk drives are still
on the order of milliseconds, because of the limitations of the mechanical motion involved. One way
to reduce access time is to use multiple disks operating in parallel. In 1988, researchers at the
University of California-Berkeley proposed such a storage system [5]. They called it RAID, for
Redundant Array of Inexpensive Disks. (Since all disks are now inexpensive, the acronym was later
reinterpreted as Redundant Array of Independent Disks.) Using multiple disks also makes it possible
to improve the reliability of the overall system. Different configurations were proposed, and many
more have been developed since.

The basic configuration, known as RAID 0, is simple. A single large file is stored in several separate
disk units by dividing the file into a number of smaller pieces and storing these pieces on different
disks. This is called data striping. When the file is accessed for a Read operation, all disks access
their portions of the data in parallel. As a result, the rate at which the data can be transferred is
equal to the data rate of individual disks times the number of disks. However, access time, that is,
the seek and rotational delay needed to locate the beginning of the data on each disk, is not reduced.
Since each disk operates independently, access times vary. Individual pieces of the data are buffered,
so that the complete file can be reassembled and transferred to the memory as a single entity.
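Data striping can be sketched as below. This is a minimal sketch, assuming the pieces are handed out to the disks in round-robin order; the notes do not say how pieces are assigned, and the names `stripe` and `reassemble` are hypothetical.

```python
def stripe(data, num_disks, piece_size):
    """Split data into pieces and distribute them round-robin over the disks."""
    disks = [[] for _ in range(num_disks)]
    pieces = [data[i:i + piece_size] for i in range(0, len(data), piece_size)]
    for index, piece in enumerate(pieces):
        disks[index % num_disks].append(piece)
    return disks


def reassemble(disks):
    """Rebuild the file by reading one piece from each disk in turn."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)
```

For example, striping `b"ABCDEFGHIJ"` over 3 disks in 2-byte pieces places `AB` and `GH` on the first disk, `CD` and `IJ` on the second, and `EF` on the third.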

Various RAID configurations form a hierarchy, with each level in the hierarchy providing additional
features. For example, RAID 1 is intended to provide better reliability by storing identical copies
of the data on two disks rather than just one. The two disks are said to be mirrors of each other. If
one disk drive fails, all Read and Write operations are directed to its mirror drive. Other levels of
the hierarchy achieve increased reliability through various parity-checking schemes, without requiring
a full duplication of disks. Some also have error-recovery capability.
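One common way to realize a parity-checking scheme is a bytewise XOR across the data disks. The sketch below assumes that scheme and equal-sized blocks; the notes do not specify which parity scheme a given RAID level uses, and the function names are hypothetical.

```python
from functools import reduce


def parity_block(blocks):
    """Bytewise XOR of equal-length blocks, stored on an extra parity disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(surviving_blocks, parity):
    """Rebuild the block of a single failed disk from the survivors and the parity."""
    return parity_block(surviving_blocks + [parity])
```

Because XOR is its own inverse, XOR-ing the parity with all surviving blocks yields the missing block, so one extra disk protects against the loss of any single disk.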

The RAID concept has gained commercial acceptance. RAID systems are available from many manufacturers
for use with a variety of operating systems.



MODULE 5: BASIC PROCESSING UNIT

SOME FUNDAMENTAL CONCEPTS


• To execute an instruction, processor has to perform following 3 steps:
1) Fetch contents of memory-location pointed to by PC. Content of this location is an instruction
to be executed. The instruction is loaded into IR. Symbolically, this operation is written as:
IR ← [[PC]]
2) Increment PC by 4.
PC ← [PC] + 4
3) Carry out the actions specified by instruction (in the IR).
• The first 2 steps are referred to as Fetch Phase.
Step 3 is referred to as Execution Phase.
• The operation specified by an instruction can be carried out by performing one or more of the
following actions:
1) Read the contents of a given memory-location and load them into a register.
2) Read data from one or more registers.
3) Perform an arithmetic or logic operation and place the result into a register.
4) Store data from a register into a given memory-location.
• The hardware-components needed to perform these actions are shown in Figure 5.1.
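The fetch and execution phases can be sketched as a small interpreter loop. The instruction format used here (tuples such as `("load", addr, reg)`) and the `halt` operation are hypothetical, chosen only to keep the example short; memory is modelled as a dictionary from addresses to instructions or data words.

```python
def fetch(state, memory):
    """Fetch Phase: IR <- [[PC]], then PC <- [PC] + 4."""
    state["IR"] = memory[state["PC"]]
    state["PC"] += 4


def run(memory, registers, max_steps=10):
    """Repeat fetch and execute until a (hypothetical) halt instruction."""
    state = {"PC": 0, "IR": None}
    for _ in range(max_steps):
        fetch(state, memory)
        op, *args = state["IR"]           # Execution Phase: act on the IR contents
        if op == "halt":
            break
        if op == "load":                  # read a memory-location into a register
            addr, reg = args
            registers[reg] = memory[addr]
        elif op == "add":                 # arithmetic operation on two registers
            a, b, dst = args
            registers[dst] = registers[a] + registers[b]
        elif op == "store":               # store a register into a memory-location
            reg, addr = args
            memory[addr] = registers[reg]
    return state
```

Each of the four actions listed above appears once: loading from memory, reading registers, performing an arithmetic operation, and storing to memory.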
SINGLE BUS ORGANIZATION
• ALU and all the registers are interconnected via a Single Common Bus (Figure 7.1).
• Data & address lines of the external memory-bus are connected to the internal processor-bus via MDR
& MAR respectively. (MDR = Memory Data Register, MAR = Memory Address Register).
• MDR has 2 inputs and 2 outputs. Data may be loaded
→ into MDR either from memory-bus (external) or
→ from processor-bus (internal).
• MAR's input is connected to internal-bus;
MAR's output is connected to external-bus.
• Instruction Decoder & Control Unit is responsible for
→ issuing the control-signals to all the units inside the processor.
→ implementing the actions specified by the instruction (loaded in the IR).
• Register R0 through R(n-1) are the Processor Registers.
The programmer can access these registers for general-purpose use.
• Only processor can access 3 registers Y, Z & Temp for temporary storage during program-execution.
The programmer cannot access these 3 registers.
• In ALU, 1) 'A' input gets the operand from the output of the multiplexer (MUX).
2) 'B' input gets the operand directly from the processor-bus.
• There are 2 options provided for 'A' input of the ALU.
• MUX is used to select one of the 2 inputs.
• MUX selects either
→ output of Y or
→ constant-value 4 (which is used to increment PC content).
• An instruction is executed by performing one or more of the following operations:
1) Transfer a word of data from one register to another or to the ALU.
2) Perform arithmetic or a logic operation and store the result in a register.
3) Fetch the contents of a given memory-location and load them into a register.
4) Store a word of data from a register into a given memory-location.
• Disadvantage: Only one data-word can be transferred over the bus in a clock cycle.
Solution: Provide multiple internal-paths. Multiple paths allow several data-transfers to take place in
parallel.
REGISTER TRANSFERS
• Instruction execution involves a sequence of steps in which data are transferred from one register to
another.
• For each register, two control-signals are used: Riin & Riout. These are called Gating Signals.
• Riin=1 → data on bus is loaded into Ri.
Riout=1 → content of Ri is placed on bus.
Riout=0 → bus can be used for transferring data from other registers.
• For example, Move R1, R2; This transfers the contents of register R1 to register R2. This can be
accomplished as follows:
1) Enable the output of register R1 by setting R1out to 1 (Figure 7.2).
This places the contents of R1 on processor-bus.
2) Enable the input of register R2 by setting R2in to 1.
This loads data from processor-bus into register R2.
• All operations and data transfers within the processor take place within time-periods defined by the
processor-clock.
• The control-signals that govern a particular transfer are asserted at the start of the clock cycle.

Input & Output Gating for one Register Bit
• A 2-input multiplexer is used to select the data applied to the input of an edge-triggered D flip-flop.
• Riin=1 → mux selects data on bus. This data will be loaded into flip-flop at rising-edge of clock.
Riin=0 → mux feeds back the value currently stored in flip-flop (Figure 7.3).
• Q output of flip-flop is connected to bus via a tri-state gate.
Riout=0 → gate's output is in the high-impedance state.
Riout=1 → the gate drives the bus to 0 or 1, depending on the value of Q.
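The behaviour of one register bit can be modelled in a few lines. This is a behavioural sketch only: the class name `RegisterBit` is hypothetical, and the high-impedance state is represented by `None` as a modelling assumption.

```python
class RegisterBit:
    """One bit of register Ri: input mux, edge-triggered D flip-flop, tri-state gate."""

    def __init__(self):
        self.q = 0  # value stored in the flip-flop

    def clock_edge(self, bus, ri_in):
        """Rising clock edge: load the bus value if Riin=1, else keep the old value."""
        d = bus if ri_in else self.q  # 2-input multiplexer feeding the D input
        self.q = d

    def drive(self, ri_out):
        """Tri-state gate: drive Q onto the bus if Riout=1, else high impedance."""
        return self.q if ri_out else None
```

With Riin=0 the flip-flop simply reloads its own output on each clock edge, which is how the stored value is retained.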
PERFORMING AN ARITHMETIC OR LOGIC OPERATION
• The ALU performs arithmetic operations on the 2 operands applied to its A and B inputs.
• One of the operands is output of MUX;
And, the other operand is obtained directly from processor-bus.
• The result (produced by the ALU) is stored temporarily in register Z.
• The sequence of operations for [R3] ← [R1]+[R2] is as follows:
1) R1out, Yin
2) R2out, SelectY, Add, Zin
3) Zout, R3in
• Instruction execution proceeds as follows:
Step 1 --> Contents from register R1 are loaded into register Y.
Step 2 --> Contents from Y and from register R2 are applied to the A and B inputs of ALU;
Addition is performed &
Result is stored in the Z register.
Step 3 --> The contents of Z register are transferred to the R3 register.
• The signals are activated for the duration of the clock cycle corresponding to that step. All other
signals are inactive.
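The three control steps can be traced with a small single-bus simulator. This is a simplified sketch: registers are a dictionary, signal names follow the notes (`R1out`, `Yin`, `SelectY`, `Add`, `Zin`), and the function `run_steps` is hypothetical. Only the Add operation of the ALU is modelled.

```python
def run_steps(regs, steps):
    """Execute control steps on a single-bus datapath, one step per clock cycle."""
    for signals in steps:
        drivers = [s[:-3] for s in signals if s.endswith("out")]
        if len(drivers) > 1:
            raise ValueError("only one register may drive the bus per step")
        bus = regs[drivers[0]] if drivers else None
        alu = None
        if "Add" in signals:
            # MUX picks Y or the constant 4 for input A; input B comes from the bus.
            a_input = regs["Y"] if "SelectY" in signals else 4
            alu = a_input + bus
        for s in signals:
            if s.endswith("in"):
                name = s[:-2]
                # Z is loaded from the ALU output; every other register from the bus.
                regs[name] = alu if name == "Z" else bus
    return regs


ADD_R1_R2_TO_R3 = [
    ["R1out", "Yin"],
    ["R2out", "SelectY", "Add", "Zin"],
    ["Zout", "R3in"],
]
```

The check on `drivers` reflects the single-bus restriction: only one register output may be enabled in any clock cycle.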

CONTROL-SIGNALS OF MDR
• The MDR register has 4 control-signals (Figure 7.4):
1) MDRin & MDRout control the connection to the internal processor data bus &
2) MDRinE & MDRoutE control the connection to the memory Data bus.
• MAR register has 2 control-signals.
1) MARin controls the connection to the internal processor address bus &
2) MARout controls the connection to the memory address bus.

FETCHING A WORD FROM MEMORY
• To fetch instruction/data from memory, processor transfers required address to MAR.
At the same time, processor issues Read signal on control-lines of memory-bus.
• When requested-data are received from memory, they are stored in MDR. From MDR, they are
transferred to other registers.
• The response time of each memory access varies (based on cache miss, memory-mapped I/O). To
accommodate this, MFC is used. (MFC = Memory Function Completed).
• MFC is a signal sent from addressed-device to the processor. MFC informs the processor that the
requested operation has been completed by addressed-device.
• Consider the instruction Move (R1),R2. The sequence of steps is (Figure 7.5):
1) R1out, MARin, Read ;desired address is loaded into MAR & Read command is issued.
2) MDRinE, WMFC ;load MDR from memory-bus & Wait for MFC response from memory.
3) MDRout, R2in ;load R2 from MDR.
where WMFC = control-signal that causes processor's control circuitry to wait for arrival of MFC
signal.

Storing a Word in Memory
• Consider the instruction Move R2,(R1). This requires the following sequence:
1) R1out, MARin ;desired address is loaded into MAR.
2) R2out, MDRin, Write ;data to be written are loaded into MDR & Write command is issued.
3) MDRoutE, WMFC ;load data into memory-location pointed by R1 from MDR.
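The roles of MAR and MDR in the two sequences above can be sketched as follows. Memory is modelled as a dictionary and responds immediately, so the wait for MFC is only marked by a comment; the function names are hypothetical.

```python
def move_from_memory(regs, memory, addr_reg, dst_reg):
    """Move (addr_reg), dst_reg: fetch a word from memory into a register."""
    regs["MAR"] = regs[addr_reg]          # step 1: R1out, MARin, Read
    regs["MDR"] = memory[regs["MAR"]]     # step 2: MDRinE, WMFC (wait for MFC)
    regs[dst_reg] = regs["MDR"]           # step 3: MDRout, R2in


def move_to_memory(regs, memory, src_reg, addr_reg):
    """Move src_reg, (addr_reg): store a register into a memory-location."""
    regs["MAR"] = regs[addr_reg]          # step 1: R1out, MARin
    regs["MDR"] = regs[src_reg]           # step 2: R2out, MDRin, Write
    memory[regs["MAR"]] = regs["MDR"]     # step 3: MDRoutE, WMFC (wait for MFC)
```

In both directions the address always passes through MAR and the data through MDR; the processor never connects a general-purpose register directly to the memory-bus.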

EXECUTION OF A COMPLETE INSTRUCTION
• Consider the instruction Add (R3),R1 which adds the contents of a memory-location pointed by R3 to
register R1. Executing this instruction requires the following actions:
1) Fetch the instruction.
2) Fetch the first operand.
3) Perform the addition &
4) Load the result into R1.

• Instruction execution proceeds as follows:
Step1--> The instruction-fetch operation is initiated by
→ loading contents of PC into MAR &
→ sending a Read request to memory.
The Select signal is set to Select4, which causes the Mux to select constant 4. This value
is added to operand at input B (PC's content), and the result is stored in Z.
Step2--> Updated value in Z is moved to PC. This completes the PC increment operation and
PC will now point to next instruction.
Step3--> Fetched instruction is moved into MDR and then to IR.
Steps 1 through 3 constitute the Fetch Phase.
At the beginning of step 4, the instruction decoder interprets the contents of the IR. This
enables the control circuitry to activate the control-signals for steps 4 through 7.
Steps 4 through 7 constitute the Execution Phase.
Step4--> Contents of R3 are loaded into MAR & a memory read signal is issued.
Step5--> Contents of R1 are transferred to Y to prepare for addition.
Step6--> When Read operation is completed, memory-operand is available in MDR, and the
addition is performed. The sum is stored in Z.
Step7--> Sum is transferred from Z to R1. The End signal causes a new instruction
fetch cycle to begin by returning to step1.

BRANCHING INSTRUCTIONS
• Control sequence for an unconditional branch instruction is as follows:

• Instruction execution proceeds as follows:
Step 1-3--> The processing starts & the fetch phase ends in step3.
Step 4--> The offset-value is extracted from IR by instruction-decoding circuit.
Since the updated value of PC is already available in register Y, the offset X is gated onto
the bus, and an addition operation is performed.
Step 5--> The result, which is the branch-address, is loaded into the PC.
• The branch instruction loads the branch target address in PC so that the processor will fetch the
next instruction from the branch target address.
• The branch target address is usually obtained by adding the offset to the contents of PC.
• The offset X is usually the difference between the branch target-address and the address
immediately following the branch instruction.
• In case of conditional branch,
we have to check the status of the condition-codes before loading a new value into the PC.
e.g.: Offset-field-of-IRout, Add, Zin, If N=0 then End
If N=0, processor returns to step 1 immediately after step 4.
If N=1, step 5 is performed to load a new value into PC.
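The offset arithmetic and the conditional check can be sketched as below. The sketch assumes 4-byte instructions (matching the PC increment of 4) and a "branch if negative" condition on the N flag, as in the example above; the function names are hypothetical.

```python
def branch_offset(branch_instr_address, target_address, instr_size=4):
    """Offset X = target address - address immediately following the branch."""
    return target_address - (branch_instr_address + instr_size)


def next_pc(pc_after_fetch, offset, n_flag=None):
    """Return the new PC value after a branch instruction.

    n_flag=None models an unconditional branch; otherwise the branch is
    taken only when N=1 and the updated PC is kept when N=0.
    """
    if n_flag is None or n_flag == 1:
        return pc_after_fetch + offset
    return pc_after_fetch
```

For a branch instruction at address 1000 with target 1040, the offset is 36, because the PC already holds 1004 when the addition is performed.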

MULTIPLE BUS ORGANIZATION
• Disadvantage of Single-bus organization: Only one data-word can be transferred over the bus in
a clock cycle. This increases the steps required to complete the execution of the instruction.
Solution: To reduce the number of steps, most processors provide multiple internal-paths. Multiple
paths enable several transfers to take place in parallel.
• As shown in fig 7.8, three buses can be used to connect registers and the ALU of the processor.
• All general-purpose registers are grouped into a single block called the Register File.
• Register-file has 3 ports:
1) Two output-ports allow the contents of 2 different registers to be simultaneously placed on
buses A & B.
2) Third input-port allows data on bus C to be loaded into a third register during the same
clock-cycle.
• Buses A and B are used to transfer source-operands to A & B inputs of ALU.
• The result is transferred to destination over bus C.
• Incrementer Unit is used to increment PC by 4.

• Instruction execution proceeds as follows:
Step 1--> Contents of PC are
→ passed through ALU using R=B control-signal &
→ loaded into MAR to start memory Read operation. At the same time, PC is incremented by 4.
Step2--> Processor waits for MFC signal from memory.
Step3--> Processor loads requested-data into MDR, and then transfers them to IR.
Step4--> The instruction is decoded and add operation takes place in a single step.

COMPLETE PROCESSOR
• This has separate processing-units to deal with integer data and floating-point data.
Integer Unit → To process integer data (Figure 7.14).
Floating Unit → To process floating-point data.
• Data-Cache is inserted between these processing-units & main-memory.
The integer and floating units get data from the data cache.
• Instruction-Unit fetches instructions
→ from an instruction-cache or
→ from main-memory when desired instructions are not already in cache.
• Processor is connected to system-bus &
hence to the rest of the computer by means of a Bus Interface.
• Using separate caches for instructions & data is common practice in many processors today.
• A processor may include several units of each type to increase the potential for concurrent
operations.
• The 80486 processor has 8-kbytes single cache for both instruction and data.
Whereas the Pentium processor has two separate 8 kbytes caches for instruction and data.

Note:
To execute instructions, the processor must have some means of generating the control-signals. There
are two approaches for this purpose:
1) Hardwired control and 2) Microprogrammed control.
HARDWIRED CONTROL
• Hardwired control is a method of control unit design (Figure 7.11).
• The control-signals are generated by using logic circuits such as gates, flip-flops, decoders etc.
• Decoder/Encoder Block is a combinational-circuit that generates required control-outputs
depending on state of all its inputs.
• Instruction Decoder
 It decodes the instruction loaded in the IR.
 If IR is an 8 bit register, then instruction decoder generates 2^8 (256) lines; one for each
instruction.
 It has a separate output-line, INS1 through INSm, for each machine instruction.
 According to code in the IR, one of the output-lines INS1 through INSm is set to 1, and all
other lines are set to 0.
• Step-Decoder provides a separate signal line for each step in the control sequence.
• Encoder
 It gets the input from instruction decoder, step decoder, external inputs and condition codes.
 It uses all these inputs to generate individual control-signals: Yin, PCout, Add, End and so on.
 For example (Figure 7.12), Zin = T1 + T6.ADD + T4.BR
This signal is asserted
→ during time-slot T1 for all instructions,
→ during T6 for an Add instruction &
→ during T4 for unconditional branch instruction.
• When RUN=1, counter is incremented by 1 at the end of every clock cycle.
When RUN=0, counter stops counting.
• After execution of each instruction, end signal is generated. End signal resets step counter.
• Sequence of operations carried out by this machine is determined by wiring of logic circuits, hence
the name “hardwired”.
• Advantage: Can operate at high speed.
• Disadvantages:
1) Since no. of instructions/control-lines is often in hundreds, the complexity of control unit is
very high.
2) It is costly and difficult to design.
3) The control unit is inflexible because it is difficult to change the design.
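The encoder equation Zin = T1 + T6.ADD + T4.BR and the step counter can be expressed as Boolean functions. This is only a software model of what the gates compute; the function names are hypothetical, and resetting the counter to step 1 on End is an assumption about the numbering of time-slots.

```python
def z_in(step, instruction):
    """Zin = T1 + T6.ADD + T4.BR, with Tn true when the step counter equals n."""
    t1 = step == 1
    t4 = step == 4
    t6 = step == 6
    return t1 or (t6 and instruction == "ADD") or (t4 and instruction == "BR")


def next_step(step, run, end):
    """Step counter: reset on End, count when RUN=1, hold when RUN=0."""
    if end:
        return 1  # assumed first time-slot is T1
    return step + 1 if run else step
```

Each control-signal has its own equation of this form, which is why the encoder grows complex as the number of instructions and control steps increases.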

HARDWIRED CONTROL VS MICROPROGRAMMED CONTROL
Attribute | Hardwired Control | Microprogrammed Control
----------|-------------------|------------------------
Definition | Hardwired control is a control mechanism to generate control-signals by using gates, flip-flops, decoders, and other digital circuits. | Microprogrammed control is a control mechanism to generate control-signals by using a memory called control store (CS), which contains the control-signals.
Speed | Fast | Slow
Control functions | Implemented in hardware. | Implemented in software.
Flexibility | Not flexible to accommodate new system specifications or new instructions; redesign is required. | More flexible to accommodate new system specifications or new instructions.
Ability to handle large or complex instruction sets | Difficult. | Easier.
Ability to support operating systems & diagnostic features | Very difficult. | Easy.
Design process | Complicated. | Orderly and systematic.
Applications | Mostly RISC microprocessors. | Mainframes, some microprocessors.
Instruction set size | Usually under 100 instructions. | Usually over 100 instructions.
ROM size | - | 2K to 10K by 20-400 bit microinstructions.
Chip area efficiency | Uses least area. | Uses more area.
MICROPROGRAMMED CONTROL
• Microprogramming is a method of control unit design (Figure 7.16).
• Control-signals are generated by a program similar to machine language programs.
• Control Word(CW) is a word whose individual bits represent various control-signals (like Add, PCin).
• Each of the control-steps in control sequence of an instruction defines a unique combination of 1s &
0s in CW.
• Individual control-words in microroutine are referred to as microinstructions (Figure 7.15).
• A sequence of CWs corresponding to control-sequence of a machine instruction constitutes the
microroutine.
• The microroutines for all instructions in the instruction-set of a computer are stored in a special
memory called the Control Store (CS).
• Control-unit generates control-signals for any instruction by sequentially reading CWs of
corresponding microroutine from CS.
• µPC is used to read CWs sequentially from CS. (µPC = Microprogram Counter).
• Every time new instruction is loaded into IR, o/p of Starting Address Generator is loaded into µPC.
• Then, µPC is automatically incremented by clock;
causing successive microinstructions to be read from CS.
Hence, control-signals are delivered to various parts of processor in correct sequence.
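The sequencing described above can be sketched with a control store held in a list. The microroutine shown is an assumption: it follows the execution-phase steps described earlier for Add (R3),R1, and each control word is modelled as a set of asserted signal names rather than as a bit pattern. The addresses and the names `CONTROL_STORE` and `STARTING_ADDRESS` are hypothetical.

```python
# Control store: one control word (CW) per address; contents are illustrative.
CONTROL_STORE = [
    {"R3out", "MARin", "Read"},            # address 0
    {"R1out", "Yin", "WMFC"},              # address 1
    {"MDRout", "SelectY", "Add", "Zin"},   # address 2
    {"Zout", "R1in", "End"},               # address 3
]

# Starting Address Generator: maps an instruction to its microroutine (assumed).
STARTING_ADDRESS = {"Add (R3),R1": 0}


def control_words_for(instruction):
    """Read CWs sequentially from the control store until End is asserted."""
    upc = STARTING_ADDRESS[instruction]    # µPC loaded with the starting address
    issued = []
    while True:
        cw = CONTROL_STORE[upc]
        issued.append(cw)
        if "End" in cw:
            return issued
        upc += 1                           # µPC incremented on every clock
```

Changing the behaviour of an instruction only requires changing the contents of the control store, which is the source of the flexibility claimed for this approach.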

Advantages
• It simplifies the design of control unit. Thus it is both cheaper and less error prone to implement.
• Control functions are implemented in software rather than hardware.
• The design process is orderly and systematic.
• More flexible, can be changed to accommodate new system specifications or to correct the design
errors quickly and cheaply.
• Complex function such as floating point arithmetic can be realized efficiently.
Disadvantages
• A microprogrammed control unit is somewhat slower than the hardwired control unit, because time is
required to access the microinstructions from CS.
• The flexibility is achieved at some extra hardware cost due to the control memory and its access
circuitry.

ORGANIZATION OF MICROPROGRAMMED CONTROL UNIT TO SUPPORT CONDITIONAL
BRANCHING
• Drawback of previous Microprogram control:
 It cannot handle the situation when the control unit is required to check the status of the
condition codes or external inputs to choose between alternative courses of action.
Solution:
 Use conditional branch microinstruction.
• In case of conditional branching, microinstructions specify which of the external inputs or
condition-codes should be checked as a condition for branching to take place.
• Starting and Branch Address Generator Block loads a new address into µPC when a
microinstruction instructs it to do so (Figure 7.18).
• To allow implementation of a conditional branch, inputs to this block consist of
→ external inputs and condition-codes &
→ contents of IR.
• µPC is incremented every time a new microinstruction is fetched from microprogram memory except
in following situations:
1) When a new instruction is loaded into IR, µPC is loaded with starting-address of microroutine
for that instruction.
2) When a Branch microinstruction is encountered and branch condition is satisfied, µPC is
loaded with branch-address.
3) When an End microinstruction is encountered, µPC is loaded with address of first CW in
microroutine for instruction fetch cycle.
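The rule for updating µPC can be written as a single function. This is a sketch under assumptions: microinstructions are modelled as dictionaries with a hypothetical `kind` field (`"control"`, `"branch"` or `"end"`), and the fetch microroutine is assumed to start at address 0.

```python
def next_upc(upc, microinstruction, condition_codes,
             new_instruction_start=None, fetch_start=0):
    """Return the next µPC value according to the three exceptions listed above."""
    if new_instruction_start is not None:
        # 1) A new instruction was loaded into IR.
        return new_instruction_start
    kind = microinstruction.get("kind")
    if kind == "branch" and condition_codes.get(microinstruction["condition"], False):
        # 2) Branch microinstruction with its condition satisfied.
        return microinstruction["target"]
    if kind == "end":
        # 3) End microinstruction: go back to the instruction fetch microroutine.
        return fetch_start
    return upc + 1  # normal case: increment µPC
```

A branch microinstruction whose condition is not satisfied falls through to the normal increment, so execution continues with the next microinstruction in sequence.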
