DDCO - Module 5 - Part 1
Module 5
Basic Processing Unit
The processing unit executes machine instructions and coordinates the activities of other units. This unit is
often called the Instruction Set Processor (ISP), or simply the processor. It performs the tasks of fetching,
decoding, and executing instructions of a program. The processing unit used to be called the central processing
unit (CPU).
To execute an instruction, the processor performs the following steps:
1. Fetch the contents of the memory location pointed to by the PC. The contents of this location are interpreted
as an instruction to be executed. Hence, they are loaded into the IR.
IR ← [[PC]]
2. Assuming that the memory is byte addressable and that each instruction occupies 4 bytes, increment the
contents of the PC by 4.
PC ← [PC] + 4
3. Carry out the actions specified by the instruction in the IR.
In cases where an instruction occupies more than one word, steps 1 and 2 must be repeated as many times as
necessary to fetch the complete instruction. These two steps are usually referred to as the fetch phase; step 3
constitutes the execution phase.
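The fetch phase (steps 1 and 2) can be sketched in a few lines of Python. This is only an illustrative model: the dictionary `memory`, the instruction strings stored in it, and the function name `fetch` are assumptions made for the example, not part of any real processor.

```python
# Hypothetical byte-addressable memory: one 4-byte instruction per entry.
# Addresses and instruction strings are made up for illustration.
memory = {0: "Add (R3),R1", 4: "Move (R1),R2", 8: "Branch X"}

def fetch(pc):
    """Model one fetch phase and return (IR, updated PC)."""
    ir = memory[pc]   # step 1: IR <- [[PC]]
    pc = pc + 4       # step 2: PC <- [PC] + 4
    return ir, pc

ir, pc = fetch(0)     # the PC now points to the next instruction
```

Step 3, the execution phase, would then act on the value held in `ir`.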
Figure 7.1 shows an organization in which the arithmetic and logic unit (ALU) and all the registers are
interconnected via a single common bus. The data and address lines of the external memory bus are connected
to the internal processor bus via the memory data register, MDR, and the memory address register, MAR,
respectively. Register MDR has two inputs and two outputs. Data may be loaded into MDR either from the
memory bus or from the internal processor bus. The data stored in MDR may be placed on either bus. The input
of MAR is connected to the internal bus and its output is connected to the external bus. The control lines of the
memory bus are connected to the instruction decoder and control logic block. This unit is responsible for issuing
the signals that control the operation of all the units inside the processor and for interacting with the memory
bus.
The number and use of the processor registers R0 through Rn-1 vary considerably from one processor to another.
Registers may be provided for general-purpose use by the programmer. Some may be dedicated as special-
purpose registers, such as index registers or stack pointers. Three registers, Y, Z, and TEMP are used by the
processor for temporary storage during execution of some instructions. These registers are never used for storing
data generated by one instruction for later use by another instruction.
The multiplexer MUX selects either the output of register Y or a constant value 4 to be provided as input A of
the ALU. The constant 4 is used to increment the contents of the program counter. We will refer to the two
possible values of the MUX control input Select as Select4 and SelectY for selecting the constant 4 or register
Y, respectively.
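As a small sketch of the multiplexer's behaviour, the function below returns the value presented to input A of the ALU. The function name `alu_input_a` and the use of strings for the Select values are assumptions made for this example.

```python
def alu_input_a(select, y):
    """Return the value the MUX passes to ALU input A."""
    if select == "Select4":
        return 4          # constant used to increment the PC
    if select == "SelectY":
        return y          # contents of temporary register Y
    raise ValueError("Select must be 'Select4' or 'SelectY'")
```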
As instruction execution progresses, data are transferred from one register to another, often passing through the
ALU to perform some arithmetic or logic operation. The instruction decoder and control logic unit is responsible
for implementing the actions specified by the instruction loaded in the IR register. The decoder generates the
control signals needed to select the registers involved and direct the transfer of data. The registers, the ALU,
and the interconnecting bus are collectively referred to as the datapath.
With few exceptions, an instruction can be executed by performing one or more of the following operations in
some specified sequence:
Transfer a word of data from one processor register to another or to the ALU
Perform an arithmetic or a logic operation and store the result in a processor register
Fetch the contents of a given memory location and load them into a processor register
Store a word of data from a processor register into a given memory location
The input and output of register Ri are connected to the bus via switches controlled by the signals Riin and
Riout, respectively. When Riin is set to 1, the data on the bus are loaded into Ri. When Riout is set to 1, the contents
of register Ri are placed on the bus. While Riout is equal to 0, the bus can be used for transferring data from
other registers.
Suppose that we wish to transfer the contents of register R1 to register R4. This can be accomplished as follows:
Enable the output of register R1 by setting R1out to 1. This places the contents of R1 on the processor
bus.
Enable the input of register R4 by setting R4in to 1. This loads data from the processor bus into register
R4.
All operations and data transfers within the processor take place within time periods defined by the processor
clock. The control signals that govern a particular transfer are asserted at the start of the clock cycle. In our
example, R1out and R4in are set to 1. The registers consist of edge-triggered flip-flops. Hence, at the next
active edge of the clock, the flip-flops that constitute R4 will load the data present at their inputs.
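The register transfer described above can be modelled roughly as follows. The class `Datapath` and its method `clock` are hypothetical names for this sketch. One call to `clock` stands for one clock cycle in which a single register drives the bus and the selected registers load the bus value at the active edge.

```python
class Datapath:
    """Toy model of registers sharing one internal bus."""

    def __init__(self, regs):
        self.regs = dict(regs)

    def clock(self, out, ins):
        """One clock cycle: `out` has its out signal set to 1 and
        every register named in `ins` has its in signal set to 1."""
        bus = self.regs[out]      # only one register may drive the bus
        for name in ins:          # loaded at the next active clock edge
            self.regs[name] = bus
        return bus

dp = Datapath({"R1": 7, "R4": 0})
dp.clock(out="R1", ins=["R4"])    # R1out, R4in
```

After the call, R4 holds a copy of R1, and R1 itself is unchanged.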
An implementation for one bit of register Ri is shown in Figure 7.3 as an example. A two-input multiplexer is
used to select the data applied to the input of an edge-triggered D flip-flop. When the control input Riin is equal
to 1, the multiplexer selects the data on the bus. This data will be loaded into the flip-flop at the rising edge of
the clock. When Riin is equal to 0, the multiplexer feeds back the value currently stored in the flip-flop.
The Q output of the flip-flop is connected to the bus via a tri-state gate. When Riout is equal to 0, the gate's
output is in the high-impedance (electrically disconnected) state. This corresponds to the open-circuit state of a
switch. When Riout = 1, the gate drives the bus to 0 or 1, depending on the value of Q.
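The behaviour of one register bit can be sketched as below. The class name `RegisterBit` is an assumption for this example, and `None` is used here to stand for the high-impedance state of the tri-state gate.

```python
class RegisterBit:
    """One bit of register Ri: input mux, D flip-flop, tri-state output."""

    def __init__(self):
        self.q = 0

    def rising_edge(self, bus, r_in):
        # Two-input mux: take the bus value if Riin = 1,
        # otherwise feed back the stored value Q.
        d = bus if r_in else self.q
        self.q = d                # the D flip-flop captures d on the edge

    def drive(self, r_out):
        # Tri-state gate: drives Q onto the bus only when Riout = 1.
        return self.q if r_out else None

bit = RegisterBit()
bit.rising_edge(bus=1, r_in=1)    # loads 1 from the bus
bit.rising_edge(bus=0, r_in=0)    # keeps the stored 1
```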
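The following sequence of control steps adds the contents of register R1 to those of register R2 and stores the
result in register R3:
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in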
The signals whose names are given in any step are activated for the duration of the clock cycle corresponding
to that step. All other signals are inactive.
Hence, in step 1, the output of register R1 and the input of register Y are enabled, causing the contents
of R1 to be transferred over the bus to Y.
In step 2, the multiplexer's Select signal is set to SelectY, causing the multiplexer to gate the contents of
register Y to input A of the ALU. At the same time, the contents of register R2 are gated onto the bus
and, hence, to input B. The function performed by the ALU depends on the signals applied to its control
lines. In this case, the Add line is set to 1, causing the output of the ALU to be the sum of the two
numbers at inputs A and B. This sum is loaded into register Z because its input control signal is activated.
In step 3, the contents of register Z are transferred to the destination register, R3. This last transfer
cannot be carried out during step 2, because only one register output can be connected to the bus during
any clock cycle.
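The three steps described above can be traced with a short Python sketch. The register values are arbitrary sample data, and the variable `bus` stands for the single internal bus, which carries exactly one value per step.

```python
regs = {"R1": 5, "R2": 9, "R3": 0, "Y": 0, "Z": 0}

# Step 1: R1out, Yin
bus = regs["R1"]
regs["Y"] = bus

# Step 2: R2out, SelectY, Add, Zin
bus = regs["R2"]              # input B of the ALU comes from the bus
input_a = regs["Y"]           # SelectY gates Y to input A
regs["Z"] = input_a + bus     # Add; the sum is latched into Z

# Step 3: Zout, R3in
bus = regs["Z"]
regs["R3"] = bus
```

Because `bus` is reassigned in every step, the sketch also shows why Z cannot be copied to R3 during step 2: R2 is already driving the bus in that cycle.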
The connections for register MDR are illustrated in Figure 7.4. It has four control signals: MDRin and MDRout
control the connection to the internal bus, and MDRinE and MDRoutE control the connection to the external bus.
During memory Read and Write operations, the timing of internal processor operations must be coordinated
with the response of the addressed device on the memory bus. The processor completes one internal data transfer
in one clock cycle. The speed of operation of the addressed device, on the other hand, varies with the device.
To accommodate the variability in response time, the processor waits until it receives an indication that the
requested Read operation has been completed. A control signal called Memory-Function-Completed (MFC) is
used for this purpose. The addressed device sets this signal to 1 to indicate that the contents of the specified
location have been read and are available on the data lines of the memory bus.
As an example of a read operation, consider the instruction Move (R1),R2. The actions needed to execute this
instruction are:
1. MAR ← [R1]
2. Start a Read operation on the memory bus
3. Wait for the MFC response from the memory
4. Load MDR from the memory bus
5. R2 ← [MDR]
These actions may be carried out as separate steps, but some can be combined into a single step. Each action
can be completed in one clock cycle, except action 3 which requires one or more clock cycles, depending on
the speed of the addressed device.
For simplicity, let us assume that the output of MAR is enabled all the time. Thus, the contents of MAR are
always available on the address lines of the memory bus. This is the case when the processor is the bus master.
When a new address is loaded into MAR, it will appear on the memory bus at the beginning of the next clock
cycle, as shown in Figure 7.5. A Read control signal is activated at the same time MAR is loaded. This signal
will cause the bus interface circuit to send a read command, MR, on the bus. With this arrangement, we have
combined actions 1 and 2 above into a single control step. Actions 3 and 4 can also be combined by activating
control signal MDRinE while waiting for a response from the memory. Thus, the data received from the
memory are loaded into MDR at the end of the clock cycle in which the MFC signal is received. In the next
clock cycle, MDRout is activated to transfer the data to register R2. This means that the memory read operation
requires three steps, which can be described by the signals being activated as follows:
1. R1out, MARin, Read
2. MDRinE, WMFC
3. MDRout, R2in
where WMFC is the control signal that causes the processor's control circuitry to wait for the arrival of the
MFC signal.
Figure 7.5 shows that MDRinE is set to 1 for exactly the same period as the read command, MR.
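A rough model of the three-step read sequence for Move (R1),R2 is sketched below. The function name `read_word` and the parameter `latency` (the number of clock cycles spent in the WMFC step, assumed to be at least 1) are assumptions introduced for this example.

```python
def read_word(regs, memory, latency):
    """Model Move (R1),R2 and return the number of clock cycles used."""
    cycles = 0
    # Step 1: R1out, MARin, Read
    regs["MAR"] = regs["R1"]
    cycles += 1
    # Step 2: MDRinE, WMFC. The processor stays here until MFC = 1.
    cycles += latency
    regs["MDR"] = memory[regs["MAR"]]   # latched when MFC arrives
    # Step 3: MDRout, R2in
    regs["R2"] = regs["MDR"]
    cycles += 1
    return cycles

regs = {"R1": 100, "R2": 0, "MAR": 0, "MDR": 0}
cycles = read_word(regs, {100: 42}, latency=3)
```

With a memory that responds within one cycle the read takes three clock cycles. Slower devices only lengthen step 2.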
1.4 STORING A WORD IN MEMORY
Writing a word into a memory location follows a similar procedure. The desired address is loaded into MAR.
Then, the data to be written are loaded into MDR, and a Write command is issued. Hence, executing the
instruction Move R2, (R1) requires the following sequence:
1. R1out, MARin
2. R2out, MDRin, Write
3. MDRoutE, WMFC
As in the case of the read operation, the Write control signal causes the memory bus interface hardware to issue
a Write command on the memory bus. The processor remains in step 3 until the memory operation is completed
and an MFC response is received.
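The write sequence for Move R2,(R1) can be sketched in the same style. The function name `write_word` and the parameter `latency` (clock cycles spent in step 3 waiting for MFC, assumed to be at least 1) are assumptions for this example.

```python
def write_word(regs, memory, latency=1):
    """Model Move R2,(R1) and return the number of clock cycles used."""
    # Step 1: R1out, MARin
    regs["MAR"] = regs["R1"]
    # Step 2: R2out, MDRin, Write
    regs["MDR"] = regs["R2"]
    # Step 3: MDRoutE, WMFC. MDR drives the memory bus until MFC = 1.
    memory[regs["MAR"]] = regs["MDR"]
    return 2 + latency

regs = {"R1": 200, "R2": 17, "MAR": 0, "MDR": 0}
memory = {}
cycles = write_word(regs, memory, latency=2)
```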
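Consider the instruction Add (R3),R1, which adds the contents of the memory location pointed to by R3 to
register R1. Figure 7.6 gives the sequence of control steps required to execute this instruction on the single-bus
architecture of Figure 7.1. Instruction execution proceeds as follows.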
In step 1, the instruction fetch operation is initiated by loading the contents of the PC into the MAR and
sending a Read request to the memory. The Select signal is set to Select4, which causes the multiplexer
MUX to select the constant 4. This value is added to the operand at input B, which is the contents of the
PC, and the result is stored in register Z.
The updated value is moved from register Z back into the PC during step 2, while waiting for the memory
to respond.
In step 3, the word fetched from the memory is loaded into the IR. Steps 1 through 3 constitute the
instruction fetch phase, which is the same for all instructions.
The instruction decoding circuit interprets the contents of the IR at the beginning of step 4. This enables
the control circuitry to activate the control signals for steps 4 through 7, which constitute the execution
phase. The contents of register R3 are transferred to the MAR in step 4, and a memory read operation is
initiated.
Then the contents of R1 are transferred to register Y in step 5, to prepare for the addition operation.
When the Read operation is completed, the memory operand is available in register MDR, and the
addition operation is performed in step 6. The contents of MDR are gated to the bus, and thus also to
the B input of the ALU, and register Y is selected as the second input to the ALU by choosing SelectY.
The sum is stored in register Z, then transferred to R1 in step 7. The End signal causes a new instruction
fetch cycle to begin by returning to step 1.
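The seven steps described above can be traced with the following sketch. It follows the prose description rather than reproducing the signal lists of Figure 7.6. The function name, the dictionary-based memory, and the sample values are assumptions made for illustration, and memory waits are not modelled.

```python
def execute_add(regs, memory):
    """Trace fetch and execution of an instruction that adds the word
    addressed by R3 to R1 (memory wait states are not modelled)."""
    # Step 1: PC -> MAR, start Read; Select4 and Add give Z <- [PC] + 4
    regs["MAR"] = regs["PC"]
    regs["Z"] = 4 + regs["PC"]
    # Step 2: Z -> PC (the updated PC is also kept in Y); wait for memory
    regs["PC"] = regs["Z"]
    regs["Y"] = regs["Z"]
    regs["MDR"] = memory[regs["MAR"]]
    # Step 3: MDR -> IR; the fetch phase ends here
    regs["IR"] = regs["MDR"]
    # Step 4: R3 -> MAR, start Read
    regs["MAR"] = regs["R3"]
    # Step 5: R1 -> Y while waiting for memory
    regs["Y"] = regs["R1"]
    regs["MDR"] = memory[regs["MAR"]]
    # Step 6: MDR on the bus (input B), SelectY (input A), Add, Z loaded
    regs["Z"] = regs["Y"] + regs["MDR"]
    # Step 7: Z -> R1, End
    regs["R1"] = regs["Z"]
    return regs

regs = {"PC": 0, "IR": None, "MAR": 0, "MDR": 0,
        "Y": 0, "Z": 0, "R1": 10, "R3": 200}
memory = {0: "Add (R3),R1", 200: 32}
execute_add(regs, memory)
```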
2.1 BRANCH INSTRUCTIONS
A branch instruction replaces the contents of the PC with the branch target address. This address is usually
obtained by adding an offset X, which is given in the branch instruction, to the updated value of the PC.
Figure 7.7 gives a control sequence that implements an unconditional branch instruction. Processing starts, as
usual, with the fetch phase. This phase ends when the instruction is loaded into the IR in step 3. The offset value
is extracted from the IR by the instruction decoding circuit, which will also perform sign extension if required.
Since the value of the updated PC is already available in register Y, the offset X is gated onto the bus in step 4,
and an addition operation is performed. The result, which is the branch target address, is loaded into the PC in
step 5. The offset X used in a branch instruction is usually the difference between the branch target address and
the address immediately following the branch instruction.
For example, if the branch instruction is at location 2000 and if the branch target address is 2050, the value of
X must be 46. The PC is incremented during the fetch phase, before knowing the type of instruction being
executed. Thus, when the branch address is computed in step 4, the PC value used is the updated value, which
points to the instruction following the branch instruction in the memory. In a conditional branch, we need to
check the status of the condition codes before loading a new value into the PC. For example, for a Branch-on-
negative (Branch<0) instruction, step 4 in Figure 7.7 is replaced with
Offset-field-of-IRout, Add, Zin, If N = 0 then End
Thus, if N = 0 the processor returns to step 1 immediately after step 4. If N = 1, step 5 is performed to load a
new value into the PC, thus performing the branch operation.
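The offset arithmetic and the conditional branch decision can be checked with a short sketch. The function names and the constant `INSTRUCTION_BYTES` are assumptions for this example. The 4-byte instruction length follows from the PC being incremented by 4 during the fetch phase.

```python
INSTRUCTION_BYTES = 4   # the PC is incremented by 4 during the fetch phase

def branch_offset(branch_addr, target_addr):
    """Offset X: target minus the address following the branch."""
    return target_addr - (branch_addr + INSTRUCTION_BYTES)

def branch_on_negative(updated_pc, offset, n_flag):
    """Return the PC value after a Branch<0 instruction."""
    z = updated_pc + offset   # step 4: offset on the bus, Add, Zin
    if n_flag == 0:
        return updated_pc     # End: branch not taken
    return z                  # step 5: Zout, PCin

x = branch_offset(2000, 2050)
```

For the example in the text, `x` is 46, and `branch_on_negative(2004, x, 1)` gives the target address 2050.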