Module-4 DDCO
INPUT/OUTPUT ORGANIZATION
2) The bus consists of three sets of lines used to carry address, data, and control signals.
4) When the processor places a particular address on the address line, the device that
recognizes this address responds to the commands issued on the control lines.
5) The processor requests either a read or a write operation, and the requested data are
transferred over the data lines.
1.1 I/O interface for an input device:
The hardware arrangement that connects an input device to the system bus is called a “device interface” or “I/O interface”.
I/O interface has three modules:
i)Address Decoder
ii) Control circuits
iii) Data & Status registers
i) Address Decoder: The address decoder is connected to the address lines of the bus, as shown in the figure. It enables the device to recognize its own address whenever that address appears on the address bus.
• Status register holds information necessary for the operation of the I/O device.
• Data and status registers are connected to the data lines, and have unique addresses.
Memory-mapped I/O:
In this technique, memory and I/O devices share a common address space, so the same instructions that access memory locations can also access I/O device registers.
Memory related instructions are used for data transfer between I/O and processor.
In the case of memory-mapped I/O, an input operation can be implemented as
MOVE DATAIN, R0
Similarly, an output operation can be implemented as
MOVE R0, DATAOUT
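Because memory-mapped I/O uses ordinary loads and stores, the input and output operations described above can be sketched in C with volatile pointers. This is only an illustrative sketch: the register names follow the text, but the "registers" here are stand-in variables rather than real device addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the device registers; on real hardware these would be
   fixed addresses in the shared address space. */
uint8_t fake_datain;    /* stands in for the keyboard DATAIN register */
uint8_t fake_dataout;   /* stands in for the display DATAOUT register */

/* Memory-mapped registers are accessed through volatile pointers, so
   every access really performs a bus read or write. */
volatile uint8_t *const DATAIN  = &fake_datain;
volatile uint8_t *const DATAOUT = &fake_dataout;

/* MOVE DATAIN, R0 : an ordinary load reads the input register. */
uint8_t input_op(void) { return *DATAIN; }

/* MOVE R0, DATAOUT : an ordinary store writes the output register. */
void output_op(uint8_t c) { *DATAOUT = c; }
```

No special I/O instructions are needed; this is exactly what makes memory-mapped I/O convenient.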
iii) Program-controlled I/O:
In the program-controlled I/O scheme, the processor repeatedly checks a status flag of the I/O device to achieve the required synchronization between the processor and an input or output device.
Disadvantage:
The processor checks the status flag of the I/O device; if the device is not ready for a data transfer, the processor enters a wait loop.
During this period, the processor is not performing any useful computation.
There are many situations where other tasks can be performed while waiting for an
I/O device to become ready.
I/O devices operate at speeds that are vastly different from that of the processor.
When a human operator is entering characters at a keyboard, the processor is capable
of executing millions of instructions between successive character entries. An
instruction that reads a character from the keyboard should be executed only when a
character is available in the input buffer of the keyboard interface. An input character
is read only once.
For an input device such as a keyboard, a status flag, SIN, is included in the interface
circuit as part of the status register. This flag is set to 1 when a character is entered at
the keyboard and cleared to 0 once this character is read by the processor. Hence, by
checking the SIN flag, the software can ensure that it is always reading valid data.
This is often accomplished in a program loop that repeatedly reads the status register
and checks the state of SIN. When SIN becomes equal to 1, the program reads the
input data register. A similar procedure can be used to control output operations using
an output status flag, SOUT.
The program in Figure 5.4 reads a line of characters from the keyboard and stores it in a
memory buffer starting at location LINE. Then, it calls a subroutine PROCESS to process the
input line. As each character is read, it is echoed back to the display. Register R0 is used as a
pointer to the memory buffer area. The contents of R0 are updated using the Autoincrement
addressing mode so that successive characters are stored in successive memory locations.
Each character is checked to see if it is the Carriage Return (CR) character, which has the
ASCII code 0D (hex). If it is, a Line Feed character (ASCII code 0A) is sent to move the
cursor one line down on the display and subroutine PROCESS is called. Otherwise, the
program loops back to wait for another character from the keyboard.
In program-controlled I/O the processor repeatedly checks a status flag to achieve the
required synchronization between the processor and an input or output device. The processor
polls the device. There are two other commonly used mechanisms for implementing I/O
operations: interrupts and direct memory access. In the case of interrupts, synchronization is
achieved by having the I/O device send a special signal over the bus whenever it is ready for
a data transfer operation. Direct memory access is a technique used for high-speed I/O
devices. It involves having the device interface transfer data directly to or from the memory,
without continuous involvement by the processor.
Figure 5.4 A program that reads one line from the keyboard, stores it in memory buffer,
and echoes it back to the display.
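The logic of the Figure 5.4 program can be sketched in C. The SIN flag, DATAIN, and DATAOUT are simulated here with ordinary variables (a string stands in for the keyboard input), so this is a model of the polling loop, not driver code for a real interface.

```c
#include <assert.h>

#define CR 0x0D
#define LF 0x0A

/* Simulated keyboard interface: a string stands in for the characters
   typed, SIN means "character available", reading DATAIN consumes it. */
const char *typed;
int  sin_flag(void)    { return *typed != '\0'; }
char read_datain(void) { return *typed++; }

/* Simulated display interface (DATAOUT). */
char display[80];
int  disp_n;
void write_dataout(char c) { display[disp_n++] = c; }

/* Mirrors Figure 5.4: poll SIN, read and echo each character, store it
   in the buffer; on Carriage Return send Line Feed and stop (the real
   program would then CALL PROCESS).  Returns the number of characters
   stored, including the CR. */
int read_line(char *line)
{
    int n = 0;
    for (;;) {
        while (!sin_flag())      /* READWAIT: busy-wait on the SIN flag */
            ;
        char c = read_datain();
        write_dataout(c);        /* echo the character to the display */
        line[n++] = c;           /* autoincrement-style store into LINE */
        if (c == CR) {
            write_dataout(LF);   /* move the display cursor down a line */
            return n;
        }
    }
}
```

The empty busy-wait loop is exactly the disadvantage noted above: while SIN is 0, the processor does no useful work.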
Interrupts:
Definition: Interrupt is an event which suspends the execution of one program and begins the
execution of another program.
Interrupt signals and its function:
i) INT(Interrupt):
INT is a hardware signal sent by an I/O device to alert the processor when the device becomes ready.
ii) INTR(Interrupt Request):
At least one of the bus control lines, called the interrupt-request line, is usually dedicated to this purpose.
iii) INTA(Interrupt Acknowledgement):
The interrupt-acknowledge signal is a special signal issued by the processor to inform the device that its request has been recognized, so that the device may remove its interrupt-request signal.
iv) ISR( interrupt-service routine):
The routine executed in response to an interrupt request is called the interrupt-service
routine(interrupt program).
4) When an interrupt occurs, control must be transferred to the interrupt service routine.
5) But before transferring control, the current contents of the PC (i+1), must be saved in a
known location.
7) Return address, or the contents of the PC are usually stored on the processor stack.
b. Interrupt-service routine may not have anything in common with the program it interrupts.
c. Interrupt-service routine and the program that it interrupts may belong to different users.
d. As a result, before branching to the interrupt-service routine, not only the PC, but other
information such as condition code flags, and processor registers used by both the interrupted
program and the interrupt service routine must be stored.
e. This will enable the interrupted program to resume execution upon return from interrupt
service routine.
10) Saving and restoring information can be done automatically by the processor or explicitly
by program instructions.
b. Saving and restoring this information increases the delay between the time an interrupt request is received and the start of execution of the interrupt-service routine.
This delay is called interrupt latency.
12) In order to reduce the interrupt latency, most processors save only the minimal amount of
information:
a. This minimal amount of information includes Program Counter and processor status
registers.
13) Any additional information that must be saved is saved explicitly by program instructions at the beginning of the interrupt-service routine.
14) When a processor receives an interrupt-request, it must branch to the interrupt service
routine.
15) It must also inform the device that it has recognized the interrupt request.
Interrupt hardware:-
1) An external (I/O) device requests service from the processor by activating one bus line, called the interrupt-request line.
2) One end of this interrupt-request line is connected to the power supply through a pull-up resistor, as shown in the figure.
3) Another end of interrupt request line is connected to INTR (Interrupt request) signal of
processor as shown in the figure above.
4) The I/O device is connected to interrupt request line by means of switch as shown in the
fig.
5) INTR is the interrupt-request signal sent by an I/O device to request an interrupt; INTR is an active-low signal.
6) Depending on whether the switches are open or closed, the interrupt-request line can be in one of two states:
i) Inactive state
ii) Active state
i) Inactive State:
When all the switches are open, the voltage on the interrupt-request line equals Vdd; therefore INTR = 0. This is called the inactive state of the interrupt-request line.
ii)Active State:
The I/O device interrupts the processor by closing its switch.
7) When switch is closed the voltage drop on the interrupt request line is found to be zero.
Therefore INTR=1
8) The signal on the interrupt request line is logical OR of requests from the several I/O
devices.
ENABLING AND DISABLING INTERRUPTS: There are many situations in which the
processor should ignore interrupt requests. For example, in the case of the Compute-Print
program of Figure 5.5, an interrupt request from the printer should be accepted only if there
are output lines to be printed. After printing the last line of a set of n lines, interrupts should
be disabled until another set becomes available for printing.
The second option, which is suitable for a simple processor with only one interrupt- request
line, is to have the processor automatically disable interrupts before starting the execution of
the interrupt-service routine. After saving the contents of the PC and the processor status
register (PS) on the stack, the processor performs the equivalent of executing an Interrupt-disable instruction. It is often the case that one bit in the PS register, called Interrupt-enable,
indicates whether interrupts are enabled. An interrupt request received while this bit is equal
to 1 will be accepted. After saving the contents of the PS on the stack, with the Interrupt-
enable bit equal to 1, the processor clears the Interrupt-enable bit in its PS register, thus
disabling further interrupts. When a Return-from interrupt instruction is executed, the
contents of the PS are restored from the stack, setting the Interrupt-enable bit back to 1.
Hence, interrupts are again enabled.
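The automatic save/disable/restore sequence of this second option can be modeled in miniature. The stack, the PS layout, and the position of the Interrupt-enable bit below are illustrative assumptions, not any real processor's definition.

```c
#include <assert.h>

#define IE_BIT 0x1u      /* assumed position of Interrupt-enable in PS */

unsigned pc, ps;         /* program counter and processor status word */
unsigned stk[8];         /* a tiny model of the processor stack */
int sp;

/* On accepting an interrupt: save PC and PS on the stack, clear the
   Interrupt-enable bit so further requests are ignored, and branch to
   the interrupt-service routine. */
void interrupt_entry(unsigned isr_addr)
{
    stk[sp++] = pc;
    stk[sp++] = ps;
    ps &= ~IE_BIT;       /* equivalent of an Interrupt-disable */
    pc = isr_addr;
}

/* Return-from-interrupt: restore PS (which sets Interrupt-enable back
   to 1) and then PC, resuming the interrupted program. */
void return_from_interrupt(void)
{
    ps = stk[--sp];
    pc = stk[--sp];
}
```

Note that interrupts are re-enabled simply by restoring the old PS; no separate Interrupt-enable instruction is required.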
In the third option, the processor has a special interrupt-request line for which the interrupt
handling circuit responds only to the leading edge of the signal. Such a line is said to be edge
triggered. In this case, the processor will receive only one request, regardless of how long the
line is activated. Hence, there is no danger of multiple interruptions and no need to explicitly
disable interrupt requests from this line. Before proceeding to study more complex aspects of
interrupts, let us summarize the sequence of events involved in handling an interrupt request
from a single device.
2) When a device raises an interrupt request, it sets the IRQ bit in its status register to 1.
Example: Bits KIRQ and DIRQ are the interrupt request bits for the keyboard and the
display, respectively.
3) The simplest way to identify the interrupting device is to have the interrupt service routine
poll all the I/O devices connected to the bus.
4) The first device encountered with its IRQ bit set to 1 is the device whose interrupt request is accepted and serviced.
5) The main disadvantage of the polling scheme is the time spent interrogating the IRQ bits of devices that may not be requesting any service.
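The polling scheme described above can be sketched as a loop over the device status registers. The registers here are stand-in variables and the IRQ bit position is an assumption for illustration.

```c
#include <assert.h>

#define IRQ_BIT 0x01u          /* assumed position of the IRQ bit */

enum { NUM_DEV = 4 };
unsigned status_reg[NUM_DEV];  /* stand-ins for the device status
                                  registers (KIRQ, DIRQ, ... above) */

/* Polling: the interrupt-service routine interrogates each device's
   IRQ bit in turn; the first device found with IRQ = 1 is serviced. */
int poll_devices(void)
{
    for (int i = 0; i < NUM_DEV; i++)
        if (status_reg[i] & IRQ_BIT)
            return i;          /* index of the device to service */
    return -1;                 /* no device is requesting service */
}
```

The loop also makes the disadvantage visible: every non-requesting device before the requester still costs one status read.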
An alternative approach is to use vectored interrupts
Vectored Interrupt:
1) Vectored interrupts reduce the time involved in the polling process.
2) “A device requesting an interrupt identifies itself by sending a special code to the processor over the bus.”
3) This enables the processor to identify individual devices even if they share a single
interrupt request line.
4) Interrupt vector code:
a) The code supplied by the device may represent the starting address of the interrupt
service routine for that device.
c) The location pointed to by the interrupting device is used to store the starting address of
the interrupt service routine.
d) The processor reads this address, called the interrupt vector, and loads it into the PC.
e) The interrupt vector may also include a new value for the processor status register.
f) In most computers, I/O devices send the interrupt-vector code over the data bus, using the bus control signals to ensure that the devices do not interfere with each other.
g) When a device sends an interrupt request, the processor may not be ready to receive the
interrupt vector code immediately.
h) Using this code, the processor can transfer control directly to the corresponding interrupt-service routine. Interrupts handled in this way are known as vectored interrupts.
5) The remainder of the address is supplied by the processor based on the area in its memory
where the addresses for interrupt service routines are located
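Vectored dispatch can be sketched as a table of routine addresses indexed by the vector code. The handlers and codes below are hypothetical; in a real machine each table entry holds the starting address of a device's interrupt-service routine.

```c
#include <assert.h>

typedef void (*isr_t)(void);

int last_serviced = -1;
void keyboard_isr(void) { last_serviced = 0; }
void display_isr(void)  { last_serviced = 1; }

/* The interrupt-vector table: entry i is the service routine for
   vector code i. */
isr_t vector_table[] = { keyboard_isr, display_isr };

/* The device sends its vector code over the bus; the processor uses
   the code to fetch the routine's address and load it into the PC --
   modeled here as an indexed call. */
void dispatch(unsigned vector_code)
{
    vector_table[vector_code]();
}
```

Compared with polling, identifying the requester costs one table lookup instead of one status read per device.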
Priority level:
3) An interrupt request from a high-priority device should be accepted while the processor is
servicing another request from a lower-priority device.
4) To implement this scheme, assign a priority level to the processor that can be changed
under program control.
5) The priority level of the processor is the priority of the program that is currently being
executed.
6) The processor accepts interrupts only from devices that have priorities higher than its own.
7) The processor’s priority is usually encoded in a few bits of the processor status word.
Privileged instructions:
8) Processor’s priority can be changed by program instructions that write into the PS, These
are privileged instructions, which can be executed only while the processor is running in the
supervisor mode.
9) Processor works in different modes mainly supervisor mode and user mode
10) supervisor mode: The processor is in the supervisor mode only when executing
operating system routines
11) User mode: The processor is in the user mode when it executes user application programs.
13) It switches from supervisor mode to the user mode before beginning to execute
application programs.
14) Thus, a user program cannot accidentally, or intentionally, change the priority of the
processor and disrupt the system’s operation.
15) An attempt to execute a privileged instruction while in the user mode leads to a special type of interrupt called a privilege exception.
b)Multiple-Priority Scheme:
4) Interrupt requests received over these lines are sent to a priority arbitration circuit in the
processor.
5) If the interrupt request has a higher priority level than the priority of the processor, then the
request is accepted.
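The acceptance rule in points 6) and 7) above can be written as a one-line check. The assumption that the priority occupies the low-order 3 bits of the PS word is illustrative.

```c
#include <assert.h>

/* Assume, for illustration, that the processor's priority occupies
   the low-order 3 bits of the processor status word (PS). */
#define PS_PRIORITY(ps) ((ps) & 0x7u)

/* An interrupt request is accepted only if the requesting device's
   priority is strictly higher than the processor's current priority. */
int accept_interrupt(unsigned ps, unsigned device_priority)
{
    return device_priority > PS_PRIORITY(ps);
}
```

Because equal priority is not "higher", a device cannot interrupt a service routine running at its own level.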
2.3.4 Simultaneous Requests:-
Consider the problem of simultaneous arrivals of interrupt requests from two or more
devices.
Daisy chain scheme:
Devices are connected to form a daisy chain, as shown in the figure below.
1) The processor simply accepts the requests having the highest priority.
2) Devices share the interrupt-request line using common (single) bus line.
6) The interrupt-acknowledge signal is received by device 1; if device 1 does not need service, it passes the signal on to device 2.
7) Device that is electrically closest to the processor has the highest priority.
Arrangement of priority groups using daisy-chain fashion:
1) When I/O devices were organized into a priority structure, each device had its own
Interrupt-request and interrupt-acknowledge line.
2) When I/O devices were organized in a daisy chain fashion, the devices shared an interrupt-
request line, and the interrupt-acknowledge propagated through the devices.
3) A combination of the priority structure and the daisy-chain scheme can also be used.
4) Devices are organized into groups.
6) All the devices within a single group share an interrupt-request line, and are connected to
form a daisy chain
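The propagation of the acknowledge signal along a daisy chain can be modeled as a scan from the device closest to the processor. The chain length and request flags below are stand-ins for the hardware signals.

```c
#include <assert.h>

enum { CHAIN_LEN = 4 };
int requesting[CHAIN_LEN];  /* 1 if device i has asserted its request */

/* The interrupt-acknowledge signal enters the device electrically
   closest to the processor (device 0 here); a device that does not
   need service passes the signal to its neighbour, so the closest
   requesting device wins. */
int propagate_inta(void)
{
    for (int i = 0; i < CHAIN_LEN; i++)
        if (requesting[i])
            return i;       /* this device blocks INTA and responds */
    return -1;              /* no device was requesting service */
}
```

This makes the priority rule of point 7) concrete: position in the chain, not any stored priority value, decides the winner.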
Direct Memory Access
Definition of DMA: “A special control unit is provided to transfer a block of data directly between an I/O device and the main memory at high speed, without continuous intervention by the processor; this approach is called direct memory access (DMA).”
Ex: Data transfers between the main memory (RAM) and a disk use DMA.
3.1)Direct Memory Access (DMA):
1) The control unit that performs DMA transfers is part of the I/O device’s interface circuit.
For each word, it provides the memory address and all the control signals.
To transfer a block of data, it increments the memory addresses and keeps track of the
number of transfers.
3) The DMA controller can transfer a block of data between an external device and the main memory without any intervention from the processor.
4) However, the operation of the DMA controller must be under the control of a program executed by the processor. That is, the processor must initiate the DMA transfer.
5) To initiate the DMA transfer, the processor informs the DMA controller of:
the starting address,
the number of words in the block, and
the direction of transfer (I/O device to the memory, or memory to the I/O device).
6) After initiating the DMA transfer, the processor suspends the program that initiated the
transfer, and continues with the execution of some other program.
The program whose execution is suspended is said to be in the blocked state.
7) On receiving this information, the DMA controller proceeds to perform the requested
operation.
8) Once the DMA controller completes the DMA transfer, it informs the processor by raising
an interrupt signal.
9) While a DMA transfer is taking place, the program that requested the transfer cannot
continue, and the processor can be used to execute another program.
10) After the DMA transfer is completed, the processor can return to the program that
requested the transfer.
12) The OS is also responsible for suspending the execution of one program and starting
another.
13) Thus, for an I/O operation involving DMA, the OS puts the program that requested the transfer in the Blocked state, initiates the DMA operation, and starts the execution of another program.
14) When the transfer is completed, the DMA controller informs the processor by sending an
interrupt request
15) In response, the OS puts the suspended program in the Runnable state so that it can be
selected by the scheduler to continue execution.
i) Registers of DMA:
1) The figure shows an example of a DMA controller with three registers that are accessed by the processor to initiate transfer operations.
2) Two registers are used for storing the Starting address and the word count.
When the Done flag = 1, the DMA controller is ready to receive another command.
When the Done flag = 0, the DMA controller is not ready to receive another command.
c) Interrupt-enable (IE) flag:
When the IE bit is set to 1, the DMA controller raises an interrupt after the completion of the DMA operation.
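Programming the three registers described above can be sketched as follows. The register layout and the bit positions in the control register are illustrative assumptions, not those of any particular controller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout for a DMA controller, following the
   three registers named in the text. */
struct dma_regs {
    uint32_t start_addr;   /* starting memory address of the block */
    uint32_t word_count;   /* number of words to transfer */
    uint32_t control;      /* bit 0: direction, bit 1: Done, bit 2: IE */
};

#define DMA_DIR_MEM_TO_DEV 0x1u
#define DMA_DONE           0x2u
#define DMA_IE             0x4u

/* The processor initiates a transfer by loading the three registers;
   Done is left at 0 to mark the controller busy, and IE asks it to
   raise an interrupt when the transfer completes. */
void dma_start(struct dma_regs *d, uint32_t addr, uint32_t nwords,
               int mem_to_dev)
{
    d->start_addr = addr;
    d->word_count = nwords;
    d->control = (mem_to_dev ? DMA_DIR_MEM_TO_DEV : 0u) | DMA_IE;
}

/* Polling the Done flag tells the processor whether the controller is
   ready to receive another command. */
int dma_done(const struct dma_regs *d)
{
    return (d->control & DMA_DONE) != 0;
}
```

After `dma_start`, the processor is free to run another program until the controller sets Done and raises its interrupt.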
An example of a computer system is given in figure, showing how DMA controllers may be
used.
DMA transfers can take place in one of two modes:
1) Cycle stealing
2) Burst mode
1) Cycle stealing:
1) Memory accesses by the processor and the DMA controller are interwoven.
2) Requests by DMA devices for using the bus are always given higher priority than
processor requests.
3) Among different DMA devices, top priority is given to high-speed peripherals such as a
disk, a high-speed network interface, or a graphics display device.
4) Since the processor originates most memory access cycles, the DMA controller can be said
to “steal” memory cycles from the processor.
2) Burst mode:
1) In this mode, the DMA controller is given exclusive access to the main memory to transfer a block of data without interruption. This is known as block or burst mode.
2) Most DMA controllers incorporate a data storage buffer. In the case of the network interface, for example, the DMA controller reads a block of data from the main memory and stores it into its input buffer. This transfer takes place using burst mode at a speed appropriate to the memory and the computer bus.
3) Then, the data in the buffer are transmitted over the network at the speed of the network.
Conflicts in DMA:
A conflict may arise if both the processor and a DMA controller or two DMA
controllers try to use the bus at the same time to access the main memory.
The device that is allowed to initiate transfers on the bus at any given time is called
the bus master.
When the current bus master relinquishes its status as the bus master, another device
can acquire this status.
Bus Arbitration: The process by which the next device to become the bus master is selected
and bus mastership is transferred to it is called bus arbitration.
Purpose of Bus Arbitration:
Bus arbitration is required to resolve the conflict that arises when the processor and a DMA controller, or two DMA controllers, try to use the bus at the same time to access the main memory.
Bus arbitration is also required to coordinate the activities of all devices requesting memory transfers.
2) Figure shows a basic arrangement in which processor contains the bus arbitration circuit.
3) In this case, the processor is normally the bus master unless it grants bus mastership to one
of the DMA controllers.
4) A DMA controller indicates that it needs to become the bus master by activating the BUS
request line, BR.
6) When the bus-request line is activated, the processor activates the bus-grant signal, BG1, indicating to the DMA controllers that they may use the bus when it becomes free.
4) When one or more devices request the bus, they assert the start-arbitration signal and place their 4-bit identification numbers on four lines, ARB3 through ARB0.
5) A winner is selected as a result of the interaction among the signals transmitted over these
lines by all contenders.
6) The arbitration lines are driven by open-collector drivers: if one device puts a 1 on a bus line while another puts a 0 on the same line, the line is pulled to the low-voltage state, which on these lines represents a logical 1. The pattern on the lines is therefore the logical OR of all the transmitted ID patterns.
Example:
1) Consider that two devices A and B having ID numbers 5 and 6 respectively are requesting
the use of bus.
2) Device A transmits the pattern 0101, and device B transmits the pattern 0110.
4) Each device compares the pattern on the arbitration lines with its own ID, starting from the most significant bit. If it detects a difference at any bit position, it disables its drivers at that bit position and for all lower-order bits, by placing 0 at the inputs of these drivers.
5) In our example device A detects the difference on the line ARB1; hence it disables its
drivers on lines ARB1 and ARB0. This causes the pattern on the arbitration lines to change to
0110, which means that device B has won the contention.
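The contention just described can be simulated directly; the function below models the wired-OR lines and the drop-out rule, and reproduces the A = 5, B = 6 example. It is a behavioral sketch, not a gate-level model.

```c
#include <assert.h>

enum { ARB_BITS = 4 };

/* Distributed arbitration: every contender drives its ID onto the
   open-collector ARB lines (wired OR); a device that sees a 1 on a
   line where its own bit is 0 disables its drivers for that bit and
   all lower-order bits.  Returns the winning pattern. */
unsigned arbitrate(const unsigned ids[], int n)
{
    unsigned mask[16];                 /* bits each device still drives */
    for (int i = 0; i < n; i++)
        mask[i] = (1u << ARB_BITS) - 1;

    for (int bit = ARB_BITS - 1; bit >= 0; bit--) {
        unsigned line = 0;             /* wired-OR of the driven bits */
        for (int i = 0; i < n; i++)
            line |= ids[i] & mask[i] & (1u << bit);
        for (int i = 0; i < n; i++)    /* devices that differ drop out */
            if (line != 0 && (ids[i] & (1u << bit)) == 0)
                mask[i] &= ~((1u << (bit + 1)) - 1);
    }

    unsigned pattern = 0;              /* what remains on the lines */
    for (int i = 0; i < n; i++)
        pattern |= ids[i] & mask[i];
    return pattern;
}
```

With IDs 5 (0101) and 6 (0110), device A drops its drivers at ARB1 and ARB0, leaving 0110 on the lines: device B wins, matching the worked example.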
Introduction: An ideal memory would be fast, large, and inexpensive. A very fast memory can be implemented using SRAM chips, but these chips are expensive because each basic cell has six transistors, which precludes packing a very large number of cells onto a single chip. Thus, for cost reasons, it is impractical to build a large memory using SRAM chips. The alternative is to use dynamic RAM chips, which have much simpler basic cells and thus are much less expensive. But such memories are significantly slower.
Although dynamic memory units in the range of hundreds of megabytes can be implemented
at a reasonable cost, the affordable size is still small compared to the demands of large
programs with voluminous data. A solution is provided by using secondary storage, mainly
magnetic disks, to implement large memory spaces. Very large disks are available at a
reasonable price, and they are used extensively in computer systems. However, they are much slower than semiconductor memory units. Thus, a huge amount of cost-effective storage can be provided by magnetic disks, and a large, yet affordable, main memory can be built with dynamic RAM technology. This leaves SRAMs to be used in smaller units where speed is of the essence, such as in cache memories.
Figure 4.13 Memory hierarchy
All of these different types of memory units are employed effectively in a computer. The
entire computer memory can be viewed as the hierarchy depicted in Figure 4.13. The fastest
access is to data held in processor registers. Therefore, if we consider the registers to be part
of the memory hierarchy, then the processor registers are at the top in terms of the speed of
access. Of course, the registers provide only a minuscule portion of the required memory.
At the next level of the hierarchy is a relatively small amount of memory that can be
implemented directly on the processor chip. This memory, called a processor cache, holds
copies of instructions and data stored in a much larger memory that is provided externally.
There are often two levels of caches. A primary cache is always located on the processor
chip. This cache is small because it competes for space on the processor chip, which must
implement many other functions.
The primary cache is referred to as the level 1 (L1) cache. A larger, secondary cache is placed between the primary cache and the rest of the memory. It is referred to as the level 2 (L2) cache. It is usually implemented using SRAM chips. It is possible to have both L1 and L2 caches on the processor chip.
The next level in the hierarchy is called the main memory. This rather large memory is
implemented using dynamic memory components, typically in the form of SIMMs, DIMMs,
or RIMMs. The main memory is much larger but significantly slower than the cache memory.
In a typical computer, the access time for the main memory is about ten times longer than the access time for the L1 cache.
Disk devices provide a huge amount of inexpensive storage. They are very slow compared to
the semiconductor devices used to implement the main memory. A hard disk drive (HDD;
also hard drive, hard disk, magnetic disk or disk drive) is a device for storing and retrieving
digital information, primarily computer data. It consists of one or more rigid (hence "hard")
rapidly rotating discs (often referred to as platters), coated with magnetic material and with
magnetic heads arranged to write data to the surfaces and read it from them. During program
execution, the speed of memory access is of utmost importance. The key to managing the
operation of the hierarchical memory system in Figure 4.13 is to bring the instructions and
data that will be used in the near future as close to the processor as possible. This can be done
by using the hardware mechanisms.
CACHE MEMORIES
• Many instructions in the localized areas of program are executed repeatedly during some
time period
1) Temporal - The recently executed instructions are likely to be executed again very soon.
2) Spatial - Instructions in close proximity to recently executed instruction are also likely to
be executed soon.
• If active segment of program is placed in cache-memory, then total execution time can be
reduced.
• This number of blocks is small compared to the total number of blocks available in main-
memory.
• Cache control hardware decides which block should be removed to create space for the new
block.
• The collection of rules for making this decision is called the replacement algorithm.
• The cache control-circuit determines whether the requested-word currently exists in the
cache.
There are two write protocols:
1) Write-Through Protocol - updates the cache location and the main-memory location simultaneously.
2) Write-Back Protocol - updates only the cache location and marks it with an associated flag bit, called the Dirty/Modified bit. The word in memory is updated later, when the marked block is removed from the cache.
During Read-operation
• If the requested word does not currently exist in the cache, a read miss occurs.
Load-Through Protocol - The block of words that contains the requested word is copied from the memory into the cache, and the requested word is forwarded to the processor as soon as it is read from the main memory, without waiting for the entire block to be loaded.
During Write-operation
• If the requested word does not exist in the cache, a write miss occurs.
1) If the write-through protocol is used, the information is written directly into the main memory.
2) If the write-back protocol is used, the block containing the addressed word is first brought into the cache, and then the desired word in the cache is overwritten with the new information.
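The difference between the two write policies can be modeled in miniature. The single-word "cache" and "memory" below are stand-ins for a cache block and its backing location, so this is only a sketch of the bookkeeping.

```c
#include <assert.h>

enum { WRITE_THROUGH, WRITE_BACK };

unsigned memory_word;   /* the word's location in main memory   */
unsigned cache_word;    /* the copy of the word in the cache    */
int dirty;              /* the Dirty/Modified bit for the block */

/* A write hit under either policy: the cache copy is always updated;
   write-through also updates memory immediately, while write-back
   only marks the block dirty. */
void write_hit(int policy, unsigned value)
{
    cache_word = value;
    if (policy == WRITE_THROUGH)
        memory_word = value;
    else
        dirty = 1;
}

/* Under write-back, memory is brought up to date only when the marked
   block is removed from the cache. */
void evict(void)
{
    if (dirty) {
        memory_word = cache_word;
        dirty = 0;
    }
}
```

Between a write-back write and the eviction, cache and memory deliberately disagree; the dirty bit records that the cache holds the authoritative copy.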
MAPPING-FUNCTION
DIRECT MAPPING
• The block-j of the main-memory maps onto block-j modulo-128 of the cache (Figure 8.16).
• Memory blocks 0, 128, and 256 all map to cache block 0; similarly, memory blocks 1, 129, and 257 map to cache block 1.
• When more than one memory block is mapped onto a given cache-block position, contention may arise; it is resolved by allowing the new block to overwrite the currently resident block.
The memory address is divided into three fields:
1) 4-bit word field - selects one of the 16 words in a block.
2) 7-bit cache-block field - determines the cache position in which the new block must be stored.
3) 5-bit tag field - the high-order 5 bits of the block’s memory address are stored in the tag bits associated with the cache location.
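The direct-mapping rule can be checked with a few lines of C. The 16-bit address split assumed here (5-bit tag, 7-bit block, 4-bit word) matches the field sizes given in the text for a 128-block cache with 16 words per block.

```c
#include <assert.h>

/* Assumed 16-bit address split: | 5-bit tag | 7-bit block | 4-bit word | */
unsigned word_field(unsigned addr)  { return addr & 0xFu; }
unsigned block_field(unsigned addr) { return (addr >> 4) & 0x7Fu; }
unsigned tag_field(unsigned addr)   { return (addr >> 11) & 0x1Fu; }

/* Direct mapping: memory block j goes to cache block j mod 128. */
unsigned cache_block_for(unsigned mem_block) { return mem_block % 128; }
```

Because the block field is just the low 7 bits of the block number, the mod-128 rule and the field extraction always agree.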
ASSOCIATIVE MAPPING
• The memory-block can be placed into any cache-block position. (Figure 8.17).
• A new block that has to be brought into the cache has to replace an existing block if the
cache is full.
SET-ASSOCIATIVE MAPPING
• The cache has 2 blocks per set, so memory blocks 0, 64, 128, …, 4032 map into cache set 0.
• A block can occupy either of the two block positions within its set.
6 bit set field - Determines which set of cache contains the desired block.
6 bit tag field -The tag field of the address is compared to the tags of the two blocks of the
set.
• A cache that contains 1 block per set is called direct-mapped.
• A cache that has k blocks per set is called a k-way set-associative cache.
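The two-way set-associative mapping above can likewise be expressed in a few lines. The split assumed here (6-bit tag, 6-bit set, 4-bit word) matches the 64-set example in the text; the 4-bit word field is an assumption carried over from the direct-mapping example.

```c
#include <assert.h>

/* Assumed 16-bit address split: | 6-bit tag | 6-bit set | 4-bit word | */
unsigned set_for(unsigned mem_block)    { return mem_block % 64; }
unsigned set_field(unsigned addr)       { return (addr >> 4) & 0x3Fu; }
unsigned sa_tag_field(unsigned addr)    { return (addr >> 10) & 0x3Fu; }
```

On a lookup, the set field selects one set, and the tag field is compared against the tags of both blocks in that set.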
• If a main-memory block is updated by another source (such as a DMA transfer) and the block also exists in the cache, the valid bit of the cache block is cleared to 0.
• If the processor and DMA end up using different copies of the same data, this is called the cache-coherence problem.
• Advantages:
1) The contention problem of direct mapping is eased by having a few choices for block placement.