
Yi 2009

This document describes the design of an instruction fetch module for a 32-bit RISC CPU based on the MIPS instruction set. It analyzes the MIPS instruction formats, instruction data path, decoder module functions, and design theory. It then details the design of an instruction fetch module, including its main functions of fetching instructions, address arithmetic, instruction validity checking, and synchronous control. The module is implemented using pipelining and simulated successfully on QuartusII.


2009 International Joint Conference on Artificial Intelligence

32-bit RISC CPU Based on MIPS Instruction Fetch Module Design

Kui YI, Yue-Hua DING

Department of Computer Science and Information Engineering, Wuhan Polytechnic University
Wuhan, Hubei Province 430023, China
Email: [email protected], [email protected]

Abstract- In this paper, we analyze the MIPS instruction format, the instruction data path, the decoder module function and the design theory based on the RISC CPU instruction set. Furthermore, we design the instruction fetch (IF) module of a 32-bit CPU based on the RISC CPU instruction set. The IF module mainly comprises an instruction fetch and latch module, an address arithmetic module, an instruction validity check module and a synchronous control module. The functions of the IF module are implemented with a pipeline and simulated successfully on Quartus II.

Keywords- MIPS, Data Flow, Data Path, Pipeline

I. INTRODUCTION

Because memory was expensive in the early days, instruction set designers made instructions more complex in order to reduce program length. This tendency produced a traditional design style known as the "Complex Instruction Set Computer" (CISC) structure. However, the great disparity among instructions and their low generality make the instructions difficult to implement and slow to execute. Compared with CISC, a RISC CPU has several advantages, such as higher speed, a simpler structure and easier implementation. RISC CPUs are used extensively in embedded systems, so developing a CPU with a RISC structure is a natural choice.

II. MIPS INSTRUCTION SET

A. MIPS Processor
The full name of MIPS is Microprocessor without Interlocked Pipeline Stages. Another, informal, expansion is Millions of Instructions Per Second. MIPS has become a synonym for the MIPS instruction set and the MIPS instruction set architecture [1].

B. MIPS Instruction Set
The ISA (Instruction Set Architecture) of a processor is composed of the instruction set and the corresponding registers. Programs written for the same ISA can run on any processor that implements that instruction set. The MIPS instruction set has developed from the 32-bit MIPS I to the 64-bit MIPS III and MIPS IV since it was created. To ensure backward compatibility, every generation of the MIPS instruction set adds new instructions to the previous one without discarding any old instruction, so a MIPS processor with the 64-bit instruction set can execute 32-bit instructions.
All MIPS instructions have a fixed length of 32 bits, and instruction addresses are word-aligned. MIPS divides instructions into three formats: immediate format (I-Format), register format (R-Format) and jump format (J-Format) [2]. The three instruction formats are shown in Figure 1. The meaning of each instruction field is as follows:
OP: 6-bit operation code;
rs: 5-bit source register;
rt: 5-bit temporary (source/destination) register number or branch condition;
immediate: 16-bit immediate, branch instruction offset or address offset;
destination: 26-bit destination address of unconditional jump;
rd: 5-bit destination register number;
shamt: 5-bit shift offset;
funct: 6-bit function field;

Figure. 1 MIPS Instruction Format.

MIPS instruction decoding and execution are highly efficient because there are only three formats, all of fixed length. The compiler can combine several simple MIPS instructions to accomplish a complicated operation [3].

III. DATA FLOW

Data flow is determined by the hardware data path, which expresses how data moves through the processor. There is no clear boundary between data and control. Data usually include the operation code, operands, memory addresses and values, register addresses and values, and jump destination addresses and contents, whereas control consists of unit control signals, timing control signals and interrupt control signals. These signals are not always defined clearly and strictly.
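The field list in Section II.B can be illustrated with a short Python sketch that slices a 32-bit instruction word into the named fields. It assumes the standard MIPS32 bit positions (OP in bits 31..26, rs in 25..21, rt in 20..16, rd in 15..11, shamt in 10..6, funct in 5..0), since Figure 1 is not reproduced here. The function name `decode_fields` and the sample word are illustrative and not taken from the paper's design.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the fields named in Section II.B.

    Assumption: standard MIPS32 bit positions. All fields are returned;
    which ones are meaningful depends on the format (R, I or J).
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,         # 6-bit operation code
        "rs": (word >> 21) & 0x1F,         # 5-bit source register
        "rt": (word >> 16) & 0x1F,         # 5-bit source/destination register
        "rd": (word >> 11) & 0x1F,         # 5-bit destination register (R-Format)
        "shamt": (word >> 6) & 0x1F,       # 5-bit shift offset (R-Format)
        "funct": word & 0x3F,              # 6-bit function field (R-Format)
        "immediate": word & 0xFFFF,        # 16-bit immediate (I-Format)
        "destination": word & 0x03FFFFFF,  # 26-bit jump target (J-Format)
    }


# Standard MIPS32 encoding of "add $1, $2, $3" (op=0, funct=0x20).
ADD_WORD = (0 << 26) | (2 << 21) | (3 << 16) | (1 << 11) | (0 << 6) | 0x20
fields = decode_fields(ADD_WORD)
```

Because every format has the same length and the OP field is always in the same place, a decoder can extract all fields in parallel and choose the relevant ones once OP is known.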

978-0-7695-3615-6/09 $25.00 © 2009 IEEE
DOI 10.1109/JCAI.2009.158
A. R-Format Data Path
In the R-Format data path, the instruction is fetched from memory and split into its fields. The two registers specified by the instruction are read from the register file, and the ALU executes the operation. Finally, the ALU result is written to the register file. Figure 2 shows the R-Format data path.
For example, ADD R1,R2,R3 is the add signed word instruction (R1 = R2 + R3). Its data flow is as follows: the ADD R1,R2,R3 instruction is fetched from memory at the address held in the PC. The instruction first accesses the two registers R2 and R3, and their values are sent to the ALU. After the arithmetic completes, the ALU writes the result back to register R1, which ends the data flow.

Figure. 2 R-Format Instruction Data Path

As another example, SRL R1,R2,R3 is the shift word right logical instruction. Its data flow is as follows: the SRL R1,R2,R3 instruction is fetched from memory at the address held in the PC. The instruction first accesses the two registers R2 and R3, and their values are sent to the ALU. After the operation completes, the ALU writes the result back to register R1, which ends the data flow.

B. RI-Format Data Path
An RI-Format instruction is similar to an R-Format instruction [4]. The difference is that the second read register of the R-Format instruction is replaced by the immediate of the RI-Format instruction. The immediate is a 32-bit signed number extended from a 20-bit number, and it is sent to the ALU as the second operand. Finally, the result is written back to the register file. Figure 3 shows the RI-Format data path.

Figure. 3 RI-Format Instruction Data Path

The RI-Format includes the ADDI R1, R2, data6 instruction, the SUBI R1, R2, data6 instruction, etc.
When the ADDI R1, R2, data6 instruction executes, it is fetched from memory at the address held in the PC, and the value of register R2 is sent to the ALU. At the same time, the immediate data6 is extended to a 32-bit signed number and sent to the ALU. Finally, after the ALU adds the two operands, it writes the result back to register R1. The only difference between the data flow of SUBI R1, R2, data6 and that of ADDI R1, R2, data6 is that the former performs a subtraction.

C. Load Word Data Path
The load word data path is similar to the RI-Format data path. The difference is that in the load word data path the ALU result is used as a memory address, whereas in the RI-Format data path the ALU result is written directly to a register. In the load word data path, data is fetched from memory and loaded into the register file. The load word data path is shown in Figure 4.
LW R1, R2, data6 is the only instruction that uses the load word data path. It works as follows: the LW R1, R2, data6 instruction is fetched from memory at the address held in the PC. R1 is the register to be loaded. First, the value of register R2 is sent to the ALU; at the same time, the immediate data6 is extended to 32 bits and sent to the ALU. The sum of the two numbers is the memory address. The content of that memory address is then copied to register R1.

Figure. 4 Load Word Data Path

D. Memory Word Data Path
The memory word data path is similar to the load word data path, but the target of the write is memory rather than the register file.
SW R1, R2, data6 is the only instruction that uses the memory word data path. The SW R1, R2, data6 instruction is fetched from memory at the address held in the PC. Register R1 holds the data to be stored. First, the value of register R2 is sent to the ALU; at the same time, the immediate data6 is extended to 32 bits and sent to the ALU. The sum of the two numbers is the memory address. The memory instruction data path is shown in Figure 5.
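The address calculation shared by the load word and memory word data paths (register value plus sign-extended immediate) can be sketched in Python as follows. This is a behavioural illustration under stated assumptions, not the authors' hardware: the immediate is taken to be 16 bits wide as in the field list of Section II.B, memory is modelled as a word-addressed dictionary, and the helper names `sign_extend`, `effective_address`, `load_word` and `store_word` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Sign-extend a `bits`-wide value to a 32-bit two's-complement pattern."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return ((value ^ sign_bit) - sign_bit) & MASK32


def effective_address(base, imm, imm_bits=16):
    """ALU step: base register value plus sign-extended immediate (mod 2**32)."""
    return (base + sign_extend(imm, imm_bits)) & MASK32


def load_word(regs, memory, r_dst, r_base, imm):
    """LW r_dst, r_base, imm: copy the addressed memory word into r_dst."""
    regs[r_dst] = memory[effective_address(regs[r_base], imm)]


def store_word(regs, memory, r_src, r_base, imm):
    """SW r_src, r_base, imm: write r_src to the addressed memory word."""
    memory[effective_address(regs[r_base], imm)] = regs[r_src]


regs = {1: 0, 2: 0x100}
memory = {0xFC: 0xDEADBEEF}
load_word(regs, memory, 1, 2, 0xFFFC)   # immediate 0xFFFC encodes -4
store_word(regs, memory, 1, 2, 8)       # stores R1 at 0x100 + 8
```

The two instructions differ only in the final step: LW reads memory and writes the register file, while SW reads the register file and writes memory.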
Figure. 5 Memory Instruction Data Path

E. Register Jump Data Path
In the register jump data path, one register is compared with 0. When the instruction is a jump-if-zero instruction and the register value is 0, the value of the second register is loaded into the program counter. When the instruction is a jump-if-zero instruction and the register value is not 0, the next program counter value is loaded and execution continues in sequence. The jump-if-not-zero instruction works analogously. Figure 6 shows the jump instruction data path.

Figure. 6 Jump Instruction Data Path

There are two register jump instructions: BZ R1, R2 and BNZ R1, R2.
BZ R1, R2 means "jump if equal to the constant 0". The BZ R1, R2 instruction is fetched from memory at the address held in the program counter, and the instruction accesses registers R1 and R2. The values of the two registers are then sent to the branch unit, which checks whether the value of R1 equals 0. If the value of R1 equals 0, the value of register R2 is sent to the program counter. If the value of R1 does not equal 0, the PC is incremented by 1 and the program continues executing in sequence.
BNZ R1, R2 means "jump if not equal to the constant 0". The BNZ R1, R2 instruction is fetched from memory at the address held in the program counter, and the instruction accesses registers R1 and R2. The values of the two registers are then sent to the branch unit, which checks whether the value of R1 equals 0. If the value of R1 does not equal 0, the value of register R2 is sent to the program counter. If the value of R1 equals 0, the PC is incremented by 1 and the program continues executing in sequence.

IV. PIPELINE DESIGN
Pipeline decomposition increases instruction throughput. The clock cycle is determined by the running time of the slowest stage. In general, a pipeline includes five stages: instruction fetch (IF), instruction decode (ID), execution (EXE), memory/IO (MEM) and write-back (WB).

A. Instruction Fetch (IF)
The instruction fetch (IF) stage requests the instruction, which is fetched from memory. The main components of the IF stage are shown in Figure 7. The instruction and the PC are held in the IF/ID pipeline register as temporary storage for the next clock cycle.

Figure. 7 IF Stage

The IF stage depends mainly on the current value of the program counter (PC). The CPU fetches the instruction from ROM based on the PC value, and the PC is incremented by 1 automatically. Finally, all of this information is sent to the IF/ID pipeline register for the decoder.

B. Instruction Decoder (ID)
The ID stage sends control commands to the other units of the processor based on the decoded instruction. Figure 8 shows the structure of the ID stage. The instruction is sent to the control unit and decoded there. The read registers fetch data from the register file. The branch unit is also included in the ID stage.
The input of the ID stage comes from the IF stage. The ID stage decodes the instruction into control signals and prepares the operands. For example, if the instruction is an I-Format instruction, the immediate is extended to 32 bits and the register file is accessed. If the instruction is a J-Format instruction, the EXE stage follows after the branch unit has completed its processing.

Figure. 8 ID Stage
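The decision made by the branch unit for the BZ and BNZ instructions of Section III.E can be sketched in Python as below. This is only a behavioural model under stated assumptions: the function name `branch_unit` and the use of mnemonic strings as opcodes are illustrative, and the PC is word-indexed so that sequential execution is PC + 1, as the text describes.

```python
MASK32 = 0xFFFFFFFF


def branch_unit(opcode, r1_value, r2_value, pc):
    """Return the next PC for a register jump instruction.

    opcode   -- "BZ" (jump if R1 == 0) or "BNZ" (jump if R1 != 0)
    r1_value -- value compared with the constant 0
    r2_value -- jump target, loaded into the PC when the branch is taken
    pc       -- current program counter (word-indexed)
    """
    if opcode not in ("BZ", "BNZ"):
        raise ValueError("not a register jump instruction: %r" % (opcode,))
    is_zero = (r1_value & MASK32) == 0
    taken = is_zero if opcode == "BZ" else not is_zero
    # Taken: PC <- R2. Not taken: PC <- PC + 1 (sequential execution).
    return (r2_value & MASK32) if taken else (pc + 1) & MASK32
```

For instance, `branch_unit("BZ", 0, 0x40, 7)` selects the target 0x40, whereas `branch_unit("BZ", 5, 0x40, 7)` falls through to 8.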
C. Execution (EXE)
The EXE stage executes arithmetic. The main component of the EXE stage is the ALU, which is composed of an arithmetic logic unit and a shift register. Figure 9 shows the structure of the EXE stage. The function of the EXE stage is to perform the operation specified by the instruction, such as addition or subtraction. The ALU sends its result to the EX/MEM pipeline register before the instruction enters the MEM stage.

Figure. 9 EXE Stage

D. Memory and IO (MEM)
The function of the MEM stage is to fetch data from memory and store data to memory. Another function is to input data to the processor and output data from it. If the instruction is neither a memory instruction nor an IO instruction, the result is passed on to the WB stage. The structure of the MEM stage is shown in Figure 10.

Figure. 10 MEM Stage

Storing data is the main task after a result has been calculated. Some results do not have to be stored in RAM and can be written to a register directly. For example, some temporary variables are not kept in RAM because doing so would lower execution efficiency. However, some data must be stored in RAM. In the MEM stage, data is stored in RAM or in a register depending on demand. A copy of the data is kept in the MEM/WB pipeline register.

E. Write-Back (WB)
The WB stage is in charge of writing results, stored data and input data to the register file. The purpose of the WB stage is to write data to the destination register. For example, the ADD R1, R2, R3 instruction keeps its result in register R1 so that the program runs faster. Figure 11 shows the WB stage.

Figure. 11 WB Stage

V. INSTRUCTION FETCH STAGE DESIGN

A. Function Statement
The functions of the instruction fetch (IF) stage are as follows:
1) Fetch instruction and latch. Fetch the instruction from the instruction register according to the PC value and send it to the IF/ID pipeline register to be latched.
2) Address arithmetic. Based on the value of sel[3..0] in pcselector, select the next value of the PC from four address sources: incPC, branchPC, retiPC and retPC.
If the instruction in the WB stage of the pipeline is a jump instruction or a taken branch instruction, select the branchPC value, so that the destination address of the jump is the result of the address arithmetic;
If the instruction is not a jump instruction, or is a branch instruction that is not taken, the PC is incremented by 1 automatically and points to the next instruction in the instruction register;
If the instruction is an interrupt-return instruction, select the retiPC value;
If the instruction is a subprogram return instruction, select the retPC value.
3) Check validity of instruction. Check the validity of the operation code and function code against the definition of the instruction set. If the instruction is invalid, an exception is raised.
4) Synchronous control. Use CLK to synchronize the external signals.

B. Module and Implementation
The IF stage includes five modules: incPC, lpm_rom0, progc, pcselector and ifid. Figure 12 shows how the modules are connected.
Their functions are as follows:
incPC: increments the PC by 1 automatically, so that the PC points to the address of the next instruction;
lpm_rom0: stores the application program;
progc: program counter;
pcselector: controls the selection of the next instruction;
ifid: pipeline latch.
Every module is described in VHDL. The input signals of the IF stage are branchPC, retPC, retiPC, sel, clk, ifid_flush, ifid_enable and pc_enable. Their functions are as follows:
branchPC: jump address of branch signal
retPC: subprogram return address signal
retiPC: interrupt return address signal
sel: selection signal from pcselector in EXE stage
clk: clock signal
ifid_flush: data signal
ifid_enable, pc_enable: control signals
The output signals of the IF stage are ins[31..0], pcvelue[31..0], insOut[31..0] and pcout[31..0]. Their functions are as follows:
ins[31..0]: instruction code fetched from the instruction register;
pcvelue[31..0]: PC value in the IF stage;
insOut[31..0]: instruction code that comes from the pipeline register ifid and is to be sent to the next stage;
pcout[31..0]: program counter value.
The modules are implemented as follows:
1) pcselector module. The input ports are nextpc[31..0], branchpc[31..0], retpc[31..0], retipc[31..0] and sel[3..0]. The output port is newpc[31..0]. The module selects one of four sources as the next instruction address, as determined by sel[3..0]. The four sources are nextpc[31..0], branchpc[31..0], retpc[31..0] and retipc[31..0].
The input signals are nextPC, branchPC, retPC, retiPC and sel. The output signal is newPC. The functions of the input signals are as follows:
nextPC: next instruction address;
branchPC: address of branch jump signal;
retPC: subprogram return address signal;
retiPC: interrupt return address signal;
sel: selector signal.
The timing simulation waveform of pcselector is shown in Figure 13. Different addresses are applied to the nextpc, branchpc, retpc and retipc ports, and newpc outputs one of the four input signals depending on the value of sel[3..0].

Figure. 13 pcselector Stage Simulation Waveform

2) progc module. The input ports are pcin[31..0], clk and enable. The output port is pcout[31..0]. The function of the module is to communicate with the instruction memory. On the positive clock edge, the value of the address bus pcin[31..0] is sent to the instruction memory and the next instruction is fetched from ins[31..0], which is the output of the instruction memory. The instruction is sent out on the negative clock edge. Figure 14 shows the simulation waveform of the progc module.

Figure. 14 progc Module Simulation Waveform

3) incPC module. The input port is pcin[31..0] and the output port is pcout[31..0]. The function of the incPC module is to add 1 to the PC; the new PC acts as one of the candidate values. On the negative clock edge, the PC value is sent to the pcselector module. Figure 15 shows the entity structure and RTL structure of the incPC module. Figure 16 shows the simulation waveform of the incPC module. The waveform shows that 1 is added to the pcIn value and the result is sent to pcVal.

Figure. 15 incPC Entity Structure and RTL Structure

Figure. 16 incPC Module Simulation Waveform

4) lpm_rom0 module. The input ports are address[5..0] and inclock. The output port is q[31..0]. The function of the module is to store the machine code of the program. It accesses the memory location specified by the address bus address[5..0], fetches the next instruction from memory and sends the instruction out on the instruction bus q[31..0].
The lpm_rom0 module can be implemented in the EAB (embedded array block) of the FPGA by calling a macro function module. The lpm_rom structure in the macro function library is adopted to realize the module. The parameters are configured so that the address bus address is 6 bits wide and the output bus q is 32 bits wide. The lpm_rom0 module operates as follows: on the positive edge of inclock, address[5..0] is latched and the data at the location given by address[5..0] is output on port q[31..0]. The data in lpm_rom0 is set up with a memory initialization file (.mif), or edited, updated and reloaded during debugging with the system memory editor tool.
5) ifid module. The input ports are pcin[31..0], insin[31..0], clk, id_flush and ifid_enable. The output ports are pcout[31..0] and insout[31..0]. The function of ifid is to latch the PC and the instruction of Stage 1 and send them to the next stage.
The timing simulation waveform of the ifid module is shown in Figure 17. When ifid_enable is high and id_flush is low, there is no data dependence in the pipeline: on the positive edge of clk, insOut and pcOut take the values of insIn and pcIn respectively. When ifid_enable and id_flush are both high, there is a data dependence in the pipeline: on the positive edge of clk, insOut changes to "0000H" while pcOut keeps its original value. After the pipeline conflict has passed, insOut and pcOut return to their normal working state. If ifid_enable is low, the pipeline stops and insOut and pcOut keep their original values.

Figure. 17 ifid Module Simulation Waveform

VI. CONCLUSION
In this research, we adopt a top-down design method and use VHDL to describe the system. We first design the system at the top level and then refine the design step by step. The structure and hierarchy of the design are very clear, and the design is easy to edit and debug. The instruction fetch (IF) stage design is simulated, synthesized and routed in Quartus II 4.3. The results indicate that the IF stage performs its intended functions.

REFERENCES
[1] Bai-ZhongYing, Computer Organization, Science Press, 2000.11.
[2] Wang-AiYing, Organization and Structure of Computer, Tsinghua University Press, 2006.
[3] Wang-YuanZhen, IBM-PC Macro Asm Program, Huazhong University of Science and Technology Press, 1996.9.
[4] MIPS Technologies, Inc., MIPS32™ Architecture For Programmers Volume II: The MIPS32™ Instruction Set, June 9, 2003.
[5] Zheng-WeiMin, Tang-ZhiZhong, Computer System Structure (The second edition), Tsinghua University Press, 2006.
[6] Pan-Song, Huang-JiYe, SOPC Technology Utility Tutorial, Tsinghua University Press, 2006.
[7] MIPS32 4K™ Processor Core Family Software User's Manual, MIPS Technologies Inc.
[8] Mo-JianKun, Gao-JianSheng, Computer Organization, Huazhong University of Science and Technology Press, 1996.
[9] Zhang-XiuJuan, Chen-XinHua, EDA Design and emulation Practice [M]. BeiJing, Engine Industry Press, 2003.
[10] "IEEE Standard of Binary Floating-Point Arithmetic", IEEE Standard 754, IEEE Computer Society, 1985.
[11] Yi-Kui, Ding-YueHua, Application of AMCC S5933 Controller in PCI BUS, DCABES2007, 2007.7.
Figure. 12 IF Circuit Diagram.
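The behaviour of two of the modules in the IF circuit, the pcselector multiplexer and the ifid pipeline latch, can be sketched in Python as a reading aid. This is a behavioural model under stated assumptions, not the authors' VHDL. The paper does not give the encoding of sel[3..0], so the one-hot constants below are hypothetical, as is the fall-back to nextpc for any other value. The latch follows the waveform description in Section V.B: normal load, flush (instruction cleared, PC held) and stall.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical one-hot encoding of sel[3..0]; the paper does not specify it.
SEL_NEXT, SEL_BRANCH, SEL_RET, SEL_RETI = 0b0001, 0b0010, 0b0100, 0b1000


def pcselector(nextpc, branchpc, retpc, retipc, sel):
    """Select newpc from the four address sources according to sel."""
    sources = {
        SEL_NEXT: nextpc,      # sequential execution (PC + 1 from incPC)
        SEL_BRANCH: branchpc,  # jump or taken branch
        SEL_RET: retpc,        # subprogram return
        SEL_RETI: retipc,      # interrupt return
    }
    # Assumption: any other sel value falls back to sequential execution.
    return sources.get(sel, nextpc) & MASK32


class IfId:
    """Behavioural model of the ifid pipeline latch."""

    def __init__(self):
        self.ins_out = 0
        self.pc_out = 0

    def clock(self, ins_in, pc_in, ifid_enable, id_flush):
        """Apply one positive clk edge."""
        if not ifid_enable:
            return                      # stall: both outputs hold
        if id_flush:
            self.ins_out = 0            # flush: instruction cleared, PC held
        else:
            self.ins_out = ins_in & MASK32
            self.pc_out = pc_in & MASK32


latch = IfId()
latch.clock(ins_in=0x12345678, pc_in=4, ifid_enable=True, id_flush=False)
loaded = (latch.ins_out, latch.pc_out)
latch.clock(ins_in=0xAAAAAAAA, pc_in=5, ifid_enable=True, id_flush=True)
flushed = (latch.ins_out, latch.pc_out)
latch.clock(ins_in=0xBBBBBBBB, pc_in=6, ifid_enable=False, id_flush=False)
stalled = (latch.ins_out, latch.pc_out)
```

In each cycle, the value returned by `pcselector` would be fed to progc as the next fetch address, while the `IfId` object carries the fetched instruction and its PC to the ID stage.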
