Single Cycle Datapath Design
CMSC411/Computer Architecture
These slides and all associated material are
© 2003 by J. Six and are available only for
students enrolled in CMSC411.
Life... it's all about balls into boxes.
CMSC411 – Computer Architecture / © 2003 J. Six
Use and Distribution Notice
Possession of any of these files implies understanding and
agreement to this policy.
The slides are provided for the use of students enrolled in Jeff
Six's Computer Architecture class (CMSC 411) at the University of
Maryland Baltimore County. They are the creation of Mr. Six and
he reserves all rights as to the slides. These slides are not to be
modified or redistributed in any way. All of these slides may only
be used by students for the purpose of reviewing the material
covered in lecture. Any other use, including but not limited to, the
modification of any slides or the sale of any slides or material, in
whole or in part, is expressly prohibited.
Most of the material in these slides, including the examples, is
derived from Computer Organization and Design, Second Edition.
Credit is hereby given to the authors of this textbook for much of
the content. This content is used here for the purpose of
presenting this material in CMSC 411, which uses this textbook.
Moving to the Datapath
We have now seen the basics of computer
performance, instruction set architectures, and
computer arithmetic.
To work through the design of a microprocessor
datapath and control unit (the heart and soul of
a microprocessor), we will be designing an
implementation of a subset of the MIPS
instruction set…



Memory Reference: load word (lw) and store word
(sw)
Arithmetic Logic: add, sub, and, or, and slt.
Branching: branch equal (beq) and jump (j)
Clocking
For simplicity, our implementation will assume
an edge-triggered clock methodology – this
means that any values stored in the machine
are only updated on a clock edge.
State elements (registers and memory) update
their internal values on the clock edge –
therefore the combinatorial logic will take its
inputs from a set of state elements and store
its outputs into a set of state elements.
The inputs are values that were previously
written (in a previous clock cycle) and the
outputs are values that can be used in a later
clock cycle.
Combinational Logic, State
Elements, and the Clock
The reliance on state elements shows that the
combinatorial logic, the state elements, and the
clock are closely related.
In this example, all signals must propagate from
state element 1 through the combinatorial logic
into element 2 in the time of one clock cycle.
The time necessary for the signals to reach state
element 2 defines the minimal length of the
clock cycle.
[Figure: State element 1 feeds combinational logic, which feeds state element 2; the signals must make this trip within one clock cycle.]
Edge-Triggered Clocking
and Feedback
The edge-triggered clocking
methodology we have chosen allows us
to read the contents of a register, send
the value through some combinatorial
logic, and write the result back to that
same register, all in the same clock
cycle.
[Figure: a single state element whose output passes through combinational logic and feeds back to its own input.]
Our Datapath, Version 1:
A Single Cycle Implementation
We will begin by building a simple
implementation that uses a single, long, clock
cycle for each instruction.

Each instruction begins execution on one clock edge
and completes execution on the next clock edge.
This implementation is slower than a multicycle
implementation, which allows different instruction
classes to take different numbers of clock cycles, each
much shorter than our single long cycle. A multicycle
implementation is more realistic but requires more
complex control (we will build one later).
Starting on Our Way
We will first look at all of the major
components that are required to execute each
class of MIPS instruction.



The first element we need is a place to store
instructions. Here, we use an instruction memory
(a state element) that takes an address and returns
the instruction at that location.
The address of the current instruction also needs a
state element, the program counter (PC).
Finally, we need an adder to increment the PC to
the address of the next instruction (this can just be
an ALU with its control wired to always add).
The First Three Components
So, here’s what we need…
[Figure: a. Instruction memory (input: instruction address; output: instruction), b. Program counter (PC), c. Adder (output: Sum)]
The First Datapath Section
To execute any instruction, we fetch the
instruction from memory. We also need to
increment the PC to the next instruction, 4 bytes
later.
This section of the datapath is straightforward…
[Figure: the PC drives the Read address input of the instruction memory, which outputs the instruction; an adder computes PC + 4 and feeds it back to the PC.]
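The fetch step described on this slide can be sketched in Python. This is only an illustration: the names `fetch` and `imem` are assumptions (not from the slides), and a dictionary keyed by byte address stands in for the instruction memory.

```python
def fetch(imem, pc):
    """Return (instruction, next_pc) for the instruction at byte address pc.

    imem is a hypothetical instruction memory: a dict mapping byte
    addresses (multiples of 4) to 32-bit instruction words.
    """
    instruction = imem[pc]            # instruction memory read
    next_pc = (pc + 4) & 0xFFFFFFFF   # dedicated adder: PC + 4, wrapped to 32 bits
    return instruction, next_pc


# Two example words: add $t0, $t1, $t2 and sub $t3, $t0, $t1
imem = {0x0: 0x012A4020, 0x4: 0x01095822}
instr, pc = fetch(imem, 0x0)
```

Calling `fetch` again with the returned `pc` reads the next sequential instruction, which mirrors the PC being updated at the clock edge.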
R-Type Instructions
Now let’s look at R-type instructions – note
that these read two registers, perform an ALU
operation on the contents, and write the
result to a register.
The 32 registers are stored in a structure
called a register file. This is a collection of
registers in which any register can be read or
written by specifying the number of the
register in the file. This structure is said to
contain the register state of the machine.
In addition to the register file, we will need
an ALU to actually perform the operation.
The Register File
The R-type instructions have three register
operands – we need to read two registers and
write to one.


For each word we want to read, the register file needs
an input to specify the register number to read.
For each word we want to write, we need two inputs,
the register to write to and the data to be written.
The semantics are different for reads and
writes…


The register file always outputs the contents of whatever
register numbers are on the Read register inputs.
Writes are controlled by the write control signal, which
must be asserted for a write to occur at the clock edge.
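The read and write semantics above can be sketched as a small Python class. The class and method names are assumptions made for illustration; the sketch models only the behavior stated on this slide (unconditional reads, writes gated by the write control signal at the clock edge).

```python
class RegisterFile:
    """Sketch of a 32-entry register file: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Reads are unconditional: the outputs always reflect the
        # registers selected by the two Read register inputs.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, reg_write, write_reg, write_data):
        # A write occurs only if the write control signal is asserted
        # when the clock edge arrives.
        if reg_write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(reg_write=True, write_reg=9, write_data=7)
rf.clock_edge(reg_write=False, write_reg=10, write_data=99)  # ignored
```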
R-Type Datapath Components
So, here’s what we need. Note that the read
data outputs & the write data input of the
register file and the inputs to the ALU are 32
bits wide…
[Figure: a. Registers (5-bit inputs Read register 1, Read register 2, and Write register; Write data input; Read data 1 and Read data 2 outputs; RegWrite control), b. ALU (3-bit ALU control input; Zero and ALU result outputs)]
The R-Type Datapath Design
Here is the datapath for R-type
instructions…
[Figure: the instruction supplies Read register 1, Read register 2, and Write register; Read data 1 and Read data 2 feed the ALU (3-bit ALU operation); the ALU result returns to Write data; RegWrite controls the write.]
Memory Access Instructions
Now let’s look at memory access instructions, load and
store.
Recall these instructions have the form…
lw $t1, offset($t2) / sw $t1, offset($t2)
They compute a memory address by adding the base
register ($t2) to the 16-bit signed offset field in the
instruction.
If the instruction is a store, the value must be read
from the register file (in $t1). If it is a load, the value
read from memory must be written into the register
file (into $t1).
We need a sign extension unit to take the 16-bit field
from the instruction and use it as a 32-bit input into
the ALU.
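As a rough illustration of what the sign extension unit does, here is a Python sketch that treats values as unsigned 32-bit bit patterns. The function name `sign_extend16` and the example register value are assumptions.

```python
def sign_extend16(value):
    """Extend a 16-bit two's-complement field to a 32-bit bit pattern."""
    value &= 0xFFFF
    if value & 0x8000:            # sign bit set: fill the upper 16 bits with ones
        return value | 0xFFFF0000
    return value                  # sign bit clear: upper 16 bits stay zero


# Effective address for lw $t1, -4($t2), assuming $t2 holds 0x10000010.
base = 0x10000010
offset_field = 0xFFFC             # 16-bit encoding of -4
address = (base + sign_extend16(offset_field)) & 0xFFFFFFFF
```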
Memory Access Instructions
We also need a data memory – this will need
read and write control signals, an address input,
and an input for the data to be written.
So, we need…
[Figure: a. Data memory unit (Address and Write data inputs; Read data output; MemRead and MemWrite controls), b. Sign-extension unit (16-bit input, 32-bit output)]
The Memory Access Datapath
So the datapath for memory access
instructions looks like…
[Figure: the register file supplies Read data 1 to the ALU and Read data 2 to the data memory's Write data input; the 16-bit offset from the instruction is sign-extended to 32 bits and forms the second ALU input; the ALU result is the data memory Address; the memory's Read data returns to the register file's Write data input. Control signals: RegWrite, ALU operation (3 bits), MemRead, MemWrite.]
The Branch Instruction
The beq instruction has three operands, two
registers that are compared for equality and a
16-bit offset that is added to the branch
instruction address to compute the target.
Here, we have to compute the target by adding
that offset to the PC+4.
There are two details we need to keep in mind…


The base for the branch target is the address of the
instruction following the branch (PC+4). Since we
already compute this, we’re OK.
The offset field is shifted left by two bits so that it is
a word offset. We need a shift-left-two unit.
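The two details above can be combined into a short Python sketch of the branch target computation. The function names are assumptions, and values are modeled as unsigned 32-bit bit patterns.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Extend a 16-bit two's-complement field to a 32-bit bit pattern."""
    value &= 0xFFFF
    return value | 0xFFFF0000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """Branch target = (PC + 4) + (sign-extended offset shifted left by 2)."""
    pc_plus_4 = (pc + 4) & MASK32
    byte_offset = (sign_extend16(offset16) << 2) & MASK32  # shift-left-two unit
    return (pc_plus_4 + byte_offset) & MASK32
```

For example, a beq at address 0x100 with an offset field of 3 targets 0x104 + 12 = 0x110, while an offset field of 0xFFFF (that is, -1) targets the branch instruction itself.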
Branch Determination
In addition to this computation, the hardware
must compare the two registers and
determine the next PC.


If they are equal, the branch is taken and PC =
branch target address.
If not, the branch is not taken and PC=PC+4.
We can use our register file and ALU to make
this happen (we will subtract the registers
and look at the Zero output bit to see if they
are equal).
We will ignore that branch control logic for
now (don’t worry, we’ll design it later).
The Branch Datapath
Here is the branch datapath…
[Figure: the register file's Read data 1 and Read data 2 feed the ALU, whose Zero output goes to the branch control logic; the 16-bit offset is sign-extended to 32 bits, shifted left by 2, and added to PC + 4 (from the instruction fetch datapath) to produce the branch target.]
Combining These Datapaths
Let’s build a common datapath from the
portions we have constructed.
Our datapath is single cycle – no resource
can be used more than once per instruction.


Any resource needed more than once must be duplicated.
We need separate data and instruction memories.
Many elements can be shared between two
(or more) instruction classes – however, we
need to allow multiple connections to the
input of an element and have a control signal
select which input is connected.

We need multiplexors!
Combining Two Datapaths
Let’s combine the R-type and memory access
instruction datapaths.
There are two differences…


The second input to the ALU is a register (R-type)
or the sign-extended lower half of the instruction
(memory).
The value stored into a destination register comes
from either the ALU (R-type) or from memory (for
a load).
So we need to put multiplexors at these two
locations (we will once again ignore the
control signals and generate them later).
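A minimal Python sketch of the two multiplexors follows. The function names are assumptions; the select signals are called ALUSrc and MemtoReg, the names these control signals are given later in the slides, with the polarity listed in the control signal review.

```python
def mux2(select, input0, input1):
    """Two-input multiplexor: pass input1 when select is asserted, else input0."""
    return input1 if select else input0


def alu_second_operand(alu_src, read_data2, sign_extended_imm):
    # ALUSrc = 0: register operand (R-type).
    # ALUSrc = 1: sign-extended lower half of the instruction (memory access).
    return mux2(alu_src, read_data2, sign_extended_imm)


def register_write_data(mem_to_reg, alu_result, memory_read_data):
    # MemtoReg = 0: ALU result (R-type).
    # MemtoReg = 1: value read from data memory (load).
    return mux2(mem_to_reg, alu_result, memory_read_data)
```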
Combination of Two Datapaths
Here is a datapath formed by combining
the memory and R-type datapaths…
Adding the Instruction Fetch
We can add the instruction fetch datapath we
started with with almost no effort…
[Figure: the instruction fetch portion (PC, instruction memory, and the PC + 4 adder) joined to the combined R-type and memory datapath, with one multiplexor controlled by ALUSrc at the second ALU input and one controlled by MemtoReg at the register file's Write data input. Other control signals shown: RegWrite, ALU operation (3 bits), MemRead, MemWrite.]
Our Complete Datapath
Finally, we add in the branch datapath.
This requires one additional multiplexor that
is used to select the next PC.
[Figure: the complete datapath. A second adder sums PC + 4 and the sign-extended offset shifted left by 2; a multiplexor controlled by PCSrc selects between PC + 4 and that branch target as the next PC. Other control signals shown: RegWrite, ALUSrc, ALU operation (3 bits), MemRead, MemWrite, MemtoReg.]
ALU Control Signals
Now that we have designed a complete (yet
simple) datapath, we can design the control
unit – this will take inputs and generate write
signals for each state element and control
signals for each multiplexor.
The ALU control is somewhat different than
the main control, so we will design it first.
Recall our ALU has three control inputs…
ALU Control   Function
000           AND
001           OR
010           add
110           subtract
111           set on less than
ALU Operations
Depending on what class of instruction we
are executing, we need the ALU to
perform one of these five functions…



Memory instructions require the ALU to
compute the address by addition.
R-type instructions require one of the
five operations, depending on the value of
the 6-bit funct field in the instruction.
Branch equal instructions require subtraction.
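The five ALU functions and their 3-bit control codes can be sketched behaviorally in Python. This is an illustrative model only: the function names are assumptions, and slt is modeled as a signed comparison, as in MIPS.

```python
MASK32 = 0xFFFFFFFF


def to_signed(word):
    """Interpret a 32-bit bit pattern as a two's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word


def alu(control, a, b):
    """Return (result, zero) for the 3-bit ALU control codes in the slides."""
    if control == 0b000:
        result = a & b                                      # AND
    elif control == 0b001:
        result = a | b                                      # OR
    elif control == 0b010:
        result = (a + b) & MASK32                           # add
    elif control == 0b110:
        result = (a - b) & MASK32                           # subtract
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0    # set on less than
    else:
        raise ValueError(f"unused ALU control code: {control:03b}")
    return result, result == 0
```

The second value returned models the Zero output, which beq uses after a subtraction to test the two registers for equality.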
The ALU Control Unit
We can generate the 3-bit ALU control signal by
creating a small control unit whose inputs are the
function field of the instruction and a new 2-bit
control field called ALUOp.
ALUOp indicates whether the operation to be
performed should be an addition (ALUOp = 00)
for memory instructions, subtraction (ALUOp =
01) for beq, or determined by the operation
encoded in the function field (10).

We will see how the main control unit generates
ALUOp in a little while.
The ALU control unit will generate the 3-bit
signal that directly controls the ALU.
Multiple Levels of Decoding
Notice that we are using multiple levels of
instruction decoding.

The main control unit generates ALUOp, which is then
used by the ALU control unit to generate control
signals for the ALU.
This is a common practice and can reduce the
size of the main control unit.
Several smaller control units may also be faster
(sometimes much faster) than one big control
unit that generates all of the control signals
itself.
ALU Control Signals
Here is a mapping that shows each instruction, its
funct field, the ALUOp signal, the desired ALU
operation, and the ALU control signals.
Instruction   funct    ALUOp   Desired Op         ALU Control
LW            XXXXXX   00      add                010
SW            XXXXXX   00      add                010
BEQ           XXXXXX   01      subtract           110
R-type/add    100000   10      add                010
R-type/sub    100010   10      subtract           110
R-type/and    100100   10      and                000
R-type/or     100101   10      or                 001
R-type/slt    101010   10      set on less than   111
Designing our ALU Control Unit
So our ALU control unit needs to take in
ALUOp and funct and produce the ALU
control signals.
Let’s look at a truth table for this…
ALUOp           Funct                ALU Control
Bit 1   Bit 0   F5 F4 F3 F2 F1 F0    Signals
0       0       X  X  X  X  X  X     010
X       1       X  X  X  X  X  X     110
1       X       X  X  0  0  0  0     010
1       X       X  X  0  0  1  0     110
1       X       X  X  0  1  0  0     000
1       X       X  X  0  1  0  1     001
1       X       X  X  1  0  1  0     111
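The truth table can be transcribed almost directly into Python. The sketch below is one possible reading of it; the function name `alu_control` is an assumption. It honors the don't-care entries by looking only at the ALUOp bits and at funct bits F3 to F0.

```python
def alu_control(alu_op, funct):
    """Map the 2-bit ALUOp and 6-bit funct field to the 3-bit ALU control."""
    alu_op1 = (alu_op >> 1) & 1
    alu_op0 = alu_op & 1

    if alu_op1 == 0 and alu_op0 == 0:
        return 0b010                  # row 1: lw/sw, add
    if alu_op0 == 1:
        return 0b110                  # row 2: beq, subtract (ALUOp bit 1 is X)

    # Remaining rows: ALUOp bit 1 = 1, so decode F3..F0 (F5 and F4 are X).
    r_type_rows = {
        0b0000: 0b010,                # add
        0b0010: 0b110,                # subtract
        0b0100: 0b000,                # AND
        0b0101: 0b001,                # OR
        0b1010: 0b111,                # set on less than
    }
    low4 = funct & 0b1111
    if low4 not in r_type_rows:
        raise ValueError(f"funct not covered by the truth table: {funct:06b}")
    return r_type_rows[low4]
```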
ALU Control Unit Design
Notice that we use don’t care cases as
liberally as possible – this leads to a
more optimized design that uses fewer
gates and runs faster.
Going from the truth table to actual
logic is very mechanical so we’ll skip
that.
Main Control Unit Design
Now that we have designed the ALU control
unit, we can return to the main control unit.
To understand how this unit needs to function
and how to connect the various fields of the
instruction to the datapath, let’s review the
instruction classes…
R-type:      op = 0 [31-26] | rs [25-21] | rt [20-16] | rd [15-11] | shamt [10-6] | funct [5-0]
Load/Store:  op = 35 or 43 [31-26] | rs [25-21] | rt [20-16] | address [15-0]
Branch:      op = 4 [31-26] | rs [25-21] | rt [20-16] | address [15-0]
Instruction Formats
We can make some observations…





The opcode is always in bits 31-26. We will refer to
this field as Op[5-0].
The two registers to be read are rs and rt, at
positions 25-21 and 20-16. This is true for R-type,
beq, and store instructions.
The base register for load and store instructions is rs,
at positions 25-21.
The 16-bit offset for beq, load, and store instructions
is always in positions 15-0.
The destination register is in one of two places…
  - For a load, it is in positions 20-16 (rt).
  - For R-type, it is in positions 15-11 (rd).
  - We need a multiplexor for the destination register.
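These observations translate into simple shift-and-mask field extraction. The Python sketch below is illustrative; the names `decode` and `write_register` are assumptions, and `reg_dst` plays the role of the select input of the destination register multiplexor.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into the fields used by the datapath."""
    return {
        "op":    (instr >> 26) & 0x3F,   # bits 31-26
        "rs":    (instr >> 21) & 0x1F,   # bits 25-21
        "rt":    (instr >> 16) & 0x1F,   # bits 20-16
        "rd":    (instr >> 11) & 0x1F,   # bits 15-11
        "shamt": (instr >> 6) & 0x1F,    # bits 10-6
        "funct": instr & 0x3F,           # bits 5-0
        "imm16": instr & 0xFFFF,         # bits 15-0
    }


def write_register(fields, reg_dst):
    # Destination register multiplexor: rd for R-type, rt for a load.
    return fields["rd"] if reg_dst else fields["rt"]


lw_fields = decode(0x8D490008)    # lw $t1, 8($t2)
add_fields = decode(0x012A4020)   # add $t0, $t1, $t2
```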
Integrating into the Datapath
We can integrate these observations and our
ALU control unit into the datapath…
[Figure: the datapath with instruction fields labeled. Instruction [25-21] and Instruction [20-16] drive Read register 1 and Read register 2; a multiplexor controlled by RegDst selects Instruction [20-16] or Instruction [15-11] as Write register; Instruction [15-0] feeds the sign-extend unit; Instruction [5-0] and ALUOp feed the ALU control unit. Control signals shown: PCSrc, RegWrite, RegDst, ALUSrc, MemWrite, MemRead, MemtoReg.]
Control Signal Review
Looking at this design, we have eight control
signals. We already know what ALUOp does so
let’s review what the other control signals do…
RegDst
  Deasserted: Destination register number for the write comes from the rt field.
  Asserted: Destination register number for the write comes from the rd field.
RegWrite
  Deasserted: None.
  Asserted: Register specified is written with the value on the Write data input.
ALUSrc
  Deasserted: Second ALU operand comes from the second register file output.
  Asserted: Second ALU operand is the sign-extended lower 16 bits of the instruction.
PCSrc
  Deasserted: PC = output of the adder that computes PC+4.
  Asserted: PC = output of the adder that computes the branch target.
MemRead
  Deasserted: None.
  Asserted: Data memory contents at the specified address are put on the Read data output.
MemWrite
  Deasserted: None.
  Asserted: Data memory contents at the specified address are replaced with the value on the Write data input.
MemToReg
  Deasserted: Value fed to the register Write data input comes from the ALU.
  Asserted: Value fed to the register Write data input comes from the data memory.
Generating the Control Signals
So how are these signals set?
Well, the control unit can set all but one
based solely on the opcode field of the
instruction.
The PCSrc control signal is different…


It should be set if the instruction is beq and
the Zero output of the ALU is true.
We can solve this by having the control unit
generate a Branch signal and then ANDing
that with the Zero output from the ALU to
produce the PCSrc control signal.
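The AND gate and the PC multiplexor it drives can be sketched in a few lines of Python. The function name `next_pc` and its parameter names are assumptions.

```python
def next_pc(pc_plus_4, branch_target, branch, zero):
    """Select the next PC from the Branch control signal and the ALU Zero output."""
    pc_src = bool(branch) and bool(zero)     # AND gate producing PCSrc
    return branch_target if pc_src else pc_plus_4
```

Only a beq (Branch asserted) whose operands are equal (Zero asserted) redirects the PC; every other combination falls through to PC+4.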
Datapath With Control Unit
We can add the control unit to our datapath…
[Figure: the same datapath with the main control unit added. Instruction [31-26] feeds Control, which generates RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; Branch is combined with the ALU's Zero output to form PCSrc.]
Tracing Instructions
To illustrate how our datapath works, let’s
trace some instructions through.
Beginning with an R-type instruction, we
will highlight what components of the
datapath are active at each of the four
major steps involved in processing an R-type instruction.
Remember that although we are doing
this in steps, this is a single clock cycle
datapath, so all of this happens in a single
clock cycle.
R-Type: Step 1
Step 1 – Fetch instruction from memory and increment the PC.
[Figure: the datapath with control unit; the components active in this step are highlighted.]
R-Type: Step 2
Step 2 – Read source registers from the register file.
[Figure: the datapath with control unit; the components active in this step are highlighted.]
R-Type: Step 3
Step 3 – ALU operating on the register data operands.
[Figure: the datapath with control unit; the components active in this step are highlighted.]
R-Type: Step 4
Step 4 – Writing the result back to a register.
[Figure: the datapath with control unit; the components active in this step are highlighted.]
Tracing Memory Instructions
Looking at a load instruction such as
lw $t1, offset($t2), we can see five steps…





Instruction is fetched from memory and the PC is
incremented.
A register ($t2) is read from the register file.
The ALU computes the sum of that value and the
sign-extended, lower 16 bits of the instruction
(offset).
The sum from the ALU is used as the address for the
data memory.
The data from the memory is written into the
register file – the destination is given by bits 20-16
of the instruction ($t1).
Load Instruction:
All Steps in One
Showing the active components in the datapath for lw…
[Figure: the datapath with control unit; the components active for lw are highlighted.]
Tracing Branch Instructions
Looking at a branch instruction such as
beq $t1,$t2,offset, we can see four steps…




Instruction is fetched from memory and the PC is
incremented.
Two registers ($t1 and $t2) are read from the register file.
The ALU subtracts the data values. PC+4 is added to
the sign-extended lower 16 bits of the instruction
(offset) shifted left by two; the result is the target of
the branch instruction.
The Zero result from the ALU is then used to decide
which result to store into the PC.
Branch Instruction:
All Steps in One
Showing the active components in the datapath for beq…
[Figure: the datapath with control unit; the components active for beq are highlighted.]
Designing the Control Logic
Now that we understand how the control unit
interacts with the datapath, we can design
that logic.
This is simple combinatorial design, using the
opcode (all six bits) as the input and producing
each of the control signals as outputs.
We can build the control function in a nice
table format that shows the inputs and
outputs for each type of instruction (R-type,
lw, sw, and beq).
Once this table has been derived, the
implementation is trivial. So, let’s do it…
The Control Function
Input/Output   Signal Name   R-Type   lw   sw   beq
Inputs         Op5           0        1    1    0
               Op4           0        0    0    0
               Op3           0        0    1    0
               Op2           0        0    0    1
               Op1           0        1    1    0
               Op0           0        1    1    0
Outputs        RegDst        1        0    X    X
               ALUSrc        0        1    1    0
               MemToReg      0        1    X    X
               RegWrite      1        1    0    0
               MemRead       0        1    0    0
               MemWrite      0        0    1    0
               Branch        0        0    0    1
               ALUOp1        1        0    0    0
               ALUOp0        0        0    0    1
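One way to see that the implementation follows directly from the table is to transcribe it as a lookup in Python. This is a behavioral sketch, not gate-level logic; the names `CONTROL_TABLE` and `main_control` are assumptions, and `None` stands for a don't-care (X) entry.

```python
# Keys are opcodes: 0 = R-type, 35 = lw, 43 = sw, 4 = beq.
CONTROL_TABLE = {
    0:  dict(RegDst=1, ALUSrc=0, MemToReg=0, RegWrite=1,
             MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    35: dict(RegDst=0, ALUSrc=1, MemToReg=1, RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    43: dict(RegDst=None, ALUSrc=1, MemToReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    4:  dict(RegDst=None, ALUSrc=0, MemToReg=None, RegWrite=0,
             MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}


def main_control(opcode):
    """Return the control signal settings for a 6-bit opcode."""
    if opcode not in CONTROL_TABLE:
        raise ValueError(f"opcode not in the implemented subset: {opcode}")
    return CONTROL_TABLE[opcode]
```

In this sketch the two ALUOp bits are packed into one value, so `0b10` corresponds to ALUOp1 = 1 and ALUOp0 = 0.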
One More Instruction Type
We have not added support for jump
instructions to our datapath.
Recall that a jump instruction looks a lot like a
branch but is not conditional and computes the
target PC differently…



The low-order 2 bits are always zero (like beq).
The next lower 26 bits come from the 26-bit
immediate field in the instruction.
The upper 4 bits come from the upper four bits of the
PC+4.
Recall the jump instruction looks like…
op = 2 [31-26] | address [25-0]
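The three pieces of the jump target can be assembled as in the following Python sketch. The function name `jump_target` and the example addresses are assumptions.

```python
def jump_target(pc, instr):
    """Build the 32-bit jump target from PC+4 and the 26-bit address field."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    address26 = instr & 0x03FFFFFF       # bits 25-0 of the instruction
    upper4 = pc_plus_4 & 0xF0000000      # bits 31-28 of PC+4
    return upper4 | (address26 << 2)     # low-order 2 bits are zero


# A jump (opcode 2) with address field 0x100000, fetched from 0x40000000.
target = jump_target(0x40000000, (2 << 26) | 0x100000)
```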
The New Datapath
Here is the datapath with support for the jump instruction…
[Figure: the datapath extended for jump. Instruction [25-0] is shifted left by 2 to form a 28-bit value, which is combined with PC+4 [31-28] to give Jump address [31-0]; an additional multiplexor controlled by the new Jump control signal selects the next PC.]
Performance Problems
with Single Cycle Datapaths
We now have a complete single cycle datapath
that will correctly implement the MIPS
instruction set (well, the subset we have chosen
to implement).
Why is this approach not used?



The clock cycle is determined by the longest path
through the datapath.
This should be a load instruction, as it uses five functional units in
series: instruction memory, register file, ALU, data memory, and register file again.
Several other instruction classes could run in a lot less
time than this!
So yes, everything runs in one clock cycle – but
that cycle must be long enough for the longest
instruction to complete!
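To make the argument concrete, here is a small Python sketch that computes the required cycle time as the longest instruction path. The unit delays are hypothetical values chosen only for illustration (they do not come from these slides), and the names are assumptions.

```python
# Hypothetical functional unit delays in nanoseconds (illustrative only).
UNIT_DELAY_NS = {"imem": 2, "reg_read": 1, "alu": 2, "dmem": 2, "reg_write": 1}

# Functional units each instruction class passes through, in order.
PATHS = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "lw":     ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw":     ["imem", "reg_read", "alu", "dmem"],
    "beq":    ["imem", "reg_read", "alu"],
    "j":      ["imem"],
}


def instruction_delay(name):
    """Total delay along the path used by one instruction class."""
    return sum(UNIT_DELAY_NS[unit] for unit in PATHS[name])


# The single clock cycle must cover the slowest instruction class.
cycle_time_ns = max(instruction_delay(name) for name in PATHS)
slowest = max(PATHS, key=instruction_delay)
```

Under these assumed delays the load sets the cycle time, and a beq that needs only part of that time still occupies the full cycle.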
Moving to a
Multicycle Datapath
In addition to the problem with inefficient
performance, the single cycle datapath suffers
from the fact that each functional unit can only
be used once in each clock cycle.
This is why we needed two separate memory
units (among other less-than-optimal design
characteristics).
So, while it is nice and simple, the single cycle
datapath is not used. We can solve this
problem by having a much shorter clock cycle
and having each instruction use multiple cycles.
That is a multicycle datapath and that’s next.
