Processor Decoding
Overview
This lecture begins the build-out of a single-cycle RISC-V processor in Digital (Lab 10 / Project 06). After reviewing the Lab 09 manual-testing workflow, we focus on the three decoder sub-circuits that turn a raw 32-bit instruction word (IW) into something the datapath can act on: the address path from the PC into instruction ROM, the register decoder that extracts rs1/rs2/rd, and the immediate decoder that extracts and sign-extends the per-format immediate values. The recurring theme is that decoding is just wiring: splitters pull bit ranges out of the instruction word, mergers reassemble them in the right order, and a small amount of logic performs sign extension. Building these decoders is the prerequisite for next session's instruction decoder that produces the control signals.
Learning Objectives
- Trace the path from the 64-bit PC through address conversion into 32-bit instruction ROM
- Distinguish byte addresses from word addresses and convert between them three different ways
- Explain why a single splitter (bits 9–2 of the PC) replaces a full divide-and-truncate circuit
- Extract the rs1, rs2, and rd register fields from an instruction word using a splitter (the Register Decoder)
- Identify the bit layout of each RISC-V immediate format (I, S, B, U, J) and why S-type is "scrambled"
- Reassemble an immediate from its scattered bit fields using mergers and sign-extend it to 64 bits
- Design a sign extender from a single MUX driven by the immediate's sign bit
- Combine all five immediate formats into one unified Immediate Decoder selected by an immSel control input
Prerequisites
- RISC-V instruction formats: R, I, S, B, U, J and their bit fields (Lab03, Project04)
- Two's complement representation and sign extension (Project03)
- Bit manipulation: shifts, masks, splitting and merging bit ranges (Lab: Bit Manipulation, Project04)
- Digital components: splitters, mergers, multiplexers, ROM, registers (Lab05–Lab09)
- ROM-based instruction memory and word/byte addressing (Project05)
- The single-cycle processor component list: PC, ROM, RegFile, ALU, MUX (Lab10)
1. From Lab 09 to Lab 10: A Processor Driven by the Instruction Word
Lab 09 ended with hands-on testing of individual sequential and combinational components. The big jump in Lab 10 / Project 06 is that we stop driving the components by hand and instead let the instruction word control everything. Every clock cycle the processor:
- Fetches the 32-bit instruction word at the address in the PC,
- Decodes that word into register numbers, an immediate value, and control signals,
- Executes by routing the right data through the ALU (and later memory),
- Writes back a result and advances the PC.
flowchart LR
PC[PC register] --> IM[Instruction Memory ROM]
IM -->|IW 32 bits| DEC[Decoders]
DEC --> RF[Register File]
DEC --> IMM[Immediate]
DEC --> CTL[Control signals]
RF --> ALU[ALU]
IMM --> ALU
CTL --> ALU
ALU --> WB[Write back to RegFile]
WB --> PC
style DEC fill:#f9f,stroke:#333,stroke-width:2px
style IMM fill:#f9f,stroke:#333,stroke-width:2px
Today's focus is the highlighted decode stage. The core components we are wiring up are:
| Component | Role |
|---|---|
| PC | 64-bit register holding the byte address of the current instruction |
| Instruction Memory (ROM) | Holds 32-bit machine-code instructions; addressed by word |
| Register Decoder | Splits IW into rs1, rs2, rd (5 bits each) |
| Immediate Decoder | Extracts and sign-extends the immediate for each format |
| Register File | Reads two registers, writes one, per clock |
| ALU | Performs add/sub/mul/shift; also computes target addresses |
| MUX | Switches an input's source depending on the instruction |
Key idea (the MUX). The whole reason a single circuit can run different instructions is the multiplexer. For an R-type add, the ALU's B input comes from a register; for an I-type addi, the same B input comes from the immediate. A MUX selects between them based on a control signal. Decoding produces both the data (register numbers, immediate) and the control signals that steer these MUXes.
A note on the lab structure: instruction memory in the early Lab 10 circuit is a placeholder — you can manually poke values into the ROM and manually set the register-read inputs to verify the datapath before the decoders and control are fully wired. Lab 10 has you build two processors: one that extends the Lab 09 design with manual control, and one that adds the real instruction decoder, register decoder, and immediate decoder so small programs run automatically.
2. The Instruction Fetch Path: PC to ROM
The first decoding problem is purely about addressing. The PC is a 64-bit register, but the instruction ROM in this lab has only 8 address bits, holding 2^8 = 256 instructions. We need to convert the wide PC value into the narrow ROM address.
PC ROM
+------------+ 64 +-----+ 8 +--------------------+ 32
| 64-bit |---/----| ??? |---/----| 32-bit elements |---/--- IW
| Register | +-----+ | 8 address bits |
+------------+ (convert) | 2^8 = 256 entries |
byte addr word addr +--------------------+
The ROM is organized as an array of 256 32-bit words, indexed 0 through 255.
The "???" box is the address-conversion logic. Its job is to take a 64-bit byte address and produce an 8-bit word address (or word index) for the ROM.
Why a register holds the PC, not a counter. The PC is a plain 64-bit register so that branch and jump instructions can load an arbitrary computed address into it. On a normal cycle it advances by 4 (one instruction), but it must also be able to jump anywhere.
3. Byte Addresses vs. Word Addresses
This is the single most important idea on the fetch path, and it carries over directly from Project05. All addresses in registers are byte addresses — the smallest addressable unit of memory is one byte. But each RISC-V instruction is 4 bytes (32 bits), so consecutive instructions live at byte addresses 0, 4, 8, 12, … while consecutive ROM entries (words) are indexed 0, 1, 2, 3, ….
The relationship is simply division by 4 (equivalently, a right shift by 2): addr_word = addr_byte / 4 = addr_byte >> 2.
The diagram below lines up byte addresses against word addresses. Four consecutive bytes form one word:
byte addr word addr
17 | |
16 | |
15 | | --+
14 | | |
13 | | +-- word 3 (bytes 12,13,14,15)
12 | | --+
11 | | --+
10 | | |
9 | | +-- word 2 (bytes 8,9,10,11)
8 | | --+
7 | | --+
6 | | |
5 | | +-- word 1 (bytes 4,5,6,7)
4 | | --+
3 | | --+
2 | | |
1 | | +-- word 0 (bytes 0,1,2,3)
0 | | --+
So byte address 12 is word index 3 because 12 / 4 = 3, and byte address 8 is word index 2 because 8 / 4 = 2. The lower two bits of a byte address (the byte index within a word) are exactly what >> 2 throws away.
| Byte address (decimal) | Byte address (binary) | >> 2 | Word index |
|---|---|---|---|
| 0 | ...000000 | drop low 2 bits | 0 |
| 4 | ...000100 | drop low 2 bits | 1 |
| 8 | ...001000 | drop low 2 bits | 2 |
| 12 | ...001100 | drop low 2 bits | 3 |
| 16 | ...010000 | drop low 2 bits | 4 |
Because instructions are always 4-byte aligned, the bottom two bits of every instruction's byte address are 00, which is why dividing by 4 is exact and never loses information.
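The table above can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the function name `byte_to_word` is made up for illustration and is not part of the lab materials.

```python
def byte_to_word(byte_addr: int) -> tuple[int, int]:
    """Split a byte address into (word index, byte offset within the word)."""
    word_index = byte_addr >> 2      # same result as byte_addr // 4
    byte_offset = byte_addr & 0b11   # the two low bits that >> 2 discards
    return word_index, byte_offset


# Aligned instruction addresses always have a byte offset of 0.
for addr in (0, 4, 8, 12, 16):
    word, offset = byte_to_word(addr)
    print(f"byte {addr:2d} -> word {word}, offset {offset}")
```

An unaligned address such as 13 still maps to word 3, but with a nonzero offset of 1, which is exactly the information the shift throws away.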
4. Three Ways to Convert the Address
The instructor sketched three implementations of the "???" box, from most literal to most elegant. All three feed an 8-bit word address into the ROM's A input. The reason we keep only 8 bits is that the ROM has 256 entries — the higher address bits are always zero for our small programs, so we splice off bits 7–0 after the shift.
Approach 1: Divide, then take the low 8 bits
Use an explicit divider that computes PC / 4, then a splitter to keep bits 7–0 of the 64-bit quotient.
+----+ 64 +-----+ 64 +------+ 8 +-----+
| PC |---/--| DIV |---/--| 7-0 |--/--| ROM |--- IW
+----+ +-----+ |split | | A |
4 -^ +------+ +-----+
This works but a full divider is overkill — we are only ever dividing by a power of two.
Approach 2: Barrel shift right by 2, then take the low 8 bits
Replace the divider with a barrel shifter doing a logical right shift by 2. Shifting right by 2 is exactly divide-by-4 for these aligned addresses.
+----+ 64 +-------------+ 64 +------+ 8 +-----+
| PC |---/--| Barrel Shift|---/--| 7-0 |--/--| ROM |--- IW
+----+ | Right | |split | | A |
2 -^ + +------+ +-----+
Cheaper than a divider, but still more hardware than necessary.
Approach 3 (preferred): A single splitter: take bits 9–2
The cleanest solution drops the arithmetic entirely. Right-shifting by 2 and then keeping the bottom 8 bits is identical to simply selecting bits 9–2 of the PC directly with one splitter:
+----+ 64 +-------+ 8 +-----+
| PC |---/--| 9-2 |--/--| ROM |--- IW
+----+ |split | | A |
+-------+ +-----+
Why bits 9–2? Shifting right by 2 deletes bits 1–0 (the byte index, marked XX below). The next eight bits up, PC bits 9 down to 2, become the 8-bit word address:
addr bits: 10 9 8 7 6 5 4 3 2 1 0
[ | | | | | | | | XX XX] <- bits 1-0 dropped (byte index)
\________ 8-bit word addr _______/
(PC bits 9..2)
The two crossed-out bits are the byte offset within a word. The eight bits above them (PC 9..2) are the 8-bit ROM word address. No divider, no shifter — just a splitter that wires PC[9:2] to ROM[7:0].
This is the same trick used in Project05: word addressing is "free" if you select the right bit range, because addressing by word is shifting the byte address right by log2(bytes-per-word).
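To convince yourself that the three approaches really produce the same ROM address, here is a small Python model. It is a sketch, not the Digital circuit: `bits` is a hypothetical helper that plays the role of a splitter, and the three `rom_addr_*` names are ours.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Model a splitter: return value[hi:lo] as an unsigned integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def rom_addr_divide(pc: int) -> int:
    """Approach 1: divide by 4, then keep bits 7-0 of the quotient."""
    return bits(pc // 4, 7, 0)


def rom_addr_shift(pc: int) -> int:
    """Approach 2: logical right shift by 2, then keep bits 7-0."""
    return bits(pc >> 2, 7, 0)


def rom_addr_split(pc: int) -> int:
    """Approach 3: select PC bits 9-2 directly with one splitter."""
    return bits(pc, 9, 2)


# Check every aligned PC value in a range larger than the 256-entry ROM.
for pc in range(0, 4096, 4):
    assert rom_addr_divide(pc) == rom_addr_shift(pc) == rom_addr_split(pc)

print(rom_addr_split(0x2C))  # 11
```

Note that all three versions wrap around once the PC passes byte address 1023, because only 8 address bits reach the ROM.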
5. Multi-ROM Instruction Memory
For a real program we may have more than 256 instructions, or we may want to select among several test programs. The lab packages instruction memory as a sub-circuit ("Instruction Memory") that hides several ROMs behind one address interface plus a program-select (PROG / PN) input.
PROG (program number)
|
+---------------------- Instruction Memory ----------------------+
| PN |
+----+ | +----------+ |
| PC |-+--| ADDR | +-----+ |
+----+ | | | | ROM |--+ |
9-2 | | | +---| ROM |--+--> MUX --- IW |
(split)| | | +---| ROM |--+ (select |
| | |--------| | by PROG) |
| +----------+ +-----+ |
+---------------------------------------------------------------+
- The shared ADDR input is the 8-bit word address from the PC splitter (Section 4).
- The PN/PROG input picks which ROM (which program) supplies the instruction word.
- The selected ROM's 32-bit output is the instruction word IW that drives the rest of the processor.
During simulation you press play, choose the PROG value to pick a program, then toggle the processor's EN input to begin execution. Each program gets its own ROM image, generated by the provided makerom3.py script.
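The behavior of the Instruction Memory sub-circuit can be sketched in Python as a list of ROMs behind one shared address, with PROG choosing the MUX input. The names `make_rom` and `instruction_memory` are hypothetical, and this does not reproduce the file format that makerom3.py generates; the instruction words are standard encodings of the small programs used elsewhere in these notes.

```python
ROM_WORDS = 256  # 8 address bits


def make_rom(words: list[int]) -> list[int]:
    """Pad a program's instruction words out to a 256-entry ROM image."""
    if len(words) > ROM_WORDS:
        raise ValueError("program does not fit in a 256-entry ROM")
    return list(words) + [0] * (ROM_WORDS - len(words))


def instruction_memory(roms: list[list[int]], prog: int, pc: int) -> int:
    """Return the instruction word IW for byte address `pc` of program `prog`."""
    addr = (pc >> 2) & 0xFF                  # PC[9:2], the shared ADDR input
    outputs = [rom[addr] for rom in roms]    # every ROM sees the same address
    return outputs[prog]                     # PROG selects one MUX input


roms = [
    make_rom([0x00100513, 0x00200593, 0x00B50633]),  # li a0,1; li a1,2; add a2,a0,a1
    make_rom([0x007302B3]),                          # add x5, x6, x7
]

print(hex(instruction_memory(roms, prog=0, pc=4)))  # 0x200593
print(hex(instruction_memory(roms, prog=1, pc=0)))  # 0x7302b3
```

Reading every ROM and then selecting mirrors the hardware, where all ROMs are always driven and the MUX simply picks one output.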
6. The Register Decoder (RegDecoder)
Once we have the 32-bit instruction word, the Register Decoder pulls out the three 5-bit register fields. RISC-V places these fields in fixed positions across the R-, I-, S-, and B-type formats, which is exactly what makes a single splitter sufficient.
+-------------+ 5
IW 32 | rs1 |--/-- ReadReg0 (RR0)
----/--| rs2 |--/-- ReadReg1 (RR1)
| rd |--/-- WriteReg (WR)
+-------------+ 5
RegDecoder 5
The field positions (from the RISC-V encoding) are:
| Field | Bits | Meaning |
|---|---|---|
| rd | 11–7 | destination register (where the result is written) |
| rs1 | 19–15 | first source register |
| rs2 | 24–20 | second source register |
So the decoder is one splitter configured to break IW into the relevant ranges:
+------------------+ 11-7 -> rd
IW 32 | |
----/------>| splitter | 19-15 -> rs1
| |
+------------------+ 24-20 -> rs2
These outputs wire straight into the Register File:
- rs1 → ReadReg0 (RR0), producing RD0
- rs2 → ReadReg1 (RR1), producing RD1
- rd → WriteReg (WR), the write destination
Worked example
Consider add t0, t1, t2, which uses the ABI names for add x5, x6, x7. With write-enable (WriteEn / RFW) high, the register file reads x6 and x7, the ALU adds them, and the result is written to x5.
add x5, x6, x7 => rd=5 (x5), rs1=6 (x6), rs2=7 (x7)
Register File:
ReadReg0 = 6 -> RD0 = value of x6 ----+
ReadReg1 = 7 -> RD1 = value of x7 ----+--> ALU add --> WriteData
WriteReg = 5 -> result written to x5 <--/ (when WriteEn / RFW = 1)
Notice the destination field is the same bit range (11–7) for every format that has a destination, and the source-register fields never move. That regularity is a deliberate RISC-V design choice: the decoder can wire all three fields unconditionally, and the instruction decoder later decides (via RFW) whether the write actually happens.
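The worked example can be checked in Python with the actual encoding of add x5, x6, x7 (0x007302B3). This is a minimal sketch: `bits` stands in for the splitter and `reg_decoder` is a hypothetical name for the sub-circuit.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Model a splitter: return value[hi:lo] as an unsigned integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def reg_decoder(iw: int) -> dict[str, int]:
    """Extract the three 5-bit register fields from a 32-bit instruction word."""
    return {
        "rs1": bits(iw, 19, 15),  # -> ReadReg0 (RR0)
        "rs2": bits(iw, 24, 20),  # -> ReadReg1 (RR1)
        "rd": bits(iw, 11, 7),    # -> WriteReg (WR)
    }


ADD_X5_X6_X7 = 0x007302B3  # add x5, x6, x7 (add t0, t1, t2)
print(reg_decoder(ADD_X5_X6_X7))  # {'rs1': 6, 'rs2': 7, 'rd': 5}
```

The decoder does not look at the opcode at all. For formats without one of these fields, the corresponding output is simply meaningless bits that the control signals cause the datapath to ignore.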
7. RISC-V Immediate Formats: A Quick Reference
Immediates are harder than register numbers because RISC-V scatters immediate bits across the instruction word, and the scattering differs per format. This was a deliberate trade-off in the ISA: the source-register bits (rs1, rs2) and the sign bit (bit 31) stay in fixed positions across formats so the hardware can start reading registers and sign-extending before it finishes decoding the opcode. The price is that the rest of the immediate bits are permuted, and the decoder must reassemble them.
R-type | funct7 | rs2 | rs1 |fn3| rd | opcode | (no immediate)
I-type | imm[11:0] | rs1 |fn3| rd | opcode |
S-type | imm[11:5] | rs2 | rs1 |fn3|imm[4:0]| opcode |
B-type |imm[12|10:5]| rs2 | rs1 |fn3|imm[4:1|11]| opcode |
U-type | imm[31:12] | rd | opcode |
J-type | imm[20|10:1|11|19:12] | rd | opcode |
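How each format's immediate is built. All five are sign-extended to 64 bits; the U-type immediate first lands in bits 31–12 with the low 12 bits zeroed, and on RV64 that 32-bit value is then sign-extended like the others: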
| Format | Used by | Immediate assembled from |
|---|---|---|
| I | addi, lw, ld, jalr | bits 31–20 = imm[11:0] |
| S | sb, sw, sd | bits 31–25 = imm[11:5], bits 11–7 = imm[4:0] |
| B | beq, bne, blt, bge | bit 31 = imm[12], bit 7 = imm[11], bits 30–25 = imm[10:5], bits 11–8 = imm[4:1], bit 0 = 0 |
| U | lui, auipc | bits 31–12 = imm[31:12], low 12 bits = 0 |
| J | jal | bit 31 = imm[20], bits 19–12 = imm[19:12], bit 20 = imm[11], bits 30–21 = imm[10:1], bit 0 = 0 |
Why I- and J-type matter first. The first programs we run use only addi (I-type, since li is a pseudo-instruction for addi) and jal/jalr (J- and I-type). So the early Immediate Decoder really only needs imm-I and imm-J. The S-, B-, and U-type immediates come online when we add stores and conditional branches.
8. The Immediate Decoder (ImmDecoder)
The Immediate Decoder takes the 32-bit instruction word and produces one 64-bit, sign-extended output per immediate format:
+-------------------+
IW 32 | IW imm_I |--/64
----/--| imm_S |--/64
| imm_B |--/64
| imm_U |--/64
| imm_J |--/64
+-------------------+
ImmDecoder
Each output is computed by a small sub-circuit that (1) splits the relevant bits out of IW, (2) merges them into the correct immediate bit order, and (3) sign-extends to 64 bits. Building five separate full decoders is wasteful when only one immediate is ever needed at a time, so the recommended design is a unified decoder with an immSel input that selects which format to produce on a single shared imm output:
immSel (3 bits)
|
+-------------------+----------+
IW 32 | I --+ |
----/--| S --| |
| B --| MUX --- imm --/64
| U --| |
| J --+ |
+------------------------------+
ImmDecoder (unified)
Inside, all five immediate-construction circuits still exist, but their results feed a 5-input MUX. The 3-bit immSel (supplied later by the instruction decoder) chooses which one reaches the imm output. This is the same "compute all, select one" pattern used inside the ALU, and it keeps the rest of the datapath simpler: there is a single 64-bit immediate wire instead of five.
In the top-level processor, imm connects to the ALUSrcB MUX — the multiplexer that decides whether the ALU's B operand comes from a register (RD1) or from the immediate.
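The unified decoder's "compute all, select one" structure can be modeled in Python as follows. This is a sketch under assumptions: the immSel numbering (0 = I through 4 = J) is invented here for illustration and need not match the encoding your instruction decoder uses, and `bits`, `sign_extend`, and `to_signed` are hypothetical helpers.

```python
MASK64 = (1 << 64) - 1


def bits(value: int, hi: int, lo: int) -> int:
    """Model a splitter: return value[hi:lo] as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value: int, width: int) -> int:
    """Sign-extend a `width`-bit value to 64 bits (result is a 64-bit pattern)."""
    sign = (value >> (width - 1)) & 1
    upper = (MASK64 >> width) if sign else 0  # all-ones or all-zeros fill
    return (upper << width) | value


# Hypothetical immSel encoding, chosen only for this sketch.
IMM_I, IMM_S, IMM_B, IMM_U, IMM_J = range(5)


def imm_decoder(iw: int, imm_sel: int) -> int:
    """Build all five immediates, then let imm_sel pick one (the MUX)."""
    imm_i = sign_extend(bits(iw, 31, 20), 12)
    imm_s = sign_extend(bits(iw, 31, 25) << 5 | bits(iw, 11, 7), 12)
    imm_b = sign_extend(bits(iw, 31, 31) << 12 | bits(iw, 7, 7) << 11
                        | bits(iw, 30, 25) << 5 | bits(iw, 11, 8) << 1, 13)
    imm_u = sign_extend(bits(iw, 31, 12) << 12, 32)
    imm_j = sign_extend(bits(iw, 31, 31) << 20 | bits(iw, 19, 12) << 12
                        | bits(iw, 20, 20) << 11 | bits(iw, 30, 21) << 1, 21)
    return (imm_i, imm_s, imm_b, imm_u, imm_j)[imm_sel]


def to_signed(value: int) -> int:
    """Interpret a 64-bit pattern as a signed Python integer."""
    return value - (1 << 64) if value >> 63 else value


print(to_signed(imm_decoder(0x00100513, IMM_I)))  # addi a0, x0, 1  -> 1
print(to_signed(imm_decoder(0x00A13423, IMM_S)))  # sd a0, 8(sp)    -> 8
print(to_signed(imm_decoder(0x00000463, IMM_B)))  # beq x0, x0, 8   -> 8
print(hex(imm_decoder(0x12345537, IMM_U)))        # lui a0, 0x12345 -> 0x12345000
print(to_signed(imm_decoder(0xFFDFF06F, IMM_J)))  # jal x0, -4      -> -4
```

The B and J immediates are 13 and 21 bits wide before extension because their implicit low zero bit is part of the value.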
9. Constructing the S-Type Immediate (Worked Example)
The S-type immediate is the clearest illustration of "scrambling." Stores like sd a0, 8(sp) need a 12-bit signed offset, but the encoding splits it into two pieces so that rs1/rs2 stay in their fixed positions.
The two pieces inside the instruction word are bits 31–25, which hold imm[11:5], and bits 11–7, which hold imm[4:0].
Goal: construct a 64-bit signed value. We extract both pieces, merge them in the correct order to form the 12-bit immediate (high part on top, low part on the bottom), then sign-extend the 12-bit value to 64 bits:
(imm[4:0])
IW 32 bits 11-7 ----------+ 4-0
----| | 12 12 64 64
| (split) +--> MERGE ---/--> SignEx ---/---> imm_S
| bits 31-25 ------+ 11-5 (12-bit) (64-bit)
(imm[11:5])
Step by step:
- Split bits 11–7 out of IW → these become imm[4:0].
- Split bits 31–25 out of IW → these become imm[11:5].
- Merge the two into a single 12-bit value with imm[11:5] in the high position and imm[4:0] in the low position.
- Sign-extend the 12-bit two's-complement value to a 64-bit value.
The result is the byte offset for the store, ready to feed into the ALU (address = rs1 + imm_S). The I-type immediate is simpler — it is one contiguous field (bits 31–20) — so it skips the merge step and goes straight from split to sign-extend. B- and J-type need additional shuffling because their lowest bit is an implicit 0 (branch/jump targets are 2-byte aligned).
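The split, merge, and sign-extend steps for the S-type immediate can be traced in Python. This is a sketch: `split` and `merge` are hypothetical stand-ins for Digital's splitter and merger, the result is returned as an ordinary signed Python integer for readability, and the second encoding (sd a0, -8(sp)) was worked out by hand from the S-type layout.

```python
def split(iw: int, hi: int, lo: int) -> int:
    """Model a splitter: return iw[hi:lo] as an unsigned integer."""
    return (iw >> lo) & ((1 << (hi - lo + 1)) - 1)


def merge(*fields: tuple[int, int]) -> int:
    """Model a merger: concatenate (value, width) pairs, highest field first."""
    result = 0
    for value, width in fields:
        result = (result << width) | (value & ((1 << width) - 1))
    return result


def imm_s(iw: int) -> int:
    """Return the S-type immediate of `iw` as a signed integer."""
    low = split(iw, 11, 7)              # imm[4:0]
    high = split(iw, 31, 25)            # imm[11:5]
    imm12 = merge((high, 7), (low, 5))  # high part on top, low part below
    # Sign-extend: if bit 11 is set, the 12-bit pattern means imm12 - 2**12.
    return imm12 - (1 << 12) if imm12 & 0x800 else imm12


print(imm_s(0x00A13423))  # sd a0, 8(sp)  -> 8
print(imm_s(0xFEA13C23))  # sd a0, -8(sp) -> -8
```

Swapping the order of the two fields passed to `merge` is the most common wiring mistake; with these two test words it would give 256 and -769 instead of 8 and -8.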
10. Building the Sign Extender
Sign extension turns an n-bit two's-complement number into a wider value with the same numeric meaning, by replicating the sign bit (bit n-1) into all the higher bits. For our 12-bit immediates, bit 11 is the sign bit.
12-bit value: [ S | ... | ... ] bit 11 = sign
11 0
64-bit sign-extended: [ S | S....S | S | ...12 bits... ]
63 12 11 0
If bit 11 is 0 (positive), bits 63–12 should all be 0. If bit 11 is 1 (negative), bits 63–12 should all be 1. The classic shift-based trick from Project04 ((v << 52) >> 52 with an arithmetic shift) works in C, but in hardware there's an even simpler structure: a MUX driven by the sign bit.
12-bit +--------------- bits 11..0 (low 12) ----------------+
imm --+--------| |
| | 52-bit upper |
| +-----------+ 1 fill |
+--| bit 11-11 |---+ (sign bit = MUX selector) |
+-----------+ | |
v v
0 (52 bits) ---> | 0 | 12 MERGE/concat 64
| MUX |--- 52-bit upper ---/--> +-----------> SignEx --/--> imm
-1 (52 bits) ---> | 1 | fill |
+-----+ +-- low 12 bits
How it works:
- The 52-bit upper-fill MUX has two constant inputs: 0 (all zeros, 52 bits) on input 0 and -1 (all ones, 52 bits) on input 1.
- Its selector is the immediate's sign bit, extracted by a bit 11-11 splitter.
- When the sign bit is 0, the MUX outputs 52 zeros; when it is 1, it outputs 52 ones.
- A merger concatenates this 52-bit upper fill with the original 12-bit immediate (in the low 12 positions) to form the 64-bit result.
sign bit (imm bit 11) | upper 52 bits | result is...
-----------------------+---------------+-------------------------
0 | 0000...0 | positive value preserved
1 | 1111...1 | negative value preserved
Concretely, the 12-bit immediate 0b1111_1111_1100 is -4. Its sign bit is 1, so the MUX fills bits 63–12 with ones, giving the 64-bit value 0xFFFF...FFFC = -4. The 12-bit immediate 0b0000_0000_1000 is +8; sign bit 0, upper bits filled with zeros, giving 0x0000...0008 = +8.
Generalizing. The same pattern sign-extends any width: pick off the top bit as the selector, MUX between an all-zeros and an all-ones constant of the right width for the upper portion, and merge it onto the low bits. This single building block is reused across all the immediate formats and later for sub-word loads (lw, lb).
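The MUX-based sign extender, including the generalization to other widths, can be sketched in Python. The helper names `mux2` and `sign_extend` are ours; the result is the raw 64-bit pattern, the way it would appear on the wire.

```python
def mux2(sel: int, in0: int, in1: int) -> int:
    """Two-input multiplexer: output in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0


def sign_extend(value: int, width: int, out_width: int = 64) -> int:
    """Sign-extend a `width`-bit value to `out_width` bits using a fill MUX."""
    fill_width = out_width - width
    sign = (value >> (width - 1)) & 1   # splitter picking off the top bit
    all_zeros = 0                       # constant on MUX input 0
    all_ones = (1 << fill_width) - 1    # constant on MUX input 1 ("-1")
    upper = mux2(sign, all_zeros, all_ones)
    return (upper << width) | value     # merger: fill on top, value below


print(hex(sign_extend(0b1111_1111_1100, 12)))  # 0xfffffffffffffffc
print(hex(sign_extend(0b0000_0000_1000, 12)))  # 0x8
print(hex(sign_extend(0x80, 8)))               # 0xffffffffffffff80
```

The last line shows the same block extending an 8-bit value, which is the case a byte load would need.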
11. Putting the Decoders on the Datapath
Pulling Sections 6, 8, and 9 together, here is how the three decoders feed the rest of the processor for the very first program we can run:
first_s:
li a0, 1 # addi a0, x0, 1 (I-type)
li a1, 2 # addi a1, x0, 2 (I-type)
add a2, a0, a1 # R-type
unimp # end marker
flowchart LR
IW["IW (32 bits)"] --> RD["RegDecoder"]
IW --> ID["ImmDecoder"]
IW --> IDEC["InstDecoder (next lecture)"]
RD -->|rs1| RR0["RegFile RR0"]
RD -->|rs2| RR1["RegFile RR1"]
RD -->|rd| WR["RegFile WR"]
ID -->|imm| MUXB["ALUSrcB MUX"]
RR1 -->|RD1| MUXB
IDEC -->|RFW| WE["RegFile WriteEn"]
IDEC -->|ALUOp| ALU
IDEC -->|ALUSrcB| MUXB
RR0 -->|RD0| ALU["ALU"]
MUXB --> ALU
ALU -->|result| WD["RegFile WriteData"]
- The RegDecoder outputs (rs1, rs2, rd) wire to the register file's RR0, RR1, and WR.
- The ImmDecoder output (imm) wires to the ALUSrcB MUX so the ALU's B input can be an immediate.
- The InstDecoder (built next session) produces the control signals: RFW (register-file write enable), ALUOp, and ALUSrcB (the MUX selector).
For addi a0, x0, 1: rs1 = x0 (so RD0 = 0), imm-I = 1, ALUSrcB = 1 (pick immediate), ALUOp = add, RFW = 1 → the ALU computes 0 + 1 = 1 and writes it to a0. For add a2, a0, a1: ALUSrcB = 0 (pick RD1), and the ALU adds the two register values. The same hardware runs both because the decoders + MUX reconfigure the datapath each cycle.
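The walkthrough above can be simulated for the three real instructions of first_s. This is a rough sketch, not the lab's design: the control signals are hard-coded from the opcode as a stand-in for the InstDecoder that has not been built yet, the ALU only adds, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def bits(value: int, hi: int, lo: int) -> int:
    """Model a splitter: return value[hi:lo] as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend_12(value: int) -> int:
    """Sign-extend a 12-bit value to a 64-bit pattern."""
    return value | (MASK64 & ~0xFFF) if value & 0x800 else value


def step(iw: int, regs: list[int]) -> None:
    """Run one instruction word through the decoders, ALUSrcB MUX, and ALU."""
    # RegDecoder
    rs1, rs2, rd = bits(iw, 19, 15), bits(iw, 24, 20), bits(iw, 11, 7)
    # ImmDecoder (I-type only in this sketch)
    imm = sign_extend_12(bits(iw, 31, 20))
    # Stand-in for the InstDecoder: opcode 0x13 is addi, 0x33 is add.
    alu_src_b = 1 if bits(iw, 6, 0) == 0x13 else 0
    rfw = 1
    # Register file read, ALUSrcB MUX, ALU add
    rd0, rd1 = regs[rs1], regs[rs2]
    b = imm if alu_src_b else rd1
    result = (rd0 + b) & MASK64
    # Write back; x0 stays hard-wired to zero
    if rfw and rd != 0:
        regs[rd] = result


regs = [0] * 32
program = [0x00100513, 0x00200593, 0x00B50633]  # li a0,1; li a1,2; add a2,a0,a1
for iw in program:
    step(iw, regs)

print(regs[10], regs[11], regs[12])  # 1 2 3
```

The same `step` function handles both addi and add; only the value of `alu_src_b` changes, which is the point of the ALUSrcB MUX.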
Key Concepts
| Concept | Definition | Example |
|---|---|---|
| Instruction Word (IW) | The 32-bit machine-code value fetched from instruction memory | add x5,x6,x7 → a specific 32-bit pattern |
| Byte address | Address that counts individual bytes; what registers hold | PC = 12 means byte 12 |
| Word address | Index that counts 4-byte words; what the ROM uses | byte 12 → word 3 |
| Address conversion | addr_word = addr_byte >> 2 (drop low 2 bits) | take PC bits 9–2 |
| Splitter | Digital component that extracts a bit range from a wire | bits 11–7 → rd |
| Merger | Digital component that concatenates bit ranges into one wire | imm[11:5] ++ imm[4:0] |
| Register Decoder | One splitter extracting rs1, rs2, rd from IW | rd=11–7, rs1=19–15, rs2=24–20 |
| Immediate Decoder | Extracts + sign-extends the immediate for each format | I, S, B, U, J outputs |
| Sign extension | Replicating the sign bit into the upper bits | 12-bit -4 → 64-bit -4 |
| immSel | Control input selecting which immediate format the decoder outputs | 3-bit MUX selector |
| ALUSrcB MUX | Selects whether ALU B input is a register value or an immediate | addi → immediate; add → register |
Practice Problems
Problem 1: Byte to Word Address
The PC holds the byte address 0x2C (44 decimal). What 8-bit word address is sent to the ROM, and which bits of the PC carry it?
Click to reveal solution
Byte address: 0x2C = 44 = 0b0010_1100
Word address: 44 / 4 = 11 (equivalently 44 >> 2)
In binary: 0b0010_1100 >> 2 = 0b0000_1011 = 11
The word address comes from PC bits 9..2:
PC: ... 0 0 0 0 1 0 1 1 0 0
bit#: 9 8 7 6 5 4 3 2 1 0
^ ^ low 2 dropped
bits 9..2 = 0000_1011 = 0x0B = 11
So the splitter wires PC[9:2] to ROM A[7:0], giving word index 11.
Problem 2: Why Bits 9–2 and not 7–0?
A student wires PC bits 7–0 directly into the 8-bit ROM address. Why is this wrong, and what does it actually compute?
Click to reveal solution
Wiring PC[7:0] sends the byte address (truncated to 8 bits) to a ROM that expects a word address. Because instructions are 4 bytes apart, only every 4th byte address (0, 4, 8, …) is a valid instruction start. Using bits 7–0:
- byte 0 → ROM entry 0 (correct by luck)
- byte 4 → ROM entry 4 (should be entry 1!)
- byte 8 → ROM entry 8 (should be entry 2!)
The processor would skip three out of every four ROM entries and read garbage. You must first divide by 4. Selecting bits 9–2 performs that divide-by-4 (right shift by 2) for free, so PC[9:2] is the correct 8-bit word index.
Problem 3: Extract the Register Fields
For an instruction word whose relevant fields are rd = 0b01010, rs1 = 0b00110, rs2 = 0b00111, which RISC-V add instruction is this, and which register-file ports get which values?
Click to reveal solution
rd = 0b01010 = 10 = x10 = a0
rs1 = 0b00110 = 6 = x6 = t1
rs2 = 0b00111 = 7 = x7 = t2
Instruction: add a0, t1, t2 (i.e. add x10, x6, x7)
Register File wiring (from the RegDecoder splitter):
ReadReg0 (RR0) = rs1 = 6 -> RD0 = value of t1
ReadReg1 (RR1) = rs2 = 7 -> RD1 = value of t2
WriteReg (WR) = rd = 10 -> destination a0 (written when RFW = 1)
Problem 4: Build the S-Type Immediate
A store instruction has imm[11:5] = 0b1111111 (bits 31–25) and imm[4:0] = 0b11000 (bits 11–7). What is the final 64-bit sign-extended immediate value?
Click to reveal solution
Step 1 - merge the two pieces (high part on top):
imm[11:5] = 1111111
imm[4:0] = 11000
12-bit imm = 1111111_11000 = 0b1111_1111_1000
Step 2 - interpret as 12-bit two's complement:
Sign bit (bit 11) = 1 -> negative
0b1111_1111_1000 = -(0b0000_0000_1000) = -8
Step 3 - sign-extend to 64 bits:
Upper 52 bits filled with 1 (because sign bit = 1)
Result = 0xFFFF_FFFF_FFFF_FFF8 = -8
Problem 5: Sign Extender by MUX
Using the MUX-based sign extender from Section 10, what 64-bit value results from the 12-bit immediate 0b0000_0110_0100? Show the MUX selection.
Click to reveal solution
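Bit 11 of 0b0000_0110_0100 is 0, so the MUX selector is 0 and the upper-fill MUX outputs its input 0 (52 zeros), preserving the positive value. Merging that fill onto the low 12 bits gives 0x0000_0000_0000_0064 = 100. If bit 11 had been 1, the MUX would output 52 ones and the value would be negative.
Problem 6: Unified vs. Separate Immediate Decoders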
The lecture recommends one unified Immediate Decoder with an immSel MUX instead of five separate decoders feeding five wires across the datapath. Give two reasons this is better.
Click to reveal solution
1. Only one immediate is ever needed per cycle. An instruction is exactly one format, so producing all five immediates and routing all five everywhere is wasteful. The immSel MUX selects the single relevant one at the source.
2. Simpler datapath wiring. With a unified output there is one 64-bit imm wire to route to the ALUSrcB MUX, instead of five wires plus selection logic scattered downstream. Selection happens once, inside the decoder.
Additional benefits: the instruction decoder already knows the format, so it can drive immSel directly; and adding a new format is a local change (add one MUX input) rather than a new datapath wire. This mirrors the ALU's "compute all operations, MUX-select the result" pattern.
Further Reading
- Processor Design Part 1 — PC, instruction memory, register file, and ALU
- Processor Design Part 2 — RegDecoder, ImmDecoder, and the InstDecoder spreadsheet methodology
- Processor Design Part 3 — branch unit, data memory, dashboard, and debugging tips
- Project 06: RISC-V Processor Implementation
- RISC-V Assembly Guides
- The RISC-V Instruction Set Manual, Volume I — instruction formats and immediate encodings
- Source notes: "/notes/CS315-01 2025-10-30 Processor Decoding.pdf"
Summary
- Lab 10 / Project 06 builds a processor controlled by the instruction word. Each cycle the IW is fetched, decoded into register numbers + immediate + control signals, executed, and written back. MUXes let one circuit run many different instructions.
- The fetch path converts a 64-bit PC byte address into an 8-bit ROM word address. Because instructions are 4 bytes wide, addr_word = addr_byte >> 2.
- The elegant address conversion is a single splitter taking PC bits 9–2, equivalent to dividing by 4 and keeping the low 8 bits, but with no divider or shifter hardware.
- The Register Decoder is one splitter that extracts rd (bits 11–7), rs1 (bits 19–15), and rs2 (bits 24–20), feeding the register file's WR, RR0, and RR1.
- RISC-V scatters immediate bits across the instruction word, differently for each format (I, S, B, U, J), keeping the sign bit and source registers in fixed positions for fast hardware.
- The Immediate Decoder splits, merges, and sign-extends each immediate; S-type, for example, merges bits 31–25 (imm[11:5]) with bits 11–7 (imm[4:0]) before sign-extension.
- Sign extension is implemented with a MUX whose selector is the immediate's sign bit, choosing between an all-zeros and an all-ones upper fill that is merged onto the low bits.
- A unified Immediate Decoder with an immSel MUX produces a single 64-bit imm output, which drives the ALUSrcB MUX, setting up next session's instruction decoder and control-signal generation.