# Processor Decoding

## CS 315 Computer Architecture

---

## Overview

**Lab 10 / Project 06**: build a single-cycle RISC-V processor driven by the instruction word.

Today we wire up the **three decoding circuits** that fetch a 32-bit instruction word (IW) and turn it into datapath inputs:

1. **Fetch path**: PC byte address → ROM word address
2. **Register Decoder**: extract `rs1`, `rs2`, `rd`
3. **Immediate Decoder**: extract and sign-extend per-format immediates

Key theme: **decoding is wiring**, using only splitters, mergers, and a small MUX for sign extension.
---

## The Fetch-Decode-Execute Loop

flowchart LR PC["PC\n(64-bit)"] --> IM["Instruction\nMemory ROM"] IM -->|"IW (32 bits)"| DEC["Decoders"] DEC --> RF["Register File"] DEC --> IMM["Immediate"] DEC --> CTL["Control signals"] RF --> ALU["ALU"] IMM --> ALU CTL --> ALU ALU --> WB["Write back\nto RegFile"] WB --> PC
Each cycle: **Fetch** → **Decode** → **Execute** → **Write back**

---

## Why Decoding Matters

| Component | Role |
|-----------|------|
| **PC** | 64-bit register holding byte address of current instruction |
| **Instruction Memory** | 32-bit words; addressed by word index |
| **Register Decoder** | Splits IW into `rs1`, `rs2`, `rd` (5 bits each) |
| **Immediate Decoder** | Extracts + sign-extends the immediate per format |
| **ALUSrcB MUX** | Selects register value vs. immediate for ALU input B |

MUXes let **one circuit** run many different instructions by reconfiguring the datapath each cycle.
---

## The Address-Conversion Problem

The PC is 64 bits; the ROM has 8 address bits (256 entries).

```text
    PC                                     ROM
+----------+  64   +-------+  8   +-------------------+  32
|  64-bit  |---/---|  ???  |--/---| 32-bit elements   |--/--- IW
| Register |       +-------+      | 8 address bits    |
+----------+       (convert)      | 2^8 = 256 entries |
 byte addr         word addr      +-------------------+
```

**Task**: convert 64-bit **byte address** → 8-bit **word address**

---

## Byte Addresses vs. Word Addresses

Instructions are 4 bytes wide, so consecutive instructions sit at byte addresses 0, 4, 8, 12, …

```text
byte addr          word index
   12  |    |  --+
   13  |    |    +-- word 3
   14  |    |    |
   15  |    |  --+
    8  |    |  --+
    9  |    |    +-- word 2
  ...
    0  |    |  --+-- word 0
```

**Conversion**: `addr_word = addr_byte >> 2` (divide by 4)

| Byte addr | Binary       | Word index |
|-----------|--------------|------------|
| 0         | `...000000`  | 0          |
| 4         | `...000100`  | 1          |
| 8         | `...001000`  | 2          |
| 12        | `...001100`  | 3          |

---

## Three Address-Conversion Approaches

**Approach 1: Divider + splitter**

```text
PC --64--> DIV(÷4) --64--> [bits 7-0] --8--> ROM
```

Works, but a full divider is overkill for a power-of-two divide.

**Approach 2: Barrel shifter + splitter**

```text
PC --64--> SHR(>>2) --64--> [bits 7-0] --8--> ROM
```

Cheaper, but still more hardware than necessary.

**Approach 3 (preferred): One splitter**

```text
PC --64--> [bits 9-2] --8--> ROM
```

No divider, no shifter: just wire the right bits directly.

---

## Why Bits 9–2?

Right-shifting by 2 discards bits 1–0 (byte offset). The 8-bit word index is exactly **PC[9:2]**:

```text
PC bit#:  10  9  8  7  6  5  4  3  2  1  0
              \__ 8-bit word addr _/  X  X
                  (PC bits 9..2)      (dropped)
```

This trick comes from Project05: word addressing = "select the right bit range." No arithmetic is needed because the low bits are always `00` for 4-byte-aligned instructions.
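The bit selection described above can be modeled in a few lines of Python. This is only a software sketch of the splitter, not the lab circuit; the function name `rom_word_address` is made up for illustration, and the 256-entry ROM size is taken from the slides.

```python
def rom_word_address(pc: int) -> int:
    """Return PC[9:2], the 8-bit word index for a 256-entry ROM."""
    # Dropping bits 1..0 is the ">> 2"; keeping 8 bits is the "& 0xFF".
    return (pc >> 2) & 0xFF


# Byte addresses from the table above map to consecutive word indices.
print([rom_word_address(a) for a in (0, 4, 8, 12)])  # [0, 1, 2, 3]
```

In hardware neither the shift nor the mask costs anything: both are just a choice of which wires leave the splitter.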
---

## Multi-ROM Instruction Memory

For multiple programs, the lab wraps several ROMs in one sub-circuit:

flowchart LR PC["PC"] -->|"bits 9-2\n(8-bit addr)"| ADDR["ADDR\ndecoder"] PROG["PROG\n(program select)"] --> MUX["MUX"] ADDR --> ROM0["ROM 0"] ADDR --> ROM1["ROM 1"] ADDR --> ROM2["ROM 2"] ROM0 --> MUX ROM1 --> MUX ROM2 --> MUX MUX -->|"IW (32 bits)"| OUT["to datapath"]
- `PROG` input selects which ROM (which program) is active
- Each ROM image is generated by `makerom3.py`

---

## The Register Decoder

RISC-V places register fields in **fixed positions** across R-, I-, S-, B-type formats.

| Field | Bits  | Connects to    |
|-------|-------|----------------|
| `rd`  | 11–7  | WriteReg (WR)  |
| `rs1` | 19–15 | ReadReg0 (RR0) |
| `rs2` | 24–20 | ReadReg1 (RR1) |

One splitter does all three extractions:

```text
              [bits 11-7]  ---> rd  ---> RegFile WR
IW (32) --->  [bits 19-15] ---> rs1 ---> RegFile RR0
              [bits 24-20] ---> rs2 ---> RegFile RR1
```

---

## Register Decoder: Worked Example

`add x5, x6, x7` (ABI: `add t0, t1, t2`)

```text
rd  = bits 11-7  = 00101 = 5 = x5 (t0)
rs1 = bits 19-15 = 00110 = 6 = x6 (t1)
rs2 = bits 24-20 = 00111 = 7 = x7 (t2)

Register File:
  RR0 = 6 --> RD0 = value of t1 --+
  RR1 = 7 --> RD1 = value of t2 --+--> ALU add --> result
  WR  = 5 --> destination t0 <------/  (when RFW = 1)
```

The decoder wires all three fields unconditionally. The instruction decoder later decides (via `RFW`) whether the write actually happens.
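As a quick check of the worked example, here is a minimal Python model of the register decoder. The name `reg_decode` is hypothetical, and the instruction word is assembled by hand from the standard R-type layout (opcode `0x33`, `funct3` and `funct7` both zero for `add`) rather than taken from the lab materials.

```python
def reg_decode(iw: int) -> tuple:
    """Split a 32-bit instruction word into (rd, rs1, rs2)."""
    rd = (iw >> 7) & 0x1F    # IW[11:7]
    rs1 = (iw >> 15) & 0x1F  # IW[19:15]
    rs2 = (iw >> 20) & 0x1F  # IW[24:20]
    return rd, rs1, rs2


# add x5, x6, x7: funct7=0, rs2=7, rs1=6, funct3=0, rd=5, opcode=0x33
ADD_X5_X6_X7 = (0 << 25) | (7 << 20) | (6 << 15) | (0 << 12) | (5 << 7) | 0x33

print(hex(ADD_X5_X6_X7))         # 0x7302b3
print(reg_decode(ADD_X5_X6_X7))  # (5, 6, 7)
```

Like the splitter, the function extracts all three fields no matter what the opcode is; deciding whether they mean anything is left to the control logic.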
---

## RISC-V Immediate Formats

The sign bit (bit 31) and source registers stay fixed; the rest is permuted per format:

```text
R: | funct7      | rs2 | rs1 |fn3| rd         | opcode |  (no immediate)
I: | imm[11:0]         | rs1 |fn3| rd         | opcode |
S: | imm[11:5]   | rs2 | rs1 |fn3| imm[4:0]   | opcode |
B: |imm[12|10:5] | rs2 | rs1 |fn3|imm[4:1|11] | opcode |
U: | imm[31:12]                  | rd         | opcode |
J: | imm[20|10:1|11|19:12]       | rd         | opcode |
```

Fixed sign bit and register fields let hardware start reading registers and sign-extending **before** the opcode is fully decoded.
---

## Immediate Format Summary

| Format | Used by | Bit fields in IW |
|--------|---------|-----------------|
| **I** | `addi`, `lw`, `jalr` | 31–20 = `imm[11:0]` |
| **S** | `sw`, `sd` | 31–25 = `imm[11:5]`, 11–7 = `imm[4:0]` |
| **B** | `beq`, `bne` | 31=`imm[12]`, 7=`imm[11]`, 30–25=`imm[10:5]`, 11–8=`imm[4:1]` |
| **U** | `lui`, `auipc` | 31–12 = `imm[31:12]`, low 12 = 0 |
| **J** | `jal` | 31=`imm[20]`, 19–12=`imm[19:12]`, 20=`imm[11]`, 30–21=`imm[10:1]` |

B- and J-type have an implicit bit 0 = 0 (targets are 2-byte aligned).

---

## The Immediate Decoder (Interface)

Takes 32-bit IW, outputs one 64-bit sign-extended immediate per format:

```text
           +--------------------+
 IW   32   | IW           imm_I |--/64
 ----/---->|              imm_S |--/64
           |              imm_B |--/64
           |              imm_U |--/64
           |              imm_J |--/64
           +--------------------+
                 ImmDecoder
```

Only one format is needed per cycle, so we can unify with a MUX.

---

## Unified Immediate Decoder

A single 64-bit output selected by `immSel` (3-bit control signal):

flowchart LR IW["IW (32 bits)"] --> I_dec["imm-I\ncircuit"] IW --> S_dec["imm-S\ncircuit"] IW --> B_dec["imm-B\ncircuit"] IW --> U_dec["imm-U\ncircuit"] IW --> J_dec["imm-J\ncircuit"] I_dec --> MUX["5-input\nMUX"] S_dec --> MUX B_dec --> MUX U_dec --> MUX J_dec --> MUX immSel["immSel\n(3 bits)"] --> MUX MUX -->|"imm (64 bits)"| OUT["ALUSrcB MUX"]
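One way to sanity-check the unified decoder is a small Python model that computes all five immediates and lets a selector pick one, mirroring the MUX. Treat it as a sketch under assumptions: the `immSel` encoding (`IMM_I` = 0 through `IMM_J` = 4) is invented here because the slides only give the signal's width, and the helper names `bits`, `sign_extend`, and `imm_decode` are illustrative. The bit positions follow the format summary table.

```python
MASK64 = (1 << 64) - 1

# Hypothetical immSel encoding; the lecture only says the signal is 3 bits.
IMM_I, IMM_S, IMM_B, IMM_U, IMM_J = range(5)


def bits(iw: int, hi: int, lo: int) -> int:
    """Splitter: return IW[hi:lo] as an unsigned integer."""
    return (iw >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value: int, width: int) -> int:
    """Sign-extend a width-bit value to a 64-bit pattern."""
    sign = 1 << (width - 1)
    return ((value ^ sign) - sign) & MASK64


def imm_decode(iw: int, imm_sel: int) -> int:
    """Build every format's immediate, then let imm_sel choose (the MUX)."""
    imm_i = sign_extend(bits(iw, 31, 20), 12)
    imm_s = sign_extend(bits(iw, 31, 25) << 5 | bits(iw, 11, 7), 12)
    imm_b = sign_extend(
        bits(iw, 31, 31) << 12 | bits(iw, 7, 7) << 11
        | bits(iw, 30, 25) << 5 | bits(iw, 11, 8) << 1, 13)
    imm_u = sign_extend(bits(iw, 31, 12) << 12, 32)
    imm_j = sign_extend(
        bits(iw, 31, 31) << 20 | bits(iw, 19, 12) << 12
        | bits(iw, 20, 20) << 11 | bits(iw, 30, 21) << 1, 21)
    return (imm_i, imm_s, imm_b, imm_u, imm_j)[imm_sel]


# Hand-assembled instruction words used as examples:
print(imm_decode(0x00100513, IMM_I))       # addi a0, x0, 1 -> 1
print(hex(imm_decode(0xFE000EE3, IMM_B)))  # beq x0, x0, -4 -> 0xfffffffffffffffc
print(imm_decode(0x0080006F, IMM_J))       # jal x0, 8      -> 8
```

The B and J immediates are 13 and 21 bits wide before extension because of the implicit zero in bit 0.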
---

## Constructing the S-Type Immediate

S-type splits the 12-bit offset to keep `rs1`/`rs2` in fixed positions:

```text
IW bits 11-7  --> imm[4:0]   (low 5 bits)
IW bits 31-25 --> imm[11:5]  (high 7 bits)
```

**Steps**:

1. **Split** bits 11–7 → `imm[4:0]`
2. **Split** bits 31–25 → `imm[11:5]`
3. **Merge** into 12-bit value: `{imm[11:5], imm[4:0]}`
4. **Sign-extend** 12 bits → 64 bits

```text
bits 11-7  ---+
              +--> MERGE (12-bit) --> SignExt --> imm_S (64-bit)
bits 31-25 ---+
```

---

## S-Type Example

`sd a0, -8(sp)` needs offset = -8

```text
imm[11:5] = 1111111  (bits 31-25)
imm[4:0]  = 11000    (bits 11-7)

Merged 12-bit: 1111111_11000 = 0b1111_1111_1000
Sign bit (bit 11) = 1 => negative
--> sign-extend to 64 bits: 0xFFFF_FFFF_FFFF_FFF8 = -8
```
I-type is simpler: one contiguous field (bits 31–20), no merge step needed.
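The four S-type steps can be replayed on a concrete instruction word. The snippet below is a sketch, assuming the hand-assembled encoding `0xFEA13C23` for `sd a0, -8(sp)` (S-type layout with `rs2` = x10, `rs1` = x2, `funct3` = 3, opcode `0x23`); the variable names are illustrative.

```python
MASK64 = (1 << 64) - 1

# sd a0, -8(sp), assembled by hand for this sketch
IW = 0xFEA13C23

imm_4_0 = (IW >> 7) & 0x1F          # step 1: split IW[11:7]
imm_11_5 = (IW >> 25) & 0x7F        # step 2: split IW[31:25]
imm12 = (imm_11_5 << 5) | imm_4_0   # step 3: merge {imm[11:5], imm[4:0]}

# step 4: sign-extend, done here arithmetically rather than with a MUX
offset = imm12 - (1 << 12) if imm12 & 0x800 else imm12
imm_s = offset & MASK64

print(f"{imm_11_5:07b} {imm_4_0:05b}")  # 1111111 11000
print(offset, hex(imm_s))               # -8 0xfffffffffffffff8
```

The I-type case would skip steps 1 to 3 and feed `IW[31:20]` straight into step 4.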
---

## The Sign Extender

Replicate bit `n-1` (the sign bit) into all upper bits:

```text
12-bit: [ S | ............ ]      bit 11 = sign
64-bit: [ S S...S | 12-bit lower ]
          63    12
```

**Hardware implementation: a MUX driven by the sign bit**

```text
imm[11] (sign bit) ----> MUX selector

 0 (52 zeros) --> [0] |
                      | MUX ----> 52-bit upper fill
-1 (52 ones)  --> [1] |

[52-bit fill] ++ [12-bit imm] --> MERGE --> 64-bit result
```

---

## Sign Extender Truth Table

| Sign bit (bit 11) | Upper 52 bits | Result |
|-------------------|---------------|--------|
| 0 | `0000...0` (zeros) | Positive value preserved |
| 1 | `1111...1` (ones) | Negative value preserved |

**Example**: 12-bit `0b1111_1111_1100` = -4

- Sign bit = 1, MUX outputs 52 ones
- Result: `0xFFFF_FFFF_FFFF_FFFC` = -4

The same pattern generalizes to any bit width and is reused for sub-word loads (`lw`, `lb`).
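To see that the MUX-and-merge structure really does generalize across widths, here is a small Python model of it. The function name and the `out_width` parameter are assumptions made for this sketch; the body follows the diagram step by step: the sign bit selects a fill of zeros or ones, and the fill is merged above the original bits.

```python
def sign_extend(value: int, width: int, out_width: int = 64) -> int:
    """Sign-extend `value` from `width` bits to `out_width` bits."""
    value &= (1 << width) - 1                       # keep the low `width` bits
    sign = (value >> (width - 1)) & 1               # MUX selector
    fill_width = out_width - width
    fill = ((1 << fill_width) - 1) if sign else 0   # MUX output: ones or zeros
    return (fill << width) | value                  # merge: fill ++ value


print(hex(sign_extend(0b1111_1111_1100, 12)))  # 0xfffffffffffffffc  (-4)
print(hex(sign_extend(0x7FF, 12)))             # 0x7ff
print(hex(sign_extend(0x80, 8)))               # 0xffffffffffffff80  (byte, as for lb)
```

Only `width` changes between the 12-bit immediate case and the 8-bit or 32-bit load cases; the selector and merge stay the same.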
---

## Putting It All Together

flowchart LR IW["IW (32 bits)"] --> RD["RegDecoder"] IW --> ID["ImmDecoder"] IW --> IDEC["InstDecoder\n(next lecture)"] RD -->|rs1| RR0["RegFile RR0"] RD -->|rs2| RR1["RegFile RR1"] RD -->|rd| WR["RegFile WR"] ID -->|imm| MUXB["ALUSrcB\nMUX"] RR1 -->|RD1| MUXB IDEC -->|ALUSrcB| MUXB IDEC -->|RFW| WE["RegFile\nWriteEn"] IDEC -->|ALUOp| ALU["ALU"] RR0 -->|RD0| ALU MUXB --> ALU ALU -->|result| WD["RegFile\nWriteData"]
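The wiring in the diagram can be approximated by a tiny Python model of one clock cycle. This is a deliberately reduced sketch: the ALU only adds, the control signals are passed in by hand because the instruction decoder is not built yet, and the function name `execute` and its parameters are invented for illustration. The guard that keeps `x0` at zero reflects the RISC-V register convention rather than anything in the diagram.

```python
MASK64 = (1 << 64) - 1


def execute(regs, rs1, rs2, rd, imm, alu_src_b, rfw):
    """One cycle of an ADD-only datapath; regs is a 32-entry list."""
    rd0 = regs[rs1]                   # RegFile read port 0
    rd1 = regs[rs2]                   # RegFile read port 1
    b = imm if alu_src_b else rd1     # ALUSrcB MUX
    result = (rd0 + b) & MASK64       # ALUOp = ADD, 64-bit wraparound
    if rfw and rd != 0:               # write back; x0 stays hard-wired to zero
        regs[rd] = result
    return result


regs = [0] * 32
A0, A1, A2 = 10, 11, 12               # ABI names for x10, x11, x12
regs[A1] = 41

# addi a0, x0, 1: ALUSrcB = 1 selects the immediate.
# rs2 = 1 is just immediate bits 24-20 read as a register number; RD1 is ignored.
execute(regs, rs1=0, rs2=1, rd=A0, imm=1, alu_src_b=1, rfw=1)

# add a2, a0, a1: ALUSrcB = 0 selects RD1.
execute(regs, rs1=A0, rs2=A1, rd=A2, imm=0, alu_src_b=0, rfw=1)

print(regs[A0], regs[A2])  # 1 42
```

Both calls go through the same function body; only `alu_src_b` changes which value reaches ALU input B.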
---

## Trace: `addi a0, x0, 1` (I-type)

```text
Instruction: addi a0, x0, 1

RegDecoder:
  rs1 = x0  --> RD0 = 0
  rd  = a0  --> WR  = a0

ImmDecoder:
  imm-I = 1 (sign-extended)

InstDecoder signals:
  ALUSrcB = 1  --> MUX picks immediate (1)
  ALUOp   = ADD
  RFW     = 1  --> write enabled

ALU: 0 + 1 = 1 --> written to a0
```

---

## Trace: `add a2, a0, a1` (R-type)

```text
Instruction: add a2, a0, a1

RegDecoder:
  rs1 = a0  --> RD0 = value of a0
  rs2 = a1  --> RD1 = value of a1
  rd  = a2  --> WR  = a2

ImmDecoder:
  (not used for R-type)

InstDecoder signals:
  ALUSrcB = 0  --> MUX picks RD1 (register)
  ALUOp   = ADD
  RFW     = 1  --> write enabled

ALU: a0 + a1 --> written to a2
```

The same hardware runs both instructions; the MUX reconfigures the datapath.

---

## Address Conversion: Practice

**PC = `0x2C` (44 decimal). What ROM address?**

```text
44 in binary: 0b0010_1100

PC bits 9..0:  0 0 0 0 1 0 1 1  0 0
               \_ 8-bit word _/  ^^-- bits 1-0 dropped (always 00)
                   address

PC[9:2] = 0b0000_1011 = 11

ROM word index = 11
```

Common mistake: wiring PC[7:0] sends the **byte** address to a ROM that expects a **word** address, so every fetch is off by a factor of 4!
---

## Key Concepts Reference

| Concept | Definition |
|---------|------------|
| **Byte address** | Counts individual bytes; held in PC |
| **Word address** | Counts 4-byte words; used by ROM |
| **PC[9:2]** | 8-bit word address for 256-entry ROM |
| **Splitter** | Extracts a bit range from a wire |
| **Merger** | Concatenates bit ranges into one wire |
| **Register Decoder** | One splitter: `rd`[11-7], `rs1`[19-15], `rs2`[24-20] |
| **Immediate Decoder** | Split + merge + sign-extend per format |
| **Sign extender** | MUX on sign bit: 0→zeros, 1→ones upper fill |
| **immSel** | 3-bit control selecting which immediate format |
| **ALUSrcB MUX** | Switches ALU B between register and immediate |

---

## Summary

1. **Lab 10 wires the instruction word to control everything**: the IW is fetched, decoded, and executed each clock cycle.
2. **PC byte address → ROM word address** via `addr_word = addr_byte >> 2`, implemented as one splitter taking **PC bits 9–2**.
3. **Register Decoder = one splitter** extracting `rd` (11–7), `rs1` (19–15), `rs2` (24–20) into the register file.
4. **RISC-V scatters immediate bits** across five formats (I, S, B, U, J), keeping the sign bit and register fields fixed.
5. **Each immediate format** is decoded with splitters + mergers + sign extension; S-type requires merging two pieces.
6. **Sign extension uses a MUX** on the sign bit, choosing all-zeros or all-ones for the upper fill.
7. **Unified Immediate Decoder** with `immSel` produces one 64-bit output, driving the ALUSrcB MUX.
8. **Next lecture**: instruction decoder that generates `RFW`, `ALUOp`, `ALUSrcB`, and `immSel` control signals.