← Back to Course
# Instruction ROM and Decoding ## CS 315 Computer Architecture --- ## What We Are Building Hardware **static analyzer**: reads 32-bit RISC-V instructions from ROM and classifies each one — without executing it.
flowchart LR A["Program Selector PN"] --> B["Instruction Memory"] C["Address Counter"] --> B B --> D["IW (32 bits)"] D --> E["analyze_decode"] E --> F["inum (3 bits)"] F --> G["3-to-8 Decoder"] G --> H["Count Registers"] D --> I{"IW == unimp?"} I -->|yes| J["stop clock"]
--- ## Software to Hardware Mapping | Software concept | Hardware component | |------------------|--------------------| | `instructions[]` array | Instruction Memory (ROM) | | Loop index `i` | Address Counter | | `!= UNIMP` test | Equality comparator | | `analyze_decode(iw)` | `analyze_decode` circuit | | `counts[inum]++` | Decoder + adder + registers |
A circuit is
all there at once
— every comparator is comparing on every clock tick.
--- ## ROM: Read-Only Memory A **ROM** is a hardware lookup table: address in, stored word out. ```text Address width: 8 bits -> 2^8 = 256 rows Data width: 32 bits -> one RISC-V instruction per row Total: 256 x 32 = 8192 bits = 1 KiB ```
**Key properties:** - Combinational read (no clock) - Contents loaded once from `.hex` - Cannot be written at runtime
**In Digital simulator:** - Drive `A` → `D` appears immediately - Load via **Data** attribute (`.hex` file)
--- ## Loading a ROM: `makerom3.py` ```bash objdump -d fib_rec_s.o | python3 makerom3.py > fib_rec_s_rom.hex ``` ```python print("v2.0 raw") for line in sys.stdin: tokens = line.split() if len(tokens) < 2: continue if len(tokens[1]) != 8: continue # keep 8-hex-digit words if tokens[1][0] not in hexdigits: continue print(tokens[1]) # 32-bit instruction word ```
End marker:
append
unimp
(
0xC0001073
) to every program. The circuit stops the clock when it sees this sentinel.
--- ## Byte Addresses vs. Word Addresses The ROM is **word-addressed**; the PC holds a **byte address**. ```text Byte address (PC): 0 4 8 12 16 ... Word address (ROM): 0 1 2 3 4 ... ``` In hardware: **right-shift by 2** (drop the always-zero low 2 bits). ```text word_address = byte_address >> 2 ```
In Project 5 the counter counts words directly.
In Project 6 the real byte-addressed PC must be shifted before indexing ROM.
--- ## Instruction Memory: Four ROMs Four test programs, one 2-bit selector `PN` chooses which `IW` reaches the decoder.
flowchart LR ADDR["ADDR (8)"] --> R0["ROM 0: fib_rec_s"] ADDR --> R1["ROM 1: is_pal_rec_s"] ADDR --> R2["ROM 2: get_bitseq_s"] ADDR --> R3["ROM 3: quadratic_s"] R0 --> M{"MUX 4:1"} R1 --> M R2 --> M R3 --> M PN["PN (2)"] --> M M --> IW["IW (32)"]
--- ## Program Selector Truth Table | PN | Selected ROM | Program | |----|--------------|---------| | `00` | ROM 0 | `fib_rec_s` | | `01` | ROM 1 | `is_pal_rec_s` | | `10` | ROM 2 | `get_bitseq_s` | | `11` | ROM 3 | `quadratic_s` | All four ROMs share the same `ADDR` input. Only the selected ROM's output passes through the 32-bit 4-input MUX. --- ## The Decoder: One-Hot Output A **2-to-4 decoder** takes a 2-bit select and asserts exactly **one** output. | S1 | S0 | r3 | r2 | r1 | r0 | |----|----|----|----|----|----| | 0 | 0 | 0 | 0 | 0 | **1** | | 0 | 1 | 0 | 0 | **1** | 0 | | 1 | 0 | 0 | **1** | 0 | 0 | | 1 | 1 | **1** | 0 | 0 | 0 | --- ## Decoder: Sum-of-Products Equations Reading each output's row directly from the truth table: ```text r0 = NOT(S1) AND NOT(S0) <- input 00 r1 = NOT(S1) AND S0 <- input 01 r2 = S1 AND NOT(S0) <- input 10 r3 = S1 AND S0 <- input 11 ``` Each output is **one AND gate** driven by true or complemented select lines — sum-of-products at its simplest.
In Project 5 a
3-to-8 decoder
turns
inum
into a one-hot enable that selects which count register to increment.
--- ## The Encoder: Binary Index of Active Input An **encoder** is the inverse of a decoder: one-hot input → binary index. | a3 | a2 | a1 | a0 | d1 | d0 | |----|----|----|----|----|----| | 0 | 0 | 0 | 1 | 0 | 0 | | 0 | 0 | 1 | 0 | 0 | 1 | | 0 | 1 | 0 | 0 | 1 | 0 | | 1 | 0 | 0 | 0 | 1 | 1 | If `a2 = 1` (all others 0), output = `10` (binary 2). --- ## The Ambiguity Problem A plain encoder assumes **exactly one** input is high.
What if
a1 = 1
and
a2 = 1
simultaneously?
The truth table has no matching row — output is
undefined
.
In our instruction analyzer, **more than one comparator can fire** for the same instruction word (overlapping encodings). We need a **priority encoder**. --- ## The Priority Encoder Resolves ties by reporting the **highest-priority (highest-numbered) asserted input**. ```text Inputs (index 3 down to 0): 0 1 0 1 ^ ^ index 2 and index 0 asserted Priority encoder output: 10 (= 2, the highest asserted index) ```
The priority ordering is how we
break ties
between overlapping encodings — not just a tiebreaker, it determines correctness.
--- ## `analyze_decode`: Overview Takes a 32-bit instruction word, emits a 3-bit instruction type `inum`. ```text ┌───────────────┐ IW ──32──>│ analyze_decode│──3──> inum └───────────────┘ ``` **Three steps:** 1. **Split** — extract opcode (bits `6:0`) and `rd` (bits `11:7`) 2. **Compare** — equality comparator per instruction type 3. **Encode** — priority encoder collapses matches to `inum` --- ## `inum` Mapping | inum | Type | Opcode (7 bits) | Hex | |------|------|-----------------|-----| | 0 | i-type | `0010011` | `0x13` | | 1 | r-type | `0110011` | `0x33` | | 2 | load | `0000011` | `0x03` | | 3 | s-type | `0100011` | `0x23` | | 4 | b-type | `1100011` | `0x63` | | 5 | jal | `1101111` | `0x6F` | | 6 | j | `1101111` | `0x6F` | | 7 | jalr | `1100111` | `0x67` | --- ## `analyze_decode`: Circuit Structure
flowchart LR IW["IW (32)"] --> SP["Splitter"] SP --> OP["opcode bits 6:0"] SP --> RD["rd bits 11:7"] OP --> C0["== 0x13?"] OP --> C1["== 0x33?"] OP --> C2["== 0x03?"] OP --> C3["== 0x23?"] OP --> C4["== 0x63?"] OP --> C5["== 0x6F?"] RD --> C6["== 0?"] C5 --> AND1["AND"] C6 --> AND1 C0 --> PE["Priority Encoder"] C1 --> PE C2 --> PE C3 --> PE C4 --> PE C5 --> PE AND1 --> PE PE --> INUM["inum (3)"]
--- ## Opcodes and Their Families
Load vs. I-type:
both use immediate arithmetic logic, but they have
different
opcodes (
0x03
vs
0x13
), so a full 7-bit comparator separates them cleanly.
```text i-type addi, andi, ori -> opcode 0010011 (0x13) load lw, ld, lb -> opcode 0000011 (0x03) <- different! r-type add, sub, mul -> opcode 0110011 (0x33) s-type sw, sd, sb -> opcode 0100011 (0x23) b-type beq, bne, blt -> opcode 1100011 (0x63) jal/j jal, j -> opcode 1101111 (0x6F) <- SAME! jalr jalr, ret -> opcode 1100111 (0x67) ``` --- ## Special Case: `j` vs. `jal` `j` and `jal` share opcode `0x6F`. The difference is in **`rd`**: ```text j label == jal x0, label (rd = 0 = x0) jal ra, sub == jal ra, sub (rd = 1 = ra) ``` In hardware: ```text opcode == 0x6F ──┐ AND ──> "this is a j" (inum 6) rd == 00000 ──┘ ``` Same opcode but `rd != 0` identifies a real `jal` (inum 5). --- ## j vs. jal: Priority Ordering Matters ```c // Software reference — order matters! if (opcode == JAL_OPCODE) { if (rd == 0) return 6; // j (jal x0, offset) else return 5; // jal (saves return address) } ```
The
more specific case
(
j
, with
rd == 0
) must win.
The priority encoder ordering enforces this — assign
j
to a higher-priority input than
jal
.
--- ## Disambiguating j vs. jal in Hardware
flowchart TD IW["IW (32 bits)"] --> OP["opcode = IW[6:0]"] IW --> RD["rd = IW[11:7]"] OP --> J1{"opcode == 0x6F?"} RD --> J2{"rd == 0?"} J1 -->|yes| AND{"AND"} J2 -->|yes| AND AND -->|"both true"| I6["inum = 6 (j)"] J1 -->|"yes but rd != 0"| I5["inum = 5 (jal)"]
--- ## The `unimp` Sentinel ```text unimp = 0xC0001073 (intentionally unimplemented encoding) ``` - The circuit compares the **full 32-bit `IW`** against this constant - Match asserts `done` which stops the clock - Counters freeze at their final values
A ROM always returns
some
value for every address. Without a sentinel, the counter would keep advancing and count garbage rows forever.
--- ## The Counting Datapath
flowchart LR PROG["PROG (2)"] --> IMEM["Instruction Memory"] CTR["Address Counter"] --> IMEM IMEM --> IW["IW (32)"] IW --> SENT{"== unimp?"} SENT -->|yes| STOP["done -> stop CLK"] IW --> AD["analyze_decode"] AD --> INUM["inum (3)"] INUM --> DEC["3-to-8 Decoder"] DEC --> REGS["8 count registers"] REGS --> MUX["MUX selects current count"] MUX --> ADD["+1"] ADD --> REGS
--- ## Per-Clock Sequence 1. `CLR` initializes counter and all eight registers to zero 2. On each **clock tick**: counter advances, ROM fetches next `IW` 3. `IW == unimp?` — if yes, `done` stops the clock 4. Otherwise: `analyze_decode` produces 3-bit `inum` 5. **3-to-8 decoder** asserts one enable line (one-hot) 6. **MUX** routes that register's value to adder; `+1` written back This is the hardware realization of `counts[inum]++`. --- ## Project 5 Outputs | Register | `inum` | Instruction type | |----------|--------|-----------------| | `ITYPE` | 0 | i-type arithmetic | | `RTYPE` | 1 | r-type register-register | | `LOAD` | 2 | load | | `STYPE` | 3 | store | | `BTYPE` | 4 | branch | | `JAL` | 5 | jal (call) | | `J` | 6 | j (unconditional jump) | | `JALR` | 7 | jalr / ret | Plus `TOTAL` increments on every non-sentinel instruction. --- ## Key Concepts | Concept | One-line definition | |---------|---------------------| | **ROM** | Address in, stored word out; combinational; loaded once | | **Address width** | `n` bits → `2^n` rows | | **Word vs byte addr** | `word = byte >> 2` (drop low 2 bits) | | **Decoder** | Binary input → one-hot output | | **Encoder** | One-hot input → binary index | | **Priority encoder** | Picks highest-priority asserted input | | **inum** | 3-bit code for instruction type (0..7) | | **Sentinel** | `0xC0001073` marks end-of-program | --- ## ROM Capacity Quick Reference ```text Address bits (n) | Rows | Total (32-bit words) ------------------+----------+----------------------- 8 | 256 | 1 KiB 10 | 1024 | 4 KiB 12 | 4096 | 16 KiB 16 | 65536 | 256 KiB ``` **Formula:** `rows = 2^n`, `total bytes = rows × 4` --- ## Practice: Classify These Instructions | Instruction | Opcode bits | `rd` | `inum` | |-------------|-------------|------|--------| | `addi a0, x0, 10` | `0010011` | — | **0** | | `add a2, a0, a1` | `0110011` | — | **1** | | `lw a0, 0(sp)` | `0000011` | — | **2** | | `j label` | `1101111` | `00000` | **6** | | `jal ra, sub` | `1101111` | `00001` | **5** | | `jalr x0, 0(ra)` | `1100111` | — | **7** | --- ## Summary 1. A **ROM** stores a program as 32-bit rows indexed by word address; 8-bit address → 256 instructions (1 KiB) 2. **Programs are loaded** via `objdump` + `makerom3.py`; a `unimp` sentinel marks the end 3. **Byte addresses become word addresses** by right-shifting 2 bits (dropping the always-zero low bits) 4. **Instruction memory** = four ROMs + 4-input MUX selected by 2-bit `PN` 5. **Decoder** (binary → one-hot) enables exactly one count register; **encoder** (one-hot → binary) collapses comparator matches to `inum` 6. **Priority encoder** resolves overlapping matches; ordering determines which wins (`j` beats `jal` when `rd == 0`) 7. **`analyze_decode`** = splitter + comparators + priority encoder → 3-bit `inum`