Instruction ROM and Decoding¶

Overview¶

This lecture builds the front end of a hardware-based RISC-V static analyzer: a circuit that reads 32-bit machine instructions from a Read-Only Memory (ROM) and classifies each one by type without executing it. We cover how a ROM stores a program, how to bundle four ROMs behind a 2-bit program selector to build instruction memory, and the two combinational building blocks needed to classify instructions: the decoder (one input value selects one of many outputs) and the encoder / priority encoder (many inputs collapse to a small binary code). We finish by assembling these pieces into the analyze_decode circuit, which takes a 32-bit instruction word and emits a 3-bit instruction number (inum) naming its type. This is the core of Project 5 and the same decoding idea reappears in the Project 6 processor's InstDecoder.

Learning Objectives¶

Explain what a ROM is and compute its capacity from its address and data widths
Build instruction memory from multiple ROMs selected by a 2-bit program number
Convert a byte address (the PC) into a word address to index a ROM
Derive the sum-of-products equations for a 2-to-4 decoder from its truth table
Distinguish a decoder (1-of-n select) from an encoder (n-to-binary)
Explain why a plain encoder is undefined when multiple inputs are asserted, and how a priority encoder fixes this
Design the analyze_decode circuit using a splitter, equality comparators, and a priority encoder
Handle ambiguous instruction encodings (such as j vs. jal) by inspecting extra fields and ordering priorities correctly

Prerequisites¶

Combinational logic: AND/OR/NOT gates and sum-of-products design (Lab 08)
Multiplexers, comparators, decoders, and encoders (Lab 09)
Sequential logic: registers and counters (Lab 09)
RISC-V instruction formats and opcodes (R, I, S, B, J types) from Project 4 / Lab 03
Binary and hexadecimal, bit masking and shifting (Project 1, Project 3)
The Digital logic simulator (Lab 05 onward)

1. From Software Analysis to a Hardware Analyzer¶

In Project 4 we wrote a software static analyzer in C: a loop that walked over an array of 32-bit instruction words, inspected the bits of each one, and counted how many were I-type, R-type, loads, stores, branches, and so on. "Static" means we look at the instructions but never run them.

// Software static analyzer (sketch) - inspect, do not execute.
// instructions[] is the machine code; UNIMP marks the end.
#include <stdint.h>

#define UNIMP 0xC0001073u

int analyze_decode(uint32_t iw);    // classifier, defined elsewhere: returns 0..7

void analyze(uint32_t instructions[]) {
    int counts[8] = {0};            // one counter per instruction type
    for (int i = 0; instructions[i] != UNIMP; i++) {
        uint32_t iw = instructions[i];
        int inum = analyze_decode(iw);   // classify: 0..7
        counts[inum]++;                  // bump the matching counter
    }
}

This lecture rebuilds that same idea in hardware for Project 5. The pieces map directly:

Software concept	Hardware component
`instructions[]` array	Instruction Memory (ROM)
`i` loop index	a counter driving the ROM address
`instructions[i] != UNIMP` test	an equality comparator against the sentinel that stops the clock
`analyze_decode(iw)`	the `analyze_decode` circuit (splitter + comparators + priority encoder)
`counts[inum]++`	a decoder enabling one of eight count registers, plus an adder

flowchart LR
    A[Program Selector PN] --> B[Instruction Memory ROMs]
    C[Address Counter] --> B
    B --> D["IW (32 bits)"]
    D --> E[analyze_decode]
    E --> F["inum (3 bits)"]
    F --> G[Decoder enables one counter]
    G --> H[Count Registers]
    D --> I{IW == unimp?}
    I -->|yes| J[stop clock]

    style E fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px

The crucial mental shift: software executes one statement at a time, but a circuit is all there at once. Every comparator is comparing, every gate is settling, on every clock tick. We are not writing a program; we are wiring up logic that is the analyzer.

2. ROM: Read-Only Memory¶

A ROM (Read-Only Memory) behaves like a small lookup table baked into hardware. You give it an address on its A input and it produces the stored value on its data output D. Unlike RAM, you cannot write to it while the circuit runs; its contents are loaded once (in Digital, from a .hex file).

        Address bits
            |
        A ──┤8       ┌──────────────┐
            │   ROM  │ [ 32-bit row ]│
            │        │ [ 32-bit row ]│
            │        │ [ 32-bit row ]│   D ──┤32──> IW
            │        │      ...      │
            └────────┤ [ 32-bit row ]│
                     └──────────────┘

Two numbers fully describe a ROM:

Address width — how many address lines A has. With an 8-bit address you can name 2^8 = 256 distinct rows (addresses 0 through 255).
Data width — how many bits each row holds. For RISC-V machine code each instruction is exactly 32 bits, so the ROM stores 32-bit words, and the output is our instruction word IW.

8-bit address  =>  2^8 = 256 rows
each row        =>  32 bits (one RISC-V instruction)
total capacity  =>  256 x 32 bits = 8192 bits = 1 KiB

Project 5 only needs to support programs of up to 256 instructions (including the end marker), which is exactly why an 8-bit address is enough.
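The capacity arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name `rom_capacity` is made up for illustration and is not part of Project 5.

```python
def rom_capacity(addr_bits, data_bits):
    """Return (rows, total_bits, total_bytes) for a ROM of the given widths."""
    rows = 2 ** addr_bits            # each distinct address names one row
    total_bits = rows * data_bits    # every row holds data_bits bits
    return rows, total_bits, total_bits // 8

rows, bits, nbytes = rom_capacity(8, 32)
print(rows, bits, nbytes)   # 256 8192 1024
```

The same function answers "what if" questions, such as how much a wider address bus would hold.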

Reading a ROM in Digital¶

In the simulator, put a value on A and the matching stored word appears on D immediately (combinational read — no clock needed). For example, if address 0 holds 0x0000000A and address 1 holds 0x0000EBDB, then driving A=0 shows 0x0000000A on the output, and changing to A=1 shows 0x0000EBDB. Each address selects a different 32-bit row.

Loading a ROM with a real program¶

We do not type instructions by hand. We extract them from a compiled object file and convert to the Digital .hex format:

# Disassemble the object file, keep only the hex instruction words,
# and emit a Digital-loadable .hex file.
objdump -d fib_rec_s.o | python3 makerom3.py > fib_rec_s_rom.hex

makerom3.py keeps only the 8-hex-digit instruction-word column from objdump output and prefixes a v2.0 raw header that Digital's ROM expects:

import sys

hexdigits = ['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f']

print("v2.0 raw")

for line in sys.stdin:
    tokens = line.split()
    if len(tokens) < 2:
        continue
    if len(tokens[1]) != 8:        # keep only 8-hex-digit words
        continue
    if tokens[1][0] not in hexdigits:
        continue
    print(tokens[1])               # the 32-bit instruction word

The resulting .hex file is loaded into the ROM component via its Data attribute in Digital.

The end marker. A ROM read never "runs out" — so the analyzer needs a way to know when the program ends. We append the unimp instruction (0xC0001073) to the end of every program. The circuit compares each fetched IW to this sentinel; on a match it asserts a done signal that stops the clock.
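To make the sentinel idea concrete, here is a rough software model of what the circuit does with a loaded ROM image. It assumes the file has the `v2.0 raw` header followed by one hex word per line, as makerom3.py emits; the helper names `parse_hex` and `count_until_sentinel` are hypothetical.

```python
UNIMP = 0xC0001073  # end-of-program sentinel from the text

def parse_hex(text):
    """Parse a 'v2.0 raw' image (one hex word per line) into a list of ints."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "v2.0 raw":
        raise ValueError("missing 'v2.0 raw' header")
    return [int(ln, 16) for ln in lines[1:]]

def count_until_sentinel(words):
    """Count the words that come before the first UNIMP marker."""
    for n, iw in enumerate(words):
        if iw == UNIMP:          # the equality comparator fires here
            return n
    raise ValueError("no unimp sentinel found")

# Two instructions (addi a0,x0,10 and add a2,a0,a1) followed by unimp.
sample = "v2.0 raw\n00a00513\n00b50633\nc0001073\n"
print(count_until_sentinel(parse_hex(sample)))  # 2
```

If the marker were missing, the software model raises an error; the hardware has no such luxury and would simply keep reading rows.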

3. Byte Addresses vs. Word Addresses¶

The ROM is indexed by word number (row 0, row 1, row 2, ...), but a real processor's Program Counter holds a byte address. RISC-V instructions are 4 bytes each, so consecutive instructions live at byte addresses 0, 4, 8, 12, ....

Byte address (PC):   0    4    8    12   16  ...
Word address (ROM):  0    1    2    3    4   ...

To turn a byte address into a word address, divide by 4 — which in hardware is just a right shift by 2, done with a splitter that drops the bottom two bits:

word_address = byte_address >> 2     (drop the low 2 bits)

In Project 5 the analyzer drives the ROM with a counter, and you can choose to count by words directly. The byte-vs-word distinction becomes essential in Project 6, where the real PC is a byte address and you must shift it before indexing instruction memory. Either way, the rule is the same: every instruction here is a full 32-bit word stored at a multiple of 4 (we do not use RISC-V's 16-bit compressed instructions), so the low two bits of an instruction address are always 0 and carry no information for word-addressed memory.
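The conversion is a one-line shift. A minimal sketch, with `byte_to_word` as an illustrative name:

```python
def byte_to_word(byte_addr):
    """Drop the low two bits, the software equivalent of the splitter."""
    return byte_addr >> 2

for pc in (0, 4, 8, 12, 16):
    print(pc, "->", byte_to_word(pc))
```

Each printed pair matches one column of the byte/word table above.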

4. Instruction Memory: Four ROMs and a Program Selector¶

A single ROM holds one program. Project 5 must analyze four test programs — fib_rec_s, is_pal_rec_s, get_bitseq_s, and quadratic_s — so we build an instruction memory subcircuit that holds all four and lets a 2-bit selector choose which one to read.

The subcircuit's interface looks like a single ROM with an extra select input:

  PN ──┤2 ─┐
            │   ┌──────────────────┐
ADDR ──┤8 ──┼──>│ Instruction Mem  │── IW ──┤32
            │   │   (4 ROMs +      │
            └──>│    selector MUX) │
                └──────────────────┘
   PN = Program Number  (0,1,2,3 picks one of four programs)

Inside, all four ROMs share the same ADDR input but each is loaded with a different program's .hex file. A 32-bit, 4-input multiplexer selects which ROM's output reaches IW, controlled by the 2-bit PN.

flowchart LR
    ADDR["ADDR (8)"] --> R0
    ADDR --> R1
    ADDR --> R2
    ADDR --> R3
    R0["ROM 0: fib_rec_s"] --> M{"MUX 4:1"}
    R1["ROM 1: is_pal_rec_s"] --> M
    R2["ROM 2: get_bitseq_s"] --> M
    R3["ROM 3: quadratic_s"] --> M
    PN["PN (2)"] --> M
    M --> IW["IW (32)"]

Truth table for the program selector:

PN	Selected ROM	Program
`00`	ROM 0	`fib_rec_s`
`01`	ROM 1	`is_pal_rec_s`
`10`	ROM 2	`get_bitseq_s`
`11`	ROM 3	`quadratic_s`

This is exactly the pattern from the handwritten notes: four ROM blocks, each fed the same 8-bit ADDR, each producing a 32-bit output, all funneled into one multiplexer whose select line is the 2-bit PN. The single 32-bit IW output then flows on to the decoder.
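The same structure can be modeled in software to check your understanding before wiring it up. This is a sketch under stated assumptions: the ROM contents are placeholders rather than the real test programs, unused rows are assumed to read as 0, and the names `make_rom` and `instruction_memory` are invented for the example.

```python
ROWS = 256  # 8-bit address

def make_rom(words):
    """Pad a program to 256 rows. Filling unused rows with 0 is an assumption."""
    if len(words) > ROWS:
        raise ValueError("program does not fit in an 8-bit-addressed ROM")
    return list(words) + [0] * (ROWS - len(words))

def instruction_memory(roms, pn, addr):
    """Model of the subcircuit: shared ADDR, 4:1 MUX on the outputs."""
    outputs = [rom[addr & 0xFF] for rom in roms]   # all four ROMs are read
    return outputs[pn & 0b11]                      # PN selects one of them

# Placeholder contents, not the real test programs.
roms = [make_rom([0x11111111]), make_rom([0x22222222]),
        make_rom([0x33333333]), make_rom([0x44444444])]

print(hex(instruction_memory(roms, 0b10, 0)))  # 0x33333333
```

Note that every ROM is read on every access; the multiplexer only decides which result is passed on, which mirrors how the hardware behaves.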

5. The Decoder: One Input Value, One Hot Output¶

A decoder takes a small binary input and asserts exactly one of many outputs — the one whose index equals the input value. It is the hardware version of "use this number to turn on line n." A 2-to-4 decoder takes a 2-bit select S = (S1, S0) and drives four outputs r0, r1, r2, r3, exactly one of which is high.

       ┌────────┐── r0
 S ──┤2│ 2->4   │── r1
       │decoder │── r2
       └────────┘── r3

Truth table¶

S1	S0	r3	r2	r1	r0
0	0	0	0	0	1
0	1	0	0	1	0
1	0	0	1	0	0
1	1	1	0	0	0

Each input combination lights up exactly one output — this is called a one-hot output.

Sum-of-products equations¶

Reading the rows where each output is 1 gives the decoder logic directly (using · for AND and an overbar for NOT):

r0 = \overline{S1} · \overline{S0}      <- S1=0, S0=0
r1 = \overline{S1} · S0                 <- S1=0, S0=1
r2 = S1 · \overline{S0}                 <- S1=1, S0=0
r3 = S1 · S0                            <- S1=1, S0=1

Written more carefully with the inversions explicit:

r0 = NOT(S1) AND NOT(S0)
r1 = NOT(S1) AND     S0
r2 =     S1  AND NOT(S0)
r3 =     S1  AND     S0

Each output is one AND gate fed by the appropriate true or complemented select lines. This is sum-of-products in its simplest form — every output term is a single product, because exactly one row makes it 1.
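The four product terms translate directly into code, which makes it easy to check them against the truth table. A small sketch (the function name `decoder_2to4` is illustrative):

```python
def decoder_2to4(s1, s0):
    """Return (r0, r1, r2, r3) computed from the four product terms."""
    n1, n0 = 1 - s1, 1 - s0   # the two NOT gates
    r0 = n1 & n0
    r1 = n1 & s0
    r2 = s1 & n0
    r3 = s1 & s0
    return r0, r1, r2, r3

for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, decoder_2to4(s1, s0))
```

Every printed tuple contains exactly one 1, which is the one-hot property.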

Why the analyzer needs a decoder¶

In the counting circuit, the 3-bit inum (instruction type, 0..7) feeds a 3-to-8 decoder. The one-hot output enables exactly one of the eight count registers, so only the counter for the current instruction's type gets incremented. With an enable input on the decoder (driven only when an instruction is valid), you gate the whole operation; this is the same "decoder with enable" idea used for the register file write port. The selected count register's value is routed through a multiplexer to an adder, incremented, and written back.

6. The Encoder: Many Inputs, One Binary Code¶

An encoder is the inverse of a decoder. It takes n input lines — of which (ideally) exactly one is high — and outputs the binary index of that line. A 4-to-2 encoder turns four one-hot inputs into a 2-bit number.

 a0 ──┤0 ┐
 a1 ──┤1 │ encoder │──┤2── d
 a2 ──┤2 │         │
 a3 ──┤3 ┘

Truth table (one-hot in, binary out)¶

a3	a2	a1	a0	d1	d0	value
0	0	0	1	0	0	0
0	0	1	0	0	1	1
0	1	0	0	1	0	2
1	0	0	0	1	1	3

If a2 = 1 and the rest are 0, the output is 10 (binary 2). The encoder names the active line.

The ambiguity problem¶

A plain encoder assumes exactly one input is high. What happens if two inputs are asserted at once — say a1 = 1 and a2 = 1? The truth table has no row for that combination, so the output is undefined (in practice it may produce garbage like the OR of the two encodings). For our instruction analyzer this is a real hazard: more than one comparator can fire for a single instruction, because some encodings overlap (we will see j vs. jal below). We need deterministic behavior.
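To see the failure concretely, consider one common gate-level implementation of a 4-to-2 encoder, where each output bit is the OR of the inputs whose index has that bit set. This is an assumed implementation for illustration; a particular simulator component may behave differently on invalid inputs.

```python
def encoder_4to2(a3, a2, a1, a0):
    """OR-based 4-to-2 encoder; only meaningful for one-hot inputs."""
    d1 = a2 | a3   # indices 2 and 3 have bit 1 set
    d0 = a1 | a3   # indices 1 and 3 have bit 0 set
    return (d1 << 1) | d0

print(encoder_4to2(0, 1, 0, 0))  # 2  (valid one-hot input)
print(encoder_4to2(0, 1, 1, 0))  # 3  (a1 and a2 both high: wrong answer)
```

With a1 and a2 asserted together, the output names input 3, which was never asserted.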

7. The Priority Encoder¶

A priority encoder resolves the multiple-asserted-inputs problem by ranking the inputs. When several inputs are high, it outputs the index of the highest-priority asserted input (conventionally the highest-numbered one) and ignores the rest.

 input 0:  1 ──┤   priority │
 input 1:  0 ──┤   encoder  │──┤2── 2
 input 2:  1 ──┤            │
 input 3:  0 ──┤            │
   inputs 0 and 2 asserted  ->  highest-priority asserted input = index 2

In the example above, inputs 0 and 2 are asserted (reading from index 3 down to 0, the inputs are 0, 1, 0, 1). The priority encoder picks the highest asserted index, 2, and outputs 10. A plain encoder would have been undefined here.

This is exactly the property we exploit in analyze_decode:

Wire every instruction's "I matched" comparator output to a distinct priority-encoder input.
Assign a fixed inum to each input position.
Even if two comparators fire, the priority encoder yields a single, deterministic inum.

So the priority ordering is not cosmetic — it is how we break ties between overlapping instruction encodings. We will order the inputs so that the more specific instruction wins.

flowchart LR
    C0["opcode == i-type?"] --> P0["input 0"]
    C1["opcode == r-type?"] --> P1["input 1"]
    C2["...load?"] --> P2["input 2"]
    Cj["j detected?"] --> P6["input 6"]
    Cjal["jal detected?"] --> P5["input 5"]
    P0 --> PE[Priority Encoder]
    P1 --> PE
    P2 --> PE
    P5 --> PE
    P6 --> PE
    PE --> INUM["inum (3 bits)"]
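A priority encoder is easy to model in software: scan from the highest-numbered input down and stop at the first one that is high. The sketch below assumes the highest-numbered-wins convention described above; the second return value, which reports whether any input was asserted, is an illustrative addition.

```python
def priority_encode(inputs):
    """inputs[i] is the bit on input i. Return (index, any_asserted)."""
    for i in range(len(inputs) - 1, -1, -1):   # highest index first
        if inputs[i]:
            return i, True
    return 0, False   # nothing asserted; index 0 is only a placeholder

print(priority_encode([1, 0, 1, 0]))  # (2, True): inputs 0 and 2 are high
```

Unlike the plain encoder, the result is defined for every input combination.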

8. The Instruction Decoder: `analyze_decode`¶

We now combine everything into the instruction decoder, named analyze_decode. It takes a 32-bit instruction word and outputs a 3-bit instruction number:

        ┌───────────────┐
 IW ──┤32│ analyze_decode│──┤3── inum
        └───────────────┘

The instruction-number (inum) mapping¶

The 3-bit inum encodes which of eight instruction categories the word belongs to:

inum	Type	Example	Notes
0	i-type	`addi`, `andi`	arithmetic/logic with immediate
1	r-type	`add`, `sub`, `mul`	register-register
2	load	`lw`, `ld`, `lb`	also an I-type opcode, special-cased
3	s-type	`sw`, `sd`, `sb`	store
4	b-type	`beq`, `bne`, `blt`, `bge`	conditional branch
5	jal	`jal ra, ...` (`call`)	jump and link
6	j	`j label`	unconditional jump
7	jalr	`jalr`, `ret`	jump and link register

How classification works¶

Split the instruction word. A splitter taps the opcode (bits 6:0, the lowest 7 bits) out of the 32-bit IW. Other fields (such as bits 11:7, the rd field) are tapped when needed to disambiguate.
Compare against constants. Each instruction type has a known 7-bit opcode. An equality comparator (eq) compares the extracted opcode against a hard-wired constant; its output is 1 when they match.
Feed the matches into a priority encoder. Each comparator output goes to a priority-encoder input whose index is the desired inum. The encoder collapses the (possibly overlapping) match signals into a single 3-bit answer.

            ┌──────── splitter ─────────┐
 IW ──┤32──>│  opcode = bits [6:0]      │── 7 ──┐
            └───────────────────────────┘       │
                                                 v
   constant 0b0010011 (i-type)  ──>  [ eq ] ──> priority encoder input 0
   constant 0b0110011 (r-type)  ──>  [ eq ] ──> priority encoder input 1
   ...                                              ...
                                                    |
                                                    v
                                              inum (3 bits)

The opcodes used (these are standard RISC-V):

Type	Opcode (7 bits)	Hex
i-type	`0010011`	`0x13`
load (I-type opcode)	`0000011`	`0x03`
r-type	`0110011`	`0x33`
s-type	`0100011`	`0x23`
b-type	`1100011`	`0x63`
jal	`1101111`	`0x6F`
jalr	`1100111`	`0x67`

Note that loads use the I-type instruction format (the same rd, rs1, and 12-bit immediate layout as addi) but have their own opcode (0x03 rather than 0x13), so a comparator on the full 7-bit opcode separates them cleanly.
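The splitter and the bank of equality comparators can be sketched in Python with masks and shifts. The opcode constants come from the table above; the function names `split_fields` and `comparator_outputs` are hypothetical.

```python
OPCODES = {
    "i-type": 0x13, "load": 0x03, "r-type": 0x33, "s-type": 0x23,
    "b-type": 0x63, "jal": 0x6F, "jalr": 0x67,
}

def split_fields(iw):
    """Model of the splitter: return (opcode, rd) from a 32-bit word."""
    opcode = iw & 0x7F          # bits 6:0
    rd = (iw >> 7) & 0x1F       # bits 11:7
    return opcode, rd

def comparator_outputs(iw):
    """One equality comparator per opcode constant; 1 means 'matched'."""
    opcode, _ = split_fields(iw)
    return {name: int(opcode == const) for name, const in OPCODES.items()}

print(split_fields(0x00A00513))                  # (19, 10): addi a0, x0, 10
print(comparator_outputs(0x00B50633)["r-type"])  # 1: add a2, a0, a1
```

Because the seven opcode constants are all different, at most one of these comparators fires for any word. Overlap only appears once the extra rd test for j is added.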

9. Special Cases: `j` vs `jal`, and the `unimp` Marker¶

Opcode alone is not always enough. The trickiest case is distinguishing the unconditional jump j from a call jal — because in RISC-V they share the same opcode (1101111). j is a pseudo-instruction; the assembler emits it as jal x0, offset. The difference is the destination register rd:

j label      ==  jal x0, label    (rd = x0  => discard return address)
jal ra, sub  ==  jal ra, sub      (rd = ra  => save PC+4 as return addr)

So we look at two fields:

the opcode (bits 6:0) tells us it is in the jal family, and
the rd field (bits 11:7) tells us which: rd == 0 means it is a plain j; rd != 0 means it is a real jal.

In hardware this is two comparators feeding an AND gate:

  opcode == jal-opcode  ──┐
                          AND ──> "this is a j"  (inum 6)
  rd     == 00000        ──┘

The same opcode match, but with rd != 0, identifies jal (inum 5). If the jal comparator tests only the opcode, it also fires for j, so two match lines can be high at once. The priority encoder resolves this: j sits on input 6 and jal on input 5, and the encoder reports the highest asserted input, so the more specific case (rd == 0) wins. The software reference mirrors this by checking for j before jal.

// Order matters: test the more specific case (j) first.
if (opcode == JAL_OPCODE) {
    if (rd == 0)
        return 6;        // j   (jal x0, offset)
    else
        return 5;        // jal (saves return address in rd)
}

The unimp end marker (0xC0001073) is simpler: instead of inspecting individual fields, the analyzer compares the entire 32-bit IW against the constant 0xC0001073. On a match it asserts done, which stops the clock so the counters freeze at their final values. unimp is deliberately an unimplemented encoding, and its opcode (1110011) matches none of the seven opcodes listed in Section 8, so it can never be confused with a real instruction we want to count.

flowchart TD
    IW["IW (32 bits)"] --> OP["opcode = IW[6:0]"]
    IW --> RD["rd = IW[11:7]"]
    OP --> J1{"opcode == jal?"}
    RD --> J2{"rd == 0?"}
    J1 -->|yes| AND{AND}
    J2 -->|yes| AND
    AND -->|"j"| I6["inum = 6"]
    J1 -->|"yes, rd != 0"| I5["inum = 5 (jal)"]
    IW --> U{"IW == 0xC0001073?"}
    U -->|yes| DONE["done = 1, stop clock"]
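Putting the comparators and the priority encoder together gives a software model of the whole classifier. This is one possible wiring, not necessarily the reference design: it assumes the jal line tests only the opcode (so it also fires for j) and relies on input 6 outranking input 5. Returning 0 when nothing matches is also an assumption of the model.

```python
JAL_OPCODE = 0x6F

def priority_encode(lines):
    """Index of the highest-numbered asserted line (0 if none)."""
    for i in range(len(lines) - 1, -1, -1):
        if lines[i]:
            return i
    return 0

def analyze_decode(iw):
    op = iw & 0x7F            # splitter: bits 6:0
    rd = (iw >> 7) & 0x1F     # splitter: bits 11:7
    lines = [0] * 8           # one match line per inum
    lines[0] = int(op == 0x13)                      # i-type
    lines[1] = int(op == 0x33)                      # r-type
    lines[2] = int(op == 0x03)                      # load
    lines[3] = int(op == 0x23)                      # s-type
    lines[4] = int(op == 0x63)                      # b-type
    lines[5] = int(op == JAL_OPCODE)                # jal (fires for j too)
    lines[6] = int(op == JAL_OPCODE and rd == 0)    # j only
    lines[7] = int(op == 0x67)                      # jalr
    return priority_encode(lines)

print(analyze_decode(0x0080006F))  # 6: jal x0, 8 (that is, j)
print(analyze_decode(0x008000EF))  # 5: jal ra, 8
```

For the first word both line 5 and line 6 are high, and the priority encoder reports 6. For the second only line 5 is high.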

10. Putting It Together: The Counting Datapath¶

With instruction memory feeding analyze_decode, the full Project 5 analyzer is a small sequential circuit. Here is the per-clock flow described in the lecture:

The user selects a program with PROG (the 2-bit PN); a CLR pulse initializes the address counter and all eight count registers to zero.
On each clock tick, the counter advances the address to fetch the next instruction word from instruction memory.
The fetched IW is compared to the unimp sentinel. If equal, done goes high and the clock is stopped — counting is finished.
Otherwise, analyze_decode produces the 3-bit inum.
A 3-to-8 decoder turns inum into a one-hot enable, selecting exactly one of the eight count registers.
A multiplexer routes that register's current value to an adder (+1); the result is written back to the same register on the clock edge.

flowchart LR
    PROG["PROG (2)"] --> IMEM[Instruction Memory]
    CTR[Address Counter] --> IMEM
    IMEM --> IW["IW (32)"]
    IW --> SENT{"== unimp?"}
    SENT -->|yes| STOP[done -> stop CLK]
    IW --> AD[analyze_decode]
    AD --> INUM["inum (3)"]
    INUM --> DEC[3-to-8 Decoder]
    DEC --> REGS["8 count registers"]
    REGS --> MUXSEL[MUX selects current count]
    MUXSEL --> ADDER["+1"]
    ADDER --> REGS
    CLK[CLK] --> CTR
    CLK --> REGS

The eight registers correspond exactly to the Project 5 outputs: ITYPE, RTYPE, LOAD, STYPE, BTYPE, JAL, J, and JALR, plus a TOTAL counter that increments on every (non-sentinel) instruction. This is the hardware realization of the C counts[inum]++ loop from Section 1.
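The per-clock flow can be simulated in a short loop, one iteration per clock tick. This is a simplified behavioral sketch, not a cycle-accurate model of the circuit: `classify` is a software stand-in for analyze_decode, mapping unknown opcodes to 0 is an assumption, and the sample program is made up for the example.

```python
UNIMP = 0xC0001073

def classify(iw):
    """Software stand-in for analyze_decode (unknown opcodes map to 0 here)."""
    op, rd = iw & 0x7F, (iw >> 7) & 0x1F
    if op == 0x6F:
        return 6 if rd == 0 else 5
    return {0x13: 0, 0x33: 1, 0x03: 2, 0x23: 3, 0x63: 4, 0x67: 7}.get(op, 0)

def run_analyzer(rom):
    counts = [0] * 8      # the eight count registers, cleared by CLR
    total = 0             # the TOTAL counter
    addr = 0              # address counter
    while addr < len(rom):
        iw = rom[addr]                                # fetch
        if iw == UNIMP:                               # sentinel comparator: done
            break
        inum = classify(iw)
        enable = [int(i == inum) for i in range(8)]   # 3-to-8 decoder, one-hot
        counts = [c + e for c, e in zip(counts, enable)]  # only one register changes
        total += 1
        addr += 1                                     # next clock tick
    return counts, total

program = [0x00A00513,  # addi a0, x0, 10
           0x00B50633,  # add a2, a0, a1
           0x008000EF,  # jal ra, 8
           0x0080006F,  # j 8
           0x00008067,  # ret
           UNIMP,
           0x00A00513]  # past the sentinel: must not be counted
print(run_analyzer(program))  # ([1, 1, 0, 0, 0, 1, 1, 1], 5)
```

The word after the sentinel is never counted, which is the behavior the done signal provides in hardware.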

Key Concepts¶

Concept	Definition	Example
ROM	Read-only memory; address in, stored data out, cannot be written at runtime	8-bit `A`, 32-bit `D` holds 256 instructions
Address width	Number of address lines; sets row count `2^n`	8 bits → 256 rows
Instruction word (IW)	A single 32-bit RISC-V machine instruction	`0x00A00513` (`addi a0,x0,10`)
Word vs byte address	ROM indexed by word; PC counts bytes; divide by 4	`addr_word = addr_byte >> 2`
Program selector (PN)	2-bit input choosing one of four ROMs via a MUX	`PN=10` selects `get_bitseq_s`
Decoder	Binary input asserts exactly one (one-hot) output	2-to-4: `S=10` → `r2=1`
Encoder	One-hot input produces its binary index	`a2=1` → output `10` (2)
Priority encoder	Encoder that outputs the highest-priority asserted input	inputs `0101` → output `2`
analyze_decode	Circuit mapping a 32-bit IW to a 3-bit instruction type `inum`	`add` → `inum=1`
inum	3-bit code naming an instruction type (0..7)	`5 = jal`, `6 = j`
Sentinel (unimp)	End-of-program marker that stops the clock	`0xC0001073`
opcode	Low 7 bits of IW selecting the instruction family	`0010011` = i-type

Practice Problems¶

Problem 1: ROM Capacity¶

A ROM has a 10-bit address input and a 32-bit data output. How many instructions can it hold, and what is its total capacity in bytes?

Solution:

Rows         = 2^10 = 1024 instructions
Bits per row = 32
Total bits   = 1024 x 32 = 32768 bits
Total bytes  = 32768 / 8 = 4096 bytes = 4 KiB

A 10-bit address names 1024 distinct rows, each holding one 32-bit RISC-V instruction, for 4 KiB of program storage.

Problem 2: Byte to Word Address¶

The PC holds the byte address 0x0000002C. What ROM (word) address should index instruction memory, and which instruction number is that in the program?

Solution:

byte address = 0x2C = 44 (decimal)
word address = 44 >> 2 = 44 / 4 = 11

So the PC points at the **12th** instruction (index 11, counting from 0). In hardware, drop the low two bits with a splitter: `0x2C = 0b101100`; removing the bottom two bits gives `0b1011 = 11`. Because RISC-V instruction addresses are always multiples of 4, those two bits are guaranteed to be `00`.

Problem 3: Decoder Equations¶

Write the sum-of-products equations for the third output (r2) of a 2-to-4 decoder, and state for which input (S1, S0) it is high.

Solution:

r2 = S1 · NOT(S0)        (S1 = 1, S0 = 0)

`r2` is high only for input `S = 10` (binary 2). It is a single AND gate fed by `S1` (true) and `S0` (complemented). This matches the decoder truth table: each output corresponds to exactly one input combination, so each output is one product term.

Problem 4: Plain Encoder vs Priority Encoder¶

A 4-to-2 encoder receives inputs a3 a2 a1 a0 = 0110 (both a1 and a2 asserted). What does a plain encoder produce, and what does a priority encoder produce?

Solution:

Plain encoder:    UNDEFINED.
  The truth table has no row for two asserted inputs.
  A typical OR-based implementation would output the bitwise OR
  of the two valid encodings: encode(2)=10 OR encode(1)=01 = 11 (=3),
  which is wrong - index 3 was never even asserted.

Priority encoder: 2 (binary 10).
  It picks the highest-priority (highest-numbered) asserted input,
  which is a2 at index 2, and ignores a1.

This is exactly why `analyze_decode` uses a **priority** encoder: overlapping comparator matches must resolve to one deterministic `inum`.

Problem 5: Classify Instructions¶

For each instruction word, give the opcode (bits 6:0) and the inum:

(a) add a2, a0, a1 → 0x00B50633
(b) j label → opcode 1101111, rd = x0
(c) jal ra, sub → opcode 1101111, rd = ra (x1)

Solution:

(a) 0x00B50633 = 0000 0000 1011 0101 0000 0110 0011 0011
    opcode = bits[6:0] = 0110011  -> r-type
    inum = 1

(b) opcode = 1101111 (jal family), rd = 00000 (x0)
    rd == 0  =>  this is a plain j
    inum = 6

(c) opcode = 1101111 (jal family), rd = 00001 (x1 = ra)
    rd != 0  =>  this is a real jal (call)
    inum = 5

(b) and (c) share an opcode; only the `rd` field distinguishes them. The decoder must test `rd == 0` and prioritize `j` over `jal`.

Problem 6: Why a Sentinel?¶

A ROM always returns some 32-bit value for every address, even past the end of your program. Why does the analyzer need a unimp sentinel, and what would happen without one?

Solution:

A ROM has no concept of "end of data." Addresses beyond the last real
instruction return whatever was loaded there (often 0x00000000, which is
not a valid RISC-V instruction, yet the circuit would still assign it
some inum and count it). Without a sentinel, the counter would keep
advancing and the analyzer would count garbage rows as instructions,
inflating the totals and never stopping.

The unimp marker (0xC0001073) is an intentionally unimplemented encoding.
The circuit compares each fetched IW to this constant; on a match it
asserts `done`, which stops the clock so the counters hold their final
values. unimp can never be mistaken for a real instruction we want to
count, so the boundary is unambiguous.

Summary¶

A ROM is a hardware lookup table: an n-bit address selects one of 2^n stored rows. With an 8-bit address and 32-bit rows, one ROM holds 256 RISC-V instructions — exactly Project 5's limit.
Programs are loaded from object files by running objdump through makerom3.py to produce a Digital .hex file; an appended unimp (0xC0001073) marks the end so the circuit knows when to stop.
The ROM is word-addressed but the PC is byte-addressed, so divide the byte address by 4 (right shift by 2, dropping the always-zero low two bits) to get the word index.
Instruction memory bundles four ROMs behind a 4-input multiplexer, with a 2-bit program number PN selecting which test program's IW reaches the decoder.
A decoder turns a binary value into a one-hot output (one output high), used to enable exactly one of the eight type-count registers; its sum-of-products equations come straight from the truth table.
An encoder is the inverse of a decoder, but a plain encoder is undefined when multiple inputs fire — so we use a priority encoder, which deterministically reports the highest-priority asserted input.
analyze_decode classifies an instruction by splitting out the opcode, comparing it against known constants with equality comparators, and feeding the matches into a priority encoder that emits a 3-bit inum (0..7).
Some encodings overlap — j and jal share an opcode — so the decoder also inspects the rd field and orders priorities to pick the more specific case (j when rd == 0), the same way the C reference checks j before jal.
