
Lab: Data Memory — LW and SW

Overview

This hands-on lab session adds data memory to the single-cycle RISC-V processor we have been building across Lab 10 and Project 6. We use a Digital RAM component configured with 64-bit words so that ld/sd (load/store doubleword) work almost for free. We then wrap the RAM with load logic (after the RAM) and store logic (before the RAM) so the same 64-bit memory can also service lw/sw (word) and, by the same pattern, lb/sb (byte). The central idea is address arithmetic: the ALU produces a 64-bit byte address, but the RAM is indexed by doubleword address, so we shift right by 3, then use the low address bits to pick which sub-word inside a 64-bit cell we are touching. This page walks through the datapath, the memory-layout intuition, the control lines (MLD, MST, MSZ, M2R), and the step-by-step circuit construction with worked examples and common mistakes.

Learning Objectives

  • Configure the Digital RAM (Separated Ports) component for use as processor data memory and state its capacity.
  • Convert a 64-bit byte address (from the ALU) into the doubleword address the RAM expects.
  • Identify which low address bits select a word vs. a byte within a 64-bit memory cell.
  • Wire the ld/sd datapath: ALU → address splitter → RAM, and RAM Dout → register file write-back MUX.
  • Build the load logic for lw: split the 64-bit Dout into two 32-bit halves, select with address bit 2, and sign-extend to 64 bits.
  • Build the store logic for sw: read-modify-write a 64-bit cell so only the targeted 32-bit half is updated.
  • Add and set the new control lines (MLD, MST, MSZ, M2R) in the instruction decoder spreadsheet.
  • Recognize and avoid common wiring/control mistakes (wrong shift amount, misaligned selector bit, forgetting str+ld together for stores).

Prerequisites

  • A working Project 6 processor through JAL/JALR and conditional branches.
  • RISC-V instruction formats, especially I-type (loads) and S-type (stores) (Project 4 / Lab 03).
  • Byte addressing, words vs. doublewords, little-endian layout (Project 3).
  • Sign extension, shifting, and masking (Project 3, Lab 09).
  • Multiplexers, splitters, mergers, and the RAM component in Digital (Lab 09, Project 5).
  • The register-file write-back path and WDsel MUX from Lab 10 / Project 6.

1. Where Data Memory Fits

So far our processor can fetch instructions, decode them, read/write registers, run the ALU, and change control flow with jumps and branches. The last missing capability is touching memory: loading values from and storing values to RAM. Our test programs use this for the stack (saving ra and locals), but the same memory could back a heap.

RISC-V memory instructions come in matched load/store pairs, organized by data size:

| Mnemonic | Meaning | Size | Format | Action |
|----------|---------|------|--------|--------|
| ld | load doubleword | 64-bit | I-type | rd = mem[rs1 + imm] |
| sd | store doubleword | 64-bit | S-type | mem[rs1 + imm] = rs2 |
| lw | load word | 32-bit | I-type | rd = sext(mem32[rs1 + imm]) |
| sw | store word | 32-bit | S-type | mem32[rs1 + imm] = rs2[31:0] |
| lb | load byte | 8-bit | I-type | rd = sext(mem8[rs1 + imm]) |
| sb | store byte | 8-bit | S-type | mem8[rs1 + imm] = rs2[7:0] |

Lab strategy (incremental): get ld/sd working first with a plain 64-bit RAM, then add the wrapper logic for lw/sw. Once lw/sw work, lb/sb follow the exact same pattern with a narrower slice. Do not try to build everything at once.

flowchart LR
    A["Register File<br/>RD0 = base addr"] --> B["ALU<br/>TA = base + imm"]
    I["ImmDecoder<br/>imm-i / imm-s"] --> B
    B --> C["Addr splitter<br/>byte to DW addr"]
    C --> D["RAM<br/>64-bit cells"]
    R1["RD1 = store value"] --> D
    D --> E["Load logic<br/>select + sign-extend"]
    E --> F["Write-back MUX<br/>M2R / WDsel"]
    F --> A

    style D fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:2px

2. The Digital RAM Component

We use Digital's RAM, Separated Ports component. "Separated ports" means it has distinct inputs for the address, the data to write (Din), and a separate data output (Dout) — exactly what a processor wants.

Configure it as:

| Setting | Value | Why |
|---------|-------|-----|
| Data bits | 64 | One cell holds a full RISC-V doubleword, so ld/sd are trivial. |
| Addr bits | 7 | 2^7 = 128 cells × 8 bytes = 1024 bytes of data memory. |

The component's pins, as drawn in the lab notes:

            +-----------------------+
   A  --/-->| A (addr)         Dout |--/--> 64 bits out
   7        |                       |   64
            |                       |
 Din --/--> | Din               RAM |
   64       |                       |
 MST -----> | str                   |
            |                       |
 CLK -----> | clk                   |
            |                       |
 MLD -----> | ld                    |
            +-----------------------+
            data bits: 64
            addr bits: 7

| Pin | Driven by | Meaning |
|-----|-----------|---------|
| A | address splitter output | doubleword address (7 bits) of the cell to access |
| Din | store logic | 64-bit value to write |
| Dout | (output) | 64-bit value read from cell A |
| str | MST control line | when 1, write Din into cell A on the clock edge |
| clk | processor CLK | synchronous write timing |
| ld | MLD control line | when 1, drive Dout with the contents of cell A |

Keep the RAM at the top level. A critical lab tip: even though you will wrap the RAM with extra logic, leave the RAM component itself at the top level of your processor circuit. That is the only way Digital lets you open the RAM during simulation to inspect (and pre-load) its contents while debugging. Put the load logic after it and the store logic before it, but do not bury the RAM inside a sub-circuit.


3. Byte Addresses vs. Doubleword Addresses

This is the single most important idea in the lab, and it caused the most confusion in class.

Registers always hold byte addresses. When a program computes sp + 8, that is a byte address. But our RAM is an array of 64-bit (8-byte) cells, and its A input selects a cell, not a byte. So we must convert.

A byte address divided by 8 gives the doubleword (DW) address:

DW address = byte address / 8 = byte address >> 3

Dividing by 8 just drops the low 3 bits. In hardware we do not need a divider — we use a splitter to take bits [9:3] of the 64-bit ALU output and feed those 7 bits to the RAM A input. The low 3 bits [2:0] are the offset inside the cell and are not sent to the RAM address; instead they tell us which sub-word we are touching.

 byte addr bit:  ... 9  8  7  6  5  4  3 | 2 | 1  0
                 \________ DW addr ______/ |word|byte|
                        (-> RAM A, 7 bits)  |sel | sel |

The notes drew this as the ALU output TA (64 bits, a byte address) entering a splitter labeled 3-9, whose 7-bit output drives the RAM A input.

                  +-----+
  RD0 (rs1) ----->|  A  |
  base addr       | ALU |  TA  +--------+      A
                  |  B  |--/--->| 3 - 9  |--/-->  (RAM addr)
  imm-i/imm-s --->+-----+  64   +--------+  7
                          byte    splitter   DW addr
                          addr
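
The splitter's job can be checked with a few lines of Python. This is only a sketch for reasoning about addresses, not part of the circuit; the function name `dw_address` and the constant `ADDR_BITS` are hypothetical, and the 7-bit width is taken from the RAM configuration in Section 2.

```python
ADDR_BITS = 7  # the RAM "Addr bits" setting (assumed from Section 2)

def dw_address(ta: int) -> int:
    """Model the 3-9 splitter: keep bits [9:3] of the byte address TA."""
    return (ta >> 3) & ((1 << ADDR_BITS) - 1)

print(dw_address(8))     # 1
print(dw_address(1016))  # 127
print(dw_address(1024))  # 0, because bit 10 is not wired to the RAM
```

The last line shows that any byte address of 1024 or more wraps around, since only bits [9:3] reach the RAM.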

The three address "views"

The class drew the same memory three ways at once. The picture below mirrors that "memory layout" diagram: the same storage seen as raw bytes, as 32-bit words, and as 64-bit doublewords.

 byte addr   word addr   dword addr
 ----------  ---------   ----------
   ...
    16  --*    (word 4)    (dword 2)
    15
    14   } word 3 ----,
    13                 } dword 1
    12  ----------'
    11
    10   } word 2 ---,
     9                } dword 1
     8  -----------'
     7
     6   } word 1 ---,
     5                } dword 0
     4  -----------'
     3
     2   } word 0 ---,
     1                } dword 0
     0  -----------'

Reading the relationships off the diagram:

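| Address kind | Conversion from byte address | Low bits used as offset |
|--------------|------------------------------|-------------------------|
| byte | (itself) | none |
| word | byte >> 2 | bits [1:0] = byte within word |
| doubleword | byte >> 3 | bits [2:0] = byte within dword |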

Selector bits, color-coded as in the notes

The handwritten layout used colored boxes to show how the low address bits act as selectors:

  • Word selector (green) = byte address bit 2. Within one 64-bit cell there are two 32-bit words: bit 2 = 0 selects the lower word (bytes 0–3), bit 2 = 1 selects the upper word (bytes 4–7).
  • Byte selector (yellow) = byte address bits [1:0]. Within one 32-bit word there are four bytes; bits [1:0] pick which.
  64-bit cell  =  [ byte7 byte6 byte5 byte4 | byte3 byte2 byte1 byte0 ]
                  \________ upper word _____/ \_______ lower word ____/
                          (bit 2 = 1)                 (bit 2 = 0)
                  \__ b[1:0] pick byte __/   \__ b[1:0] pick byte __/
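
As a quick check of the selector bits, the sketch below pulls them out of a byte address. The function name `selectors` is hypothetical; the bit positions are the ones described above.

```python
def selectors(ta: int) -> tuple[int, int]:
    """Return (word_sel, byte_sel) taken from the low 3 bits of byte address TA."""
    word_sel = (ta >> 2) & 0b1   # bit 2: 0 = lower word, 1 = upper word
    byte_sel = ta & 0b11         # bits [1:0]: byte within the selected word
    return word_sel, byte_sel

for ta in (0, 4, 5, 7):
    print(ta, selectors(ta))
```

For byte address 5, for example, this gives word selector 1 and byte selector 1: the second byte of the upper word.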

Common mistake. Students shifted by 2 (word divide) instead of 3 (doubleword divide) when feeding the RAM A input. Because the RAM holds 64-bit cells, the address must be a doubleword address — shift right by 3, take bits [9:3].


4. Step 1 — Implementing ld / sd

Start with the easy case: 64-bit loads and stores. No sub-word logic is needed because the RAM cell is a doubleword.

Datapath

  RD0 (rs1) --+
  base addr   |   +-----+
              +-->|  A  |  TA   +-------+   A    +---------+ Dout  +-----------+
                  | ALU |--/--->| 3 - 9 |--/---->|   RAM   |--/--->| WDsel MUX |--> RegFile WD
  imm-i  ------->|  B  |  64    +-------+   7    |  64-bit |  64   +-----------+
  (ld, I-type)   +-----+                        |  cells  |
  imm-s  -------> (for sd, B = imm-s)            |         |
                                       RD1 ----> | Din     |
                                  (sd store val) |         |
                                       MST ----> | str     |
                                       CLK ----> | clk     |
                                       MLD ----> | ld      |
                                                 +---------+

Step-by-step:

  1. Compute the target address in the ALU. TA = base + imm. For ld the offset is the I-type immediate; for sd it is the S-type immediate. Both reduce to "base register + signed offset", an ALU add.
  2. Convert byte → DW address. Route TA through a splitter taking bits [9:3] (7 bits) to the RAM A input.
  3. For sd: connect RD1 (the value of rs2) to Din, and assert MST = 1 (str). The write happens on the clock edge.
  4. For ld: assert MLD = 1 (ld) so Dout reflects cell A. Route Dout (64 bits) back to the register file write data through the write-back MUX.

Write-back MUX (M2R)

Loads must write the memory output into the register file, not the ALU result. The notes show two equivalent options:

  • Expand the existing WDsel MUX to add a memory input, or
  • Add a new 2-input MUX (M2R, "memory to register") that selects between the ALU result and RAM Dout; its output feeds the existing WDsel MUX.
flowchart LR
    ALU["ALU result"] --> M2R{"M2R MUX"}
    RAM["RAM Dout"] --> M2R
    M2R --> WD["WDsel MUX"]
    PC4["PC + 4"] --> WD
    WD --> RF["RegFile WD"]

    style M2R fill:#bbf,stroke:#333,stroke-width:2px

Worked example: sd then ld on the stack

main:
    li   sp, 1024          # initialize stack pointer (byte address)
    li   t0, 0xABCD        # value to store
    sd   t0, -8(sp)        # mem[sp-8] = t0
    ld   t1, -8(sp)        # t1 = mem[sp-8]  -> t1 == 0xABCD
    unimp

Trace of the sd t0, -8(sp):

| Step | Value |
|------|-------|
| base = sp | 1024 (byte addr) |
| imm-s | -8 |
| TA = 1024 + (-8) | 1016 (byte addr) |
| DW addr = 1016 >> 3 | 127 → RAM A = 0b1111111 |
| Din = RD1 = t0 | 0xABCD |
| MST | 1 (write on clock edge) |

The matching ld recomputes TA = 1016, the same DW addr 127, asserts MLD, and Dout = 0xABCD flows back through M2R/WDsel into t1.
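
The same trace can be reproduced with a small software model. This is a minimal sketch that assumes the RAM starts out all zeros and ignores clock timing; the names `ram`, `sd`, `ld`, and `dw_address` are hypothetical stand-ins for the hardware blocks.

```python
MASK64 = (1 << 64) - 1
ram = [0] * 128                  # 128 cells of 64 bits (assumed zero at start)

def dw_address(ta: int) -> int:
    return (ta >> 3) & 0x7F      # splitter: bits [9:3] of the byte address

def sd(base: int, imm: int, rs2_value: int) -> None:
    ta = (base + imm) & MASK64                 # ALU add
    ram[dw_address(ta)] = rs2_value & MASK64   # Din = RD1 with MST = 1

def ld(base: int, imm: int) -> int:
    ta = (base + imm) & MASK64
    return ram[dw_address(ta)]                 # MLD = 1, Dout goes to write-back

sp, t0 = 1024, 0xABCD
sd(sp, -8, t0)
t1 = ld(sp, -8)
print(hex(t1))  # 0xabcd
```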


5. Step 2 — Load Word (lw): Load Logic After the RAM

For lw the RAM still hands us a full 64-bit cell, but the program wants only 32 bits, sign-extended back to 64. The extra circuitry sits after the RAM Dout and is gated by the memory-size control line MSZ.

The four sub-steps

  1. Read the 64-bit cell. Same as ld: compute TA, shift to DW addr, assert MLD. (lw targets are 4-byte aligned, i.e. byte addresses that are multiples of 4.)
  2. Split Dout into two 32-bit halves. Use a splitter: bits [31:0] = lower word W0, bits [63:32] = upper word W1.
  3. Select the right half with byte-address bit 2. Feed W0 and W1 into a 2-input MUX whose selector is TA bit 2 (the green "word selector", written 2-2 in the notes). Bit 2 = 0 → lower word; bit 2 = 1 → upper word.
  4. Sign-extend 32 → 64. The chosen 32-bit word goes through a sign extender to a 64-bit value. This is the lw candidate.

Finally, an outer MSZ MUX selects which width to write back:

  • MSZ = 0b00 → the lb value (byte, sign-extended)
  • MSZ = 0b10 → the lw value (word, sign-extended)
  • MSZ = 0b11 → the ld value (the raw 64-bit Dout)

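These values are the low two bits of the RISC-V funct3 field (lb = 000, lw = 010, ld = 011), so the control bits fall out of the instruction encoding naturally. The unused value 0b01 corresponds to halfword accesses (lh/sh), which are not covered here.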

                       +-----+ Dout (64)
   RAM ---------------->|     |----+------------------------------+
                        +-----+    |                              |
                                   | split                        | (ld path: full 64)
                                   v                              v
                          [63:32]=W1   [31:0]=W0           +---------------+
                                |  \      /                |               |
                         bit 2  |   \    /                 |   MSZ MUX     |
                        (word   |    \  /                  | 00 = lb val   |--> to M2R / WDsel
                        select) |   +------+   32  +------+ | 10 = lw val   |
                                +-->| WSMUX|--/--->| Sign |->|11 = ld(64)   |
                                    +------+       | Ext  | |               |
                                                   +------+ +---------------+
                                                     64
flowchart TD
    RAM["RAM Dout (64)"] --> SPLIT["Splitter"]
    SPLIT --> W0["W0 = Dout[31:0]"]
    SPLIT --> W1["W1 = Dout[63:32]"]
    W0 --> WS{"Word-select MUX<br/>(sel = TA bit 2)"}
    W1 --> WS
    WS --> SX["Sign-extend 32 to 64"]
    SX --> MSZ{"MSZ MUX"}
    RAM --> MSZ
    MSZ --> OUT["to M2R / WDsel"]

    style WS fill:#bbf,stroke:#333,stroke-width:2px
    style SX fill:#bfb,stroke:#333,stroke-width:2px
    style MSZ fill:#f9f,stroke:#333,stroke-width:2px

Why sign-extend? lw loads a signed 32-bit value into a 64-bit register. If the word's bit 31 is 1 (negative), the upper 32 bits of the register must all be 1. The sign extender copies bit 31 into bits [63:32]. (lwu, an unsigned variant, would zero-extend instead — not required for this lab.)

Worked example: lw selecting the upper word

Suppose cell at DW addr 0 contains 0x_FFFF8000_00000007 (upper word 0xFFFF8000, lower word 0x00000007).

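| Instruction | TA (byte) | bit 2 | selected word | sign-extended result |
|-------------|-----------|-------|---------------|----------------------|
| lw rd, 0(x0) | 0 | 0 | 0x00000007 (lower) | 0x0000000000000007 |
| lw rd, 4(x0) | 4 | 1 | 0xFFFF8000 (upper) | 0xFFFFFFFFFFFF8000 |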

The second case shows the sign extension in action: bit 31 of 0xFFFF8000 is 1, so the upper half fills with Fs.
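
The load logic for lw can be modeled in Python to check both rows of the table. This is a sketch of the split, word-select, and sign-extend steps only; the function names `sign_extend_32` and `lw_value` are hypothetical.

```python
def sign_extend_32(word: int) -> int:
    """Copy bit 31 into bits [63:32] of a 64-bit result."""
    word &= 0xFFFFFFFF
    if word & 0x80000000:
        return word | 0xFFFFFFFF00000000
    return word

def lw_value(dout: int, ta: int) -> int:
    w0 = dout & 0xFFFFFFFF               # splitter: Dout[31:0]
    w1 = (dout >> 32) & 0xFFFFFFFF       # splitter: Dout[63:32]
    selected = w1 if (ta >> 2) & 1 else w0   # word-select MUX on TA bit 2
    return sign_extend_32(selected)

cell0 = 0xFFFF800000000007
print(hex(lw_value(cell0, 0)))  # 0x7
print(hex(lw_value(cell0, 4)))  # 0xffffffffffff8000
```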


6. Step 3 — Store Word (sw): Store Logic Before the RAM

Storing a word is harder than loading one. The RAM can only write a whole 64-bit cell, but sw must change only 32 bits and leave the other 32 untouched. The solution is read-modify-write within a single clock cycle.

Key control trick. For sw (and sb), set both the RAM str and ld lines to 1. With ld = 1 the RAM presents the current cell contents on Dout (so we can read the half we must preserve), and with str = 1 it writes our newly assembled Din on the same clock edge.

The read-modify-write recipe

  1. Read the current cell D64cur from Dout (because MLD/ld is asserted).
  2. Extract the new word Wnew = bits [31:0] of RD1 (the value of rs2).
  3. Split D64cur into its two halves W0 = D64cur[31:0] and W1 = D64cur[63:32].
  4. Build two candidate 64-bit values with mergers:
       • replace lower word: merge(Wnew, W1) → [ W1 | Wnew ]
       • replace upper word: merge(W0, Wnew) → [ Wnew | W0 ]
  5. Choose with the word-index bit (TA bit 2). bit 2 = 0 → "replace lower"; bit 2 = 1 → "replace upper".
  6. MSZ store MUX picks between the sb, sw, and sd assembled values, then drives the RAM Din.

  RD1 (store value) --split--> Wnew = RD1[31:0]
                                  |
  RAM Dout = D64cur --split-->  W0=[31:0]   W1=[63:32]
                                  |   |        |
            merge(Wnew, W1) = [ W1 | Wnew ] --+--,
            merge(W0, Wnew) = [ Wnew | W0 ] ----+--> DWn MUX --+
                                  (sel = TA bit 2)             |
                                                               v
                                                        +-------------+
   sd value (full RD1 64) ----------------------------->|  MSZ MUX    |--> RAM Din
   sb value (assembled)   ----------------------------->| 00 sb /10 sw|
                                                        | 11 sd       |
                                                        +-------------+
flowchart TD
    RD1["RD1 (store value)"] --> WN["Wnew = RD1[31:0]"]
    DOUT["RAM Dout = D64cur"] --> S2["Splitter"]
    S2 --> W0["W0 = cur[31:0]"]
    S2 --> W1["W1 = cur[63:32]"]
    WN --> M1["merge(Wnew, W1)"]
    W1 --> M1
    W0 --> M2["merge(W0, Wnew)"]
    WN --> M2
    M1 --> DW{"DWn MUX<br/>(sel = TA bit 2)"}
    M2 --> DW
    DW --> MSZ{"MSZ store MUX"}
    RD1 --> MSZ
    MSZ --> DIN["RAM Din"]

    style DW fill:#bbf,stroke:#333,stroke-width:2px
    style MSZ fill:#f9f,stroke:#333,stroke-width:2px

Worked example: sw into the lower half

Cell at DW addr 0 currently holds 0x_DEADBEEF_11112222. Execute sw rs2, 0(x0) where rs2 = 0x_0000_0000_0000_0099.

| Step | Value |
|------|-------|
| TA (byte) | 0 → bit 2 = 0 → replace lower word |
| D64cur | 0xDEADBEEF11112222 |
| W1 = cur[63:32] (preserve) | 0xDEADBEEF |
| Wnew = rs2[31:0] | 0x00000099 |
| assembled = merge(Wnew, W1) | 0xDEADBEEF00000099 |
| MSZ selects sw value → Din | 0xDEADBEEF00000099 |

Only the lower 32 bits changed; the upper word 0xDEADBEEF was read back from the cell and re-merged, exactly as intended. Had TA been 4, bit 2 = 1 would have chosen merge(W0, Wnew) = 0x0000009911112222, updating only the upper word.

Common mistake. Forgetting to assert ld during a store. If ld = 0, Dout is not driven with the current contents, so the "preserve" half is garbage and you clobber the neighboring word. For sw/sb you need both str = 1 and ld = 1.
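
The read-modify-write for sw can be sketched in Python to confirm the assembled Din values. The function name `sw_din` is hypothetical; `d64cur` plays the role of the current RAM Dout and `rd1` the value of rs2.

```python
def sw_din(d64cur: int, rd1: int, ta: int) -> int:
    """Assemble the 64-bit Din for sw from the current cell and the store value."""
    wnew = rd1 & 0xFFFFFFFF              # Wnew = RD1[31:0]
    w0 = d64cur & 0xFFFFFFFF             # current lower word
    w1 = (d64cur >> 32) & 0xFFFFFFFF     # current upper word
    replace_lower = (w1 << 32) | wnew    # [ W1 | Wnew ]
    replace_upper = (wnew << 32) | w0    # [ Wnew | W0 ]
    return replace_upper if (ta >> 2) & 1 else replace_lower  # DWn MUX on TA bit 2

cur = 0xDEADBEEF11112222
print(hex(sw_din(cur, 0x99, 0)))  # 0xdeadbeef00000099
print(hex(sw_din(cur, 0x99, 4)))  # 0x9911112222
```

Python drops leading zeros when printing, so the second result is 0x0000009911112222 as a full 64-bit value.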


7. The New Control Lines

Adding memory means extending the instruction decoder spreadsheet with new control outputs. As in earlier labs, set the new outputs to their inactive value (0) for every existing instruction so behavior does not change.

| Control line | Width | Purpose | Set when... |
|--------------|-------|---------|-------------|
| MLD | 1 | RAM ld (drive Dout) | any load (ld/lw/lb) and any sub-word store (sw/sb) |
| MST | 1 | RAM str (write on edge) | any store (sd/sw/sb) |
| MSZ | 2 | memory size select (b/w/d) | loads and stores; 00=byte, 10=word, 11=dword |
| M2R | 1 | write-back source = memory | any load |

The MSZ encoding follows the RISC-V funct3 low bits so the decoder needs minimal logic:

| Op | size | MSZ |
|----|------|-----|
| lb / sb | byte | 0b00 |
| lw / sw | word | 0b10 |
| ld / sd | dword | 0b11 |

Example decoder rows (control outputs shown; x = don't-care input):

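| INUM | Instr | Mnem | Format | opcode | funct3 | M2R | MLD | MST | MSZ |
|------|-------|------|--------|---------|--------|-----|-----|-----|-----|
| ... | ld | MLDD | I | 0000011 | 011 | 1 | 1 | 0 | 11 |
| ... | lw | MLDW | I | 0000011 | 010 | 1 | 1 | 0 | 10 |
| ... | sd | MSTD | S | 0100011 | 011 | 0 | 0 | 1 | 11 |
| ... | sw | MSTW | S | 0100011 | 010 | 0 | 1 | 1 | 10 |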

Note sw sets MLD = 1 and MST = 1 together (read-modify-write), while sd only needs MST = 1 because it overwrites the whole cell.
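
The decoder rules can be written out as a small Python function to check a spreadsheet against. This is a sketch that assumes only the six instructions in this lab (lb, lw, ld, sb, sw, sd); the function name `mem_controls` is hypothetical, and other funct3 values such as the unsigned loads are not handled.

```python
LOAD, STORE = 0b0000011, 0b0100011   # RISC-V load and store opcodes

def mem_controls(opcode: int, funct3: int) -> dict:
    """Return the four memory control lines for one instruction."""
    is_load = opcode == LOAD
    is_store = opcode == STORE
    msz = funct3 & 0b11 if (is_load or is_store) else 0   # low two funct3 bits
    sub_word_store = is_store and msz != 0b11             # sw/sb need a read too
    return {
        "M2R": int(is_load),
        "MLD": int(is_load or sub_word_store),
        "MST": int(is_store),
        "MSZ": msz,
    }

for name, opcode, funct3 in [("ld", LOAD, 0b011), ("lw", LOAD, 0b010),
                             ("sd", STORE, 0b011), ("sw", STORE, 0b010)]:
    print(name, mem_controls(opcode, funct3))
```

Any non-memory opcode yields all zeros, which matches the rule that existing instructions keep the new outputs inactive.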


8. Putting It All Together (Top-Level View)

The lab notes sketched the top level as three blocks around the RAM: a Store block before Din, the RAM itself, and a Load block after Dout. The orange wiring is the store path (RD1 → store logic → Din), and the green wiring is the load path (Dout → load logic → register file).

flowchart LR
    subgraph ADDR["Address path"]
        ALU["ALU TA (64, byte addr)"] --> SP["splitter [9:3]"]
    end
    SP --> A["RAM A (7, DW addr)"]

    subgraph STORE["Store logic (before RAM)"]
        RD1["RD1 = rs2"] --> SL["read-modify-write<br/>+ MSZ store MUX"]
    end
    DOUT2["RAM Dout (current)"] --> SL
    SL --> DIN["RAM Din"]

    A --> RAM["RAM 64-bit cells<br/>str=MST  ld=MLD"]
    DIN --> RAM
    RAM --> DOUT["RAM Dout"]
    RAM --> DOUT2

    subgraph LOAD["Load logic (after RAM)"]
        DOUT --> LL["word select (bit 2)<br/>+ sign-extend + MSZ"]
    end
    LL --> M2R{"M2R MUX"}
    ALU --> M2R
    M2R --> WD["WDsel MUX"]
    WD --> RF["Register File WD"]

    style RAM fill:#f9f,stroke:#333,stroke-width:2px
    style SL fill:#fc9,stroke:#333,stroke-width:2px
    style LL fill:#bfb,stroke:#333,stroke-width:2px

Build order recap:

  1. Add the RAM (64-bit data, 7 addr bits) at the top level; wire CLK.
  2. Add the address splitter (TA[9:3] → A).
  3. Wire ld/sd: RD1 → Din, MST → str, MLD → ld; add the M2R write-back MUX.
  4. Test ld/sd end-to-end with a stack program before going further.
  5. Add load logic (split, word-select on bit 2, sign-extend, MSZ MUX) for lw.
  6. Add store logic (read-modify-write, DWn MUX on bit 2, MSZ MUX) for sw; remember str and ld both = 1.
  7. Derive lb/sb by adding a byte-level slice (bits [1:0] select the byte) into the same MUX structures.

9. Deriving lb / sb

The byte instructions reuse everything. The only changes:

  • Load (lb): instead of splitting into two 32-bit words, split the selected region down to a single byte using byte-address bits [1:0] (within a word) plus bit 2 (which word). The chosen 8-bit value goes through an 8→64 sign extender, then into the MSZ MUX at 0b00.
  • Store (sb): read-modify-write at byte granularity — preserve the other 7 bytes of the cell, replace just the target byte selected by bits [2:0], route through the MSZ store MUX at 0b00. As with sw, assert both str and ld.

The selector hierarchy is exactly the color-coded picture from Section 3: bit 2 picks the word, bits [1:0] pick the byte within it.

  64-bit cell selected by A (= byte >> 3)
     |
     +-- bit 2 picks 32-bit word  (lw / sw)
            |
            +-- bits [1:0] pick byte within word  (lb / sb)
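
The byte-level versions can be sketched the same way. The code below is a behavioral model only: it picks the byte with a shift by bits [2:0] rather than the two-level MUX hierarchy, which gives the same result. The names `lb_value` and `sb_din` and the example cell value are hypothetical, and byte 0 is assumed to sit in bits [7:0] of the cell (little-endian).

```python
MASK64 = (1 << 64) - 1

def lb_value(dout: int, ta: int) -> int:
    """Select one byte of the cell with TA[2:0] and sign-extend 8 -> 64."""
    shift = (ta & 0b111) * 8
    byte = (dout >> shift) & 0xFF
    return (byte | 0xFFFFFFFFFFFFFF00) if byte & 0x80 else byte

def sb_din(d64cur: int, rd1: int, ta: int) -> int:
    """Replace only the target byte of the current cell; keep the other 7."""
    shift = (ta & 0b111) * 8
    mask = 0xFF << shift
    return (d64cur & ~mask & MASK64) | ((rd1 & 0xFF) << shift)

cell = 0x1122334455667788
print(hex(lb_value(cell, 0)))      # 0xffffffffffffff88
print(hex(lb_value(cell, 1)))      # 0x77
print(hex(sb_din(cell, 0xAB, 5)))  # 0x1122ab4455667788
```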

Key Concepts

| Concept | Definition | Example |
|---------|------------|---------|
| Byte address | An address that counts individual bytes; what registers hold | sp + 8 is a byte address |
| Doubleword (DW) address | A byte address divided by 8; indexes a 64-bit RAM cell | 1016 >> 3 = 127 |
| Address splitter | Splitter taking TA[9:3] to produce the 7-bit RAM A input | 1016 (byte) → 127 (DW) |
| Word selector | Byte-address bit 2; picks lower vs. upper 32-bit word in a cell | bit 2 = 1 → upper word |
| Byte selector | Byte-address bits [1:0]; picks one of 4 bytes in a word | [1:0] = 0b10 → byte 2 |
| Sign extension | Replicating bit 31 (or bit 7) into the upper bits for signed loads | 0xFFFF8000 → 0xFFFFFFFFFFFF8000 |
| Read-modify-write | Read a cell, change part of it, write it back in one cycle | sw updating one 32-bit half |
| MSZ | Memory-size control: 00=byte, 10=word, 11=dword | lw uses MSZ = 0b10 |
| M2R | Mux control selecting RAM output (not ALU) as write-back data | set for every load |
| MLD / MST | RAM ld / str enables; sub-word stores assert both | sw: MLD=1, MST=1 |

Practice Problems

Problem 1: Byte → Doubleword address

A program executes ld t0, 24(s0) with s0 = 1000. What byte address does the ALU compute, and what 7-bit value appears on the RAM A input?

Solution
Byte address TA = base + imm = 1000 + 24 = 1024
DW address     = 1024 >> 3   = 128
128 does not fit in 7 bits: the RAM has 128 cells, so valid DW addresses are 0..127. The splitter takes TA[9:3], so bit 10 of TA (worth 1024 in byte terms, or DW 128) is dropped and the access wraps to A = 0. Takeaway: 7 address bits give exactly 1024 bytes of memory; a stack initialized to 1024 should grow downward (addi sp, sp, -N) so accesses land in range. For an in-range example, ld t0, 16(s0) with s0 = 1000 gives TA = 1016, DW = 1016 >> 3 = 127, so A = 0b1111111.

Problem 2: Which word does lw select?

A cell at DW address 3 holds 0x_CAFEF00D_DEADBEEF. The processor executes lw rd, 4(x0)... then lw rd, 24(x0). For each, give the byte address, the value of bit 2, the selected 32-bit word, and the 64-bit register result.

Solution
lw rd, 4(x0): TA = 4, DW = 4 >> 3 = 0, so this access reads cell 0, not cell 3. Bit 2 of 4 is 1, so it selects the upper word of cell 0, whose contents are not given.
lw rd, 24(x0): TA = 24, DW = 24 >> 3 = 3, so this access reads cell 3. Bit 2 of 24 (0b11000) is 0, so it selects the lower word, 0xDEADBEEF.
0xDEADBEEF has bit 31 = 1 (negative), so sign-extend:
result = 0xFFFFFFFF_DEADBEEF
If instead we did lw rd, 28(x0): DW = 28 >> 3 = 3, bit 2 of 28 (0b11100) = 1 → upper word 0xCAFEF00D, bit 31 = 1 → result 0xFFFFFFFF_CAFEF00D.

Problem 3: Store word read-modify-write

Cell at DW address 2 holds 0x_00000000_FFFFFFFF. Execute sw rs2, 16(x0) with rs2 = 0x_0000_0000_8000_0001. Show the assembled Din.

Solution
TA = 16, DW = 16 >> 3 = 2 (cell 2). bit 2 of 16 (0b10000) = 0 -> replace LOWER word.

D64cur = 0x00000000_FFFFFFFF
  W1 = cur[63:32] = 0x00000000   (preserve)
  W0 = cur[31:0]  = 0xFFFFFFFF   (will be replaced)

Wnew = rs2[31:0] = 0x80000001

assembled = merge(Wnew, W1) = [ W1 | Wnew ] = 0x00000000_80000001
Din = 0x0000000080000001   (only lower word changed)
Note the upper word's 0x00000000 is preserved exactly because we read it back from Dout (which requires ld = 1 during the store).

Problem 4: Why both str and ld for sw?

Explain in one or two sentences why sw must assert the RAM ld line in addition to str, while sd only needs str.

Solution
sw only changes 32 of the 64 bits in a cell, so it must read the current cell contents (ld = 1 drives Dout) to preserve the other 32 bits, then write the re-assembled 64-bit value (str = 1). sd overwrites the entire 64-bit cell, so there is nothing to preserve and no read is needed; str alone suffices.

Problem 5: Control-line table

Fill in M2R, MLD, MST, and MSZ for lb, sb, and lw.

Solution

| Instr | M2R | MLD | MST | MSZ |
|-------|-----|-----|-----|-----|
| lb | 1 | 1 | 0 | 00 |
| sb | 0 | 1 | 1 | 00 |
| lw | 1 | 1 | 0 | 10 |

lb is a load: write memory back to the register (M2R=1), read enabled (MLD=1), no write (MST=0), byte size (00). sb is a sub-word store, so it is a read-modify-write with both MLD=1 and MST=1 and no register write-back (M2R=0). lw mirrors lb but with word size (10).

Problem 6: Spot the bug

A student's lw returns correct values for offsets 0, 8, 16, ... but wrong (off-by-one-word) values for offsets 4, 12, 20, .... The address splitter is correct. What is the most likely wiring mistake?

Solution
The word-select MUX selector is not actually following byte-address bit 2 (TA[2]). Offsets that are multiples of 8 all have bit 2 = 0 (lower word), while offsets 4, 12, 20 have bit 2 = 1 and need the upper word. If the selector is stuck at 0, for example tied to a constant or wired to TA[0] or TA[1] (both always 0 for 4-byte-aligned accesses), the MUX always returns the lower word, so exactly these cases break. Swapped MUX inputs or a selector wired to bit 3 would also break some multiple-of-8 offsets, so they do not match this symptom. Fix: drive the MUX select from TA[2], with input 0 = Dout[31:0] (lower) and input 1 = Dout[63:32] (upper).

Summary

  1. Data memory is a 64-bit-wide Digital RAM (data bits = 64, addr bits = 7 → 1024 bytes); keep the RAM component at the top level so you can inspect it while simulating.

  2. Registers hold byte addresses, but the RAM is indexed by doubleword addresses. Shift the ALU's target address right by 3 (take bits [9:3]) with a splitter to get the 7-bit RAM A input.

  3. The low address bits are selectors, not part of the RAM address: bit 2 chooses the 32-bit word (lower vs. upper) within a cell, and bits [1:0] choose the byte within a word.

  4. ld/sd come almost for free with a 64-bit RAM: ALU → splitter → RAM, RD1 → Din, and route Dout back through an M2R/WDsel MUX to the register file.

  5. lw adds load logic after the RAM: split Dout into two 32-bit halves, pick one with bit 2, sign-extend 32 → 64, then select width with the MSZ MUX.

  6. sw adds store logic before the RAM and is a read-modify-write: assert both str and ld, preserve the untouched half, merge in RD1[31:0], choose with bit 2, and feed the MSZ store MUX into Din.

  7. New control lines MLD, MST, MSZ, and M2R extend the decoder spreadsheet; the MSZ encoding (00/10/11 for byte/word/dword) follows the RISC-V funct3 bits to keep decode logic simple.

  8. lb/sb follow the same pattern at byte granularity, reusing the selector hierarchy (bit 2 for word, bits [1:0] for byte) and the same MUX/sign-extender structures.