Lab: Data Memory — LW and SW¶

Overview¶

This hands-on lab session adds data memory to the single-cycle RISC-V processor we have been building across Lab 10 and Project 6. We use a Digital RAM component configured with 64-bit words so that ld/sd (load/store doubleword) work almost for free. We then wrap the RAM with load logic (after the RAM) and store logic (before the RAM) so the same 64-bit memory can also service lw/sw (word) and, by the same pattern, lb/sb (byte). The central idea is address arithmetic: the ALU produces a 64-bit byte address, but the RAM is indexed by doubleword address, so we shift right by 3, then use the low address bits to pick which sub-word inside a 64-bit cell we are touching. This page walks through the datapath, the memory-layout intuition, the control lines (MLD, MST, MSZ, M2R), and the step-by-step circuit construction with worked examples and common mistakes.

Learning Objectives¶

Configure the Digital RAM (Separated Ports) component for use as processor data memory and state its capacity.
Convert a 64-bit byte address (from the ALU) into the doubleword address the RAM expects.
Identify which low address bits select a word vs. a byte within a 64-bit memory cell.
Wire the ld/sd datapath: ALU → address splitter → RAM, and RAM Dout → register file write-back MUX.
Build the load logic for lw: split the 64-bit Dout into two 32-bit halves, select with address bit 2, and sign-extend to 64 bits.
Build the store logic for sw: read-modify-write a 64-bit cell so only the targeted 32-bit half is updated.
Add and set the new control lines (MLD, MST, MSZ, M2R) in the instruction decoder spreadsheet.
Recognize and avoid common wiring/control mistakes (wrong shift amount, wrong selector bit, forgetting to assert str and ld together for sub-word stores).

Prerequisites¶

A working Project 6 processor through JAL/JALR and conditional branches.
RISC-V instruction formats, especially I-type (loads) and S-type (stores) (Project 4 / Lab 03).
Byte addressing, words vs. doublewords, little-endian layout (Project 3).
Sign extension, shifting, and masking (Project 3, Lab 09).
Multiplexers, splitters, mergers, and the RAM component in Digital (Lab 09, Project 5).
The register-file write-back path and WDsel MUX from Lab 10 / Project 6.

1. Where Data Memory Fits¶

So far our processor can fetch instructions, decode them, read/write registers, run the ALU, and change control flow with jumps and branches. The last missing capability is touching memory: loading values from and storing values to RAM. Our test programs use this for the stack (saving ra and locals), but the same memory could back a heap.

RISC-V memory instructions come in matched load/store pairs, organized by data size:

Mnemonic	Meaning	Size	Format	Action
`ld`	load doubleword	64-bit	I-type	`rd = mem[rs1 + imm]`
`sd`	store doubleword	64-bit	S-type	`mem[rs1 + imm] = rs2`
`lw`	load word	32-bit	I-type	`rd = sext(mem32[rs1 + imm])`
`sw`	store word	32-bit	S-type	`mem32[rs1 + imm] = rs2[31:0]`
`lb`	load byte	8-bit	I-type	`rd = sext(mem8[rs1 + imm])`
`sb`	store byte	8-bit	S-type	`mem8[rs1 + imm] = rs2[7:0]`

Lab strategy (incremental): get ld/sd working first with a plain 64-bit RAM, then add the wrapper logic for lw/sw. Once lw/sw work, lb/sb follow the exact same pattern with a narrower slice. Do not try to build everything at once.

flowchart LR
    A["Register File<br/>RD0 = base addr"] --> B["ALU<br/>TA = base + imm"]
    I["ImmDecoder<br/>imm-i / imm-s"] --> B
    B --> C["Addr splitter<br/>byte to DW addr"]
    C --> D["RAM<br/>64-bit cells"]
    R1["RD1 = store value"] --> D
    D --> E["Load logic<br/>select + sign-extend"]
    E --> F["Write-back MUX<br/>M2R / WDsel"]
    F --> A

    style D fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:2px

2. The Digital RAM Component¶

We use Digital's RAM, Separated Ports component. "Separated ports" means it has distinct inputs for the address, the data to write (Din), and a separate data output (Dout) — exactly what a processor wants.

Configure it as:

Setting	Value	Why
Data bits	`64`	One cell holds a full RISC-V doubleword, so `ld`/`sd` are trivial.
Addr bits	`7`	`2^7 = 128` cells × 8 bytes = 1024 bytes of data memory.
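
The capacity figure in the table is plain arithmetic on the two settings. A minimal sketch, with constant names chosen here for illustration (they are not Digital identifiers):

```python
# RAM settings from the table above
DATA_BITS = 64   # width of one cell
ADDR_BITS = 7    # width of the A input

cells = 2 ** ADDR_BITS              # number of addressable cells
bytes_per_cell = DATA_BITS // 8     # one doubleword = 8 bytes
capacity_bytes = cells * bytes_per_cell

print(cells, bytes_per_cell, capacity_bytes)  # 128 8 1024
```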

The component's pins, as drawn in the lab notes:

            +-----------------------+
   A  --/-->| A (addr)         Dout |--/--> 64 bits out
   7        |                       |   64
            |                       |
 Din --/--> | Din               RAM |
   64       |                       |
 MST -----> | str                   |
            |                       |
 CLK -----> | clk                   |
            |                       |
 MLD -----> | ld                    |
            +-----------------------+
            data bits: 64
            addr bits: 7

Pin	Driven by	Meaning
`A`	address splitter output	doubleword address (7 bits) of the cell to access
`Din`	store logic	64-bit value to write
`Dout`	(output)	64-bit value read from cell `A`
`str`	`MST` control line	when 1, write `Din` into cell `A` on the clock edge
`clk`	processor `CLK`	synchronous write timing
`ld`	`MLD` control line	when 1, drive `Dout` with the contents of cell `A`

Keep the RAM at the top level. A critical lab tip: even though you will wrap the RAM with extra logic, leave the RAM component itself at the top level of your processor circuit. That is the only way Digital lets you open the RAM during simulation to inspect (and pre-load) its contents while debugging. Put the load logic after it and the store logic before it, but do not bury the RAM inside a sub-circuit.

3. Byte Addresses vs. Doubleword Addresses¶

This is the single most important idea in the lab, and it caused the most confusion in class.

Registers always hold byte addresses. When a program computes sp + 8, that is a byte address. But our RAM is an array of 64-bit (8-byte) cells, and its A input selects a cell, not a byte. So we must convert.

A byte address divided by 8 gives the doubleword (DW) address:

DW address = byte address / 8 = byte address >> 3

Dividing by 8 just drops the low 3 bits. In hardware we do not need a divider — we use a splitter to take bits [9:3] of the 64-bit ALU output and feed those 7 bits to the RAM A input. The low 3 bits [2:0] are the offset inside the cell and are not sent to the RAM address; instead they tell us which sub-word we are touching.

 byte addr bit:  ... 9  8  7  6  5  4  3 | 2 | 1  0
                 \________ DW addr ______/ |word|byte|
                        (-> RAM A, 7 bits)  |sel | sel |

The notes drew this as the ALU output TA (64 bits, a byte address) entering a splitter labeled 3-9, whose 7-bit output drives the RAM A input.

                  +-----+
  RD0 (rs1) ----->|  A  |
  base addr       | ALU |  TA  +--------+      A
                  |  B  |--/--->| 3 - 9  |--/-->  (RAM addr)
  imm-i/imm-s --->+-----+  64   +--------+  7
                          byte    splitter   DW addr
                          addr
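
The splitter's behavior can be checked in software before wiring it. The sketch below is only a model of the "3-9" splitter, not part of the circuit; the function name `dw_address` is made up for this example.

```python
ADDR_BITS = 7  # RAM "Addr bits" setting

def dw_address(ta: int) -> int:
    """Model the 3-9 splitter: keep bits [9:3] of the byte address TA."""
    return (ta >> 3) & ((1 << ADDR_BITS) - 1)

for ta in (0, 8, 1016, 1023, 1024):
    print(ta, "->", dw_address(ta))
# 0 -> 0
# 8 -> 1
# 1016 -> 127
# 1023 -> 127
# 1024 -> 0   (bit 10 is not part of [9:3], so the address wraps)
```

The mask matters only for addresses at or above 1024; in hardware the same effect comes from the splitter simply not connecting bits 10 and up.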

The three address "views"¶

The class drew the same memory three ways at once. The picture below mirrors that "memory layout" diagram: the same storage seen as raw bytes, as 32-bit words, and as 64-bit doublewords.

 byte addr   word addr   dword addr
 ----------  ---------   ----------
   ...
    16  --*    (word 4)    (dword 2)
    15
    14   } word 3 ----,
    13                 } dword 1
    12  ----------'
    11
    10   } word 2 ---,
     9                } dword 1
     8  -----------'
     7
     6   } word 1 ---,
     5                } dword 0
     4  -----------'
     3
     2   } word 0 ---,
     1                } dword 0
     0  -----------'

Reading the relationships off the diagram:

Address kind	Conversion from byte address	Low bits used as offset
byte	(itself)	—
word	`byte >> 2`	bits `[1:0]` = byte within word
doubleword	`byte >> 3`	bits `[2:0]` = byte within dword

Selector bits, color-coded as in the notes¶

The handwritten layout used colored boxes to show how the low address bits act as selectors:

Word selector (green) = byte address bit 2. Within one 64-bit cell there are two 32-bit words: bit 2 = 0 selects the lower word (bytes 0–3), bit 2 = 1 selects the upper word (bytes 4–7).
Byte selector (yellow) = byte address bits [1:0]. Within one 32-bit word there are four bytes; bits [1:0] pick which.

  64-bit cell  =  [ byte7 byte6 byte5 byte4 | byte3 byte2 byte1 byte0 ]
                  \________ upper word _____/ \_______ lower word ____/
                          (bit 2 = 1)                 (bit 2 = 0)
                  \__ b[1:0] pick byte __/   \__ b[1:0] pick byte __/
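
To make the three fields concrete, here is a small sketch that carves a byte address into the DW address, the word selector, and the byte selector. The function name `selectors` is hypothetical; the bit positions are the ones described above.

```python
def selectors(ta: int) -> tuple[int, int, int]:
    """Split byte address TA into (DW address, word selector, byte selector)."""
    dw = (ta >> 3) & 0x7F      # bits [9:3] -> RAM A input
    word_sel = (ta >> 2) & 1   # bit 2      -> lower (0) or upper (1) word
    byte_sel = ta & 0b11       # bits [1:0] -> byte within the chosen word
    return dw, word_sel, byte_sel

print(selectors(4))     # (0, 1, 0)   upper word of cell 0
print(selectors(13))    # (1, 1, 1)   cell 1, upper word, byte 1 of that word
print(selectors(1016))  # (127, 0, 0) last cell, lower word
```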

Common mistake. Students shifted by 2 (word divide) instead of 3 (doubleword divide) when feeding the RAM A input. Because the RAM holds 64-bit cells, the address must be a doubleword address — shift right by 3, take bits [9:3].

4. Step 1 — Implementing `ld` / `sd`¶

Start with the easy case: 64-bit loads and stores. No sub-word logic is needed because the RAM cell is a doubleword.

Datapath¶

  RD0 (rs1) --+
  base addr   |   +-----+
              +-->|  A  |  TA   +-------+   A    +---------+ Dout  +-----------+
                  | ALU |--/--->| 3 - 9 |--/---->|   RAM   |--/--->| WDsel MUX |--> RegFile WD
  imm-i  ------->|  B  |  64    +-------+   7    |  64-bit |  64   +-----------+
  (ld, I-type)   +-----+                        |  cells  |
  imm-s  -------> (for sd, B = imm-s)            |         |
                                       RD1 ----> | Din     |
                                  (sd store val) |         |
                                       MST ----> | str     |
                                       CLK ----> | clk     |
                                       MLD ----> | ld      |
                                                 +---------+

Step-by-step:

Compute the target address in the ALU. TA = base + imm. For ld the offset is the I-type immediate; for sd it is the S-type immediate. Both reduce to "base register + signed offset", an ALU add.
Convert byte → DW address. Route TA through a splitter taking bits [9:3] (7 bits) to the RAM A input.
For sd: connect RD1 (the value of rs2) to Din, and assert MST = 1 (str). The write happens on the clock edge.
For ld: assert MLD = 1 (ld) so Dout reflects cell A. Route Dout (64 bits) back to the register file write data through the write-back MUX.

Write-back MUX (`M2R`)¶

Loads must write the memory output into the register file, not the ALU result. The notes show two equivalent options:

Expand the existing WDsel MUX to add a memory input, or
Add a new 2-input MUX (M2R, "memory to register") that selects between the ALU result and RAM Dout; its output feeds the existing WDsel MUX.

flowchart LR
    ALU["ALU result"] --> M2R{"M2R MUX"}
    RAM["RAM Dout"] --> M2R
    M2R --> WD["WDsel MUX"]
    PC4["PC + 4"] --> WD
    WD --> RF["RegFile WD"]

    style M2R fill:#bbf,stroke:#333,stroke-width:2px
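
The second option (a separate M2R MUX feeding WDsel) can be sketched as two nested selections. This is only a behavioral model: the notes do not give the WDsel select encoding, so treating `wdsel = 1` as "pick PC + 4" is an assumption made for this example, and your WDsel MUX may have more inputs.

```python
def write_back(alu_result: int, ram_dout: int, pc_plus_4: int,
               m2r: int, wdsel: int) -> int:
    """Two-stage write-back selection (behavioral sketch)."""
    # Stage 1: M2R MUX. Loads take the memory output, everything else the ALU.
    stage1 = ram_dout if m2r else alu_result
    # Stage 2: WDsel MUX. ASSUMPTION: wdsel = 1 selects PC + 4 (jal/jalr link).
    return pc_plus_4 if wdsel else stage1

# A load: the ALU result is the address, but the register gets the memory data.
print(hex(write_back(alu_result=1016, ram_dout=0xABCD, pc_plus_4=0x14,
                     m2r=1, wdsel=0)))  # 0xabcd
```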

Worked example: `sd` then `ld` on the stack¶

main:
    li   sp, 1024          # initialize stack pointer (byte address)
    li   t0, 0xABCD        # value to store
    sd   t0, -8(sp)        # mem[sp-8] = t0
    ld   t1, -8(sp)        # t1 = mem[sp-8]  -> t1 == 0xABCD
    unimp

Trace of the sd t0, -8(sp):

Step	Value
base = `sp`	`1024` (byte addr)
imm-s	`-8`
`TA = 1024 + (-8)`	`1016` (byte addr)
DW addr = `1016 >> 3`	`127` → RAM `A` = `0b1111111`
`Din = RD1 = t0`	`0xABCD`
`MST`	`1` (write on clock edge)

The matching ld recomputes TA = 1016, the same DW addr 127, asserts MLD, and Dout = 0xABCD flows back through M2R/WDsel into t1.
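
The same trace can be replayed against a toy software model of the RAM. This is a sketch for checking the address arithmetic, not a description of how Digital implements the component; the class and method names are invented for the example.

```python
MASK64 = (1 << 64) - 1

class DataMemory:
    """Toy model of the 128-cell, 64-bit data RAM (ld/sd only)."""

    def __init__(self) -> None:
        self.cells = [0] * 128  # 2^7 cells, each one doubleword

    def _index(self, ta: int) -> int:
        # Address splitter: bits [9:3] of the byte address
        return (ta >> 3) & 0x7F

    def sd(self, ta: int, rd1: int) -> None:
        # MST = 1: the whole cell is overwritten with RD1
        self.cells[self._index(ta)] = rd1 & MASK64

    def ld(self, ta: int) -> int:
        # MLD = 1: Dout shows the whole cell
        return self.cells[self._index(ta)]

mem = DataMemory()
sp = 1024          # li sp, 1024
t0 = 0xABCD        # li t0, 0xABCD
mem.sd(sp - 8, t0) # sd t0, -8(sp)
t1 = mem.ld(sp - 8)  # ld t1, -8(sp)
print(hex(t1))     # 0xabcd
```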

5. Step 2 — Load Word (`lw`): Load Logic After the RAM¶

For lw the RAM still hands us a full 64-bit cell, but the program wants only 32 bits, sign-extended back to 64. The extra circuitry sits after the RAM Dout and is gated by the memory-size control line MSZ.

The four sub-steps¶

Read the 64-bit cell. Same as ld: compute TA, shift to DW addr, assert MLD. (lw targets are 4-byte aligned, i.e. byte addresses that are multiples of 4.)
Split Dout into two 32-bit halves. Use a splitter: bits [31:0] = lower word W0, bits [63:32] = upper word W1.
Select the right half with byte-address bit 2. Feed W0 and W1 into a 2-input MUX whose selector is TA bit 2 (the green "word selector", written 2-2 in the notes). Bit 2 = 0 → lower word; bit 2 = 1 → upper word.
Sign-extend 32 → 64. The chosen 32-bit word goes through a sign extender to a 64-bit value. This is the lw candidate.

Finally, an outer MSZ MUX selects which width to write back:

MSZ = 0b00 → the lb value (byte, sign-extended)
MSZ = 0b10 → the lw value (word, sign-extended)
MSZ = 0b11 → the ld value (the raw 64-bit Dout)

This ordering matches the RISC-V funct3 encoding so the control bits fall out naturally.

                       +-----+ Dout (64)
   RAM ---------------->|     |----+------------------------------+
                        +-----+    |                              |
                                   | split                        | (ld path: full 64)
                                   v                              v
                          [63:32]=W1   [31:0]=W0           +---------------+
                                |  \      /                |               |
                         bit 2  |   \    /                 |   MSZ MUX     |
                        (word   |    \  /                  | 00 = lb val   |--> to M2R / WDsel
                        select) |   +------+   32  +------+ | 10 = lw val   |
                                +-->| WSMUX|--/--->| Sign |->|11 = ld(64)   |
                                    +------+       | Ext  | |               |
                                                   +------+ +---------------+
                                                     64

flowchart TD
    RAM["RAM Dout (64)"] --> SPLIT["Splitter"]
    SPLIT --> W0["W0 = Dout[31:0]"]
    SPLIT --> W1["W1 = Dout[63:32]"]
    W0 --> WS{"Word-select MUX<br/>(sel = TA bit 2)"}
    W1 --> WS
    WS --> SX["Sign-extend 32 to 64"]
    SX --> MSZ{"MSZ MUX"}
    RAM --> MSZ
    MSZ --> OUT["to M2R / WDsel"]

    style WS fill:#bbf,stroke:#333,stroke-width:2px
    style SX fill:#bfb,stroke:#333,stroke-width:2px
    style MSZ fill:#f9f,stroke:#333,stroke-width:2px

Why sign-extend? lw loads a signed 32-bit value into a 64-bit register. If the word's bit 31 is 1 (negative), the upper 32 bits of the register must all be 1. The sign extender copies bit 31 into bits [63:32]. (lwu, an unsigned variant, would zero-extend instead — not required for this lab.)
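
A short sketch of what the sign extender computes, with the zero-extending alternative shown for contrast. The helper names are made up for this example; results are returned as unsigned 64-bit patterns so they print the way a register would appear in hex.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value: int, bits: int) -> int:
    """Copy bit (bits-1) of value into all higher bits, up to 64 bits."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    # XOR/subtract trick: yields a negative int when the sign bit is set
    return ((value ^ sign) - sign) & MASK64

def zero_extend(value: int, bits: int) -> int:
    """Fill the upper bits with 0 (what lwu would do)."""
    return value & ((1 << bits) - 1)

print(hex(sign_extend(0xFFFF8000, 32)))  # 0xffffffffffff8000
print(hex(sign_extend(0x00000007, 32)))  # 0x7
print(hex(zero_extend(0xFFFF8000, 32)))  # 0xffff8000
```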

Worked example: `lw` selecting the upper word¶

Suppose cell at DW addr 0 contains 0x_FFFF8000_00000007 (upper word 0xFFFF8000, lower word 0x00000007).

Instruction	`TA` (byte)	bit 2	selected word	sign-extended result
`lw rd, 0(x0)`	`0`	`0`	`0x00000007` (lower)	`0x0000000000000007`
`lw rd, 4(x0)`	`4`	`1`	`0xFFFF8000` (upper)	`0xFFFFFFFFFFFF8000`

The second case shows the sign extension in action: bit 31 of 0xFFFF8000 is 1, so the upper half fills with Fs.
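
The table can be reproduced with a behavioral model of the load logic. This sketch covers only the word and doubleword settings of MSZ; the function names are hypothetical and the byte path is deliberately left out.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value: int, bits: int) -> int:
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64

def load_value(dout: int, ta: int, msz: int) -> int:
    """Load logic after the RAM: split, word-select on TA bit 2, extend, MSZ MUX."""
    w0 = dout & 0xFFFFFFFF            # Dout[31:0]
    w1 = (dout >> 32) & 0xFFFFFFFF    # Dout[63:32]
    word = w1 if (ta >> 2) & 1 else w0
    if msz == 0b10:                   # lw
        return sign_extend(word, 32)
    if msz == 0b11:                   # ld
        return dout & MASK64
    raise NotImplementedError("byte path (MSZ = 0b00) is not modeled in this sketch")

cell0 = 0xFFFF8000_00000007
print(hex(load_value(cell0, 0, 0b10)))  # 0x7
print(hex(load_value(cell0, 4, 0b10)))  # 0xffffffffffff8000
print(hex(load_value(cell0, 0, 0b11)))  # 0xffff800000000007
```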

6. Step 3 — Store Word (`sw`): Store Logic Before the RAM¶

Storing a word is harder than loading one. The RAM can only write a whole 64-bit cell, but sw must change only 32 bits and leave the other 32 untouched. The solution is read-modify-write within a single clock cycle.

Key control trick. For sw (and sb), set both the RAM str and ld lines to 1. With ld = 1 the RAM presents the current cell contents on Dout (so we can read the half we must preserve), and with str = 1 it writes our newly assembled Din on the same clock edge.

The read-modify-write recipe¶

Read the current cell D64cur from Dout (because MLD/ld is asserted).
Extract the new word Wnew = bits [31:0] of RD1 (the value of rs2).
Split D64cur into its two halves W0 = D64cur[31:0] and W1 = D64cur[63:32].
Build two candidate 64-bit values with mergers:
replace lower word: merge(Wnew, W1) → [ W1 | Wnew ]
replace upper word: merge(W0, Wnew) → [ Wnew | W0 ]
Choose with the word-index bit (TA bit 2). bit 2 = 0 → "replace lower"; bit 2 = 1 → "replace upper".
MSZ store MUX picks between the sb, sw, and sd assembled values, then drives the RAM Din.

  RD1 (store value) --split--> Wnew = RD1[31:0]
                                  |
  RAM Dout = D64cur --split-->  W0=[31:0]   W1=[63:32]
                                  |   |        |
            merge(Wnew, W1) = [ W1 | Wnew ] --+--,
            merge(W0, Wnew) = [ Wnew | W0 ] ----+--> DWn MUX --+
                                  (sel = TA bit 2)             |
                                                               v
                                                        +-------------+
   sd value (full RD1 64) ----------------------------->|  MSZ MUX    |--> RAM Din
   sb value (assembled)   ----------------------------->| 00 sb /10 sw|
                                                        | 11 sd       |
                                                        +-------------+

flowchart TD
    RD1["RD1 (store value)"] --> WN["Wnew = RD1[31:0]"]
    DOUT["RAM Dout = D64cur"] --> S2["Splitter"]
    S2 --> W0["W0 = cur[31:0]"]
    S2 --> W1["W1 = cur[63:32]"]
    WN --> M1["merge(Wnew, W1)"]
    W1 --> M1
    W0 --> M2["merge(W0, Wnew)"]
    WN --> M2
    M1 --> DW{"DWn MUX<br/>(sel = TA bit 2)"}
    M2 --> DW
    DW --> MSZ{"MSZ store MUX"}
    RD1 --> MSZ
    MSZ --> DIN["RAM Din"]

    style DW fill:#bbf,stroke:#333,stroke-width:2px
    style MSZ fill:#f9f,stroke:#333,stroke-width:2px

Worked example: `sw` into the lower half¶

Cell at DW addr 0 currently holds 0x_DEADBEEF_11112222. Execute sw rs2, 0(x0) where rs2 = 0x_0000_0000_0000_0099.

Step	Value
`TA` (byte)	`0` → bit 2 = `0` → replace lower word
`D64cur`	`0xDEADBEEF11112222`
`W1 = cur[63:32]` (preserve)	`0xDEADBEEF`
`Wnew = rs2[31:0]`	`0x00000099`
assembled = `merge(Wnew, W1)`	`0xDEADBEEF00000099`
`MSZ` selects sw value → `Din`	`0xDEADBEEF00000099`

Only the lower 32 bits changed; the upper word 0xDEADBEEF was read back from the cell and re-merged, exactly as intended. Had TA been 4, bit 2 = 1 would have chosen merge(W0, Wnew) = 0x0000009911112222, updating only the upper word.
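
Both cases of this example can be checked with a small model of the store logic. It is a sketch of the recipe above (two mergers plus the DWn MUX), with a function name invented for the example; it does not model the MSZ store MUX or the byte path.

```python
MASK32 = 0xFFFFFFFF

def sw_din(d64cur: int, rd1: int, ta: int) -> int:
    """Assemble the 64-bit Din for sw from the current cell and RD1."""
    w0 = d64cur & MASK32            # current lower word
    w1 = (d64cur >> 32) & MASK32    # current upper word
    wnew = rd1 & MASK32             # RD1[31:0]
    replace_lower = (w1 << 32) | wnew   # [ W1 | Wnew ]
    replace_upper = (wnew << 32) | w0   # [ Wnew | W0 ]
    # DWn MUX: TA bit 2 picks which half is replaced
    return replace_upper if (ta >> 2) & 1 else replace_lower

cur = 0xDEADBEEF_11112222
print(hex(sw_din(cur, 0x99, 0)))  # 0xdeadbeef00000099
print(hex(sw_din(cur, 0x99, 4)))  # 0x9911112222
```

Python drops leading zeros when printing, so the second result is the same value as 0x0000009911112222.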

Common mistake. Forgetting to assert ld during a store. If ld = 0, Dout is not driven with the current contents, so the "preserve" half is garbage and you clobber the neighboring word. For sw/sb you need both str = 1 and ld = 1.

7. The New Control Lines¶

Adding memory means extending the instruction decoder spreadsheet with new control outputs. As in earlier labs, set the new outputs to their inactive value (0) for every existing instruction so behavior does not change.

Control line	Width	Purpose	Set when...
`MLD`	1	RAM `ld` (drive `Dout`)	any load (`ld`/`lw`/`lb`) and any sub-word store (`sw`/`sb`)
`MST`	1	RAM `str` (write on edge)	any store (`sd`/`sw`/`sb`)
`MSZ`	2	memory size select (b/w/d)	loads and stores; `00`=byte, `10`=word, `11`=dword
`M2R`	1	write-back source = memory	any load

The MSZ encoding follows the RISC-V funct3 low bits so the decoder needs minimal logic:

Op	size	`MSZ`
`lb` / `sb`	byte	`0b00`
`lw` / `sw`	word	`0b10`
`ld` / `sd`	dword	`0b11`

Example decoder rows (decoder inputs opcode and funct3, followed by the new control outputs):

INUM	Instr	Mnem	Format	opcode	funct3	`M2R`	`MLD`	`MST`	`MSZ`
...	`ld`	MLDD	I	0000011	011	1	1	0	11
...	`lw`	MLDW	I	0000011	010	1	1	0	10
...	`sd`	MSTD	S	0100011	011	0	0	1	11
...	`sw`	MSTW	S	0100011	010	0	1	1	10

Note sw sets MLD = 1 and MST = 1 together (read-modify-write), while sd only needs MST = 1 because it overwrites the whole cell.
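
The decoder rows can be summarized as a small function. This is a sketch of the table's logic, not the spreadsheet itself: the names are invented, only `lb`/`lw`/`ld` and `sb`/`sw`/`sd` are considered, and unsigned or halfword variants are outside its scope.

```python
from typing import NamedTuple

class MemControl(NamedTuple):
    m2r: int
    mld: int
    mst: int
    msz: int

LOAD_OPCODE = 0b0000011
STORE_OPCODE = 0b0100011

def decode_mem(opcode: int, funct3: int) -> MemControl:
    """Memory control lines for lb/lw/ld and sb/sw/sd (other ops get all zeros)."""
    msz = funct3 & 0b11  # MSZ is the low two bits of funct3
    if opcode == LOAD_OPCODE:
        return MemControl(m2r=1, mld=1, mst=0, msz=msz)
    if opcode == STORE_OPCODE:
        subword = msz != 0b11  # sb/sw need read-modify-write, sd does not
        return MemControl(m2r=0, mld=int(subword), mst=1, msz=msz)
    return MemControl(0, 0, 0, 0)  # inactive for every existing instruction

print(decode_mem(LOAD_OPCODE, 0b010))   # lw
print(decode_mem(STORE_OPCODE, 0b010))  # sw
print(decode_mem(STORE_OPCODE, 0b011))  # sd
```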

8. Putting It All Together (Top-Level View)¶

The lab notes sketched the top level as three blocks around the RAM: a Store block before Din, the RAM itself, and a Load block after Dout. The orange wiring is the store path (RD1 → store logic → Din), and the green wiring is the load path (Dout → load logic → register file).

flowchart LR
    subgraph ADDR["Address path"]
        ALU["ALU TA (64, byte addr)"] --> SP["splitter [9:3]"]
    end
    SP --> A["RAM A (7, DW addr)"]

    subgraph STORE["Store logic (before RAM)"]
        RD1["RD1 = rs2"] --> SL["read-modify-write<br/>+ MSZ store MUX"]
    end
    DOUT2["RAM Dout (current)"] --> SL
    SL --> DIN["RAM Din"]

    A --> RAM["RAM 64-bit cells<br/>str=MST  ld=MLD"]
    DIN --> RAM
    RAM --> DOUT["RAM Dout"]
    RAM --> DOUT2

    subgraph LOAD["Load logic (after RAM)"]
        DOUT --> LL["word select (bit 2)<br/>+ sign-extend + MSZ"]
    end
    LL --> M2R{"M2R MUX"}
    ALU --> M2R
    M2R --> WD["WDsel MUX"]
    WD --> RF["Register File WD"]

    style RAM fill:#f9f,stroke:#333,stroke-width:2px
    style SL fill:#fc9,stroke:#333,stroke-width:2px
    style LL fill:#bfb,stroke:#333,stroke-width:2px

Build order recap:

Add the RAM (64-bit data, 7 addr bits) at the top level; wire CLK.
Add the address splitter (TA[9:3] → A).
Wire ld/sd: RD1→Din, MST→str, MLD→ld; add the M2R write-back MUX.
Test ld/sd end-to-end with a stack program before going further.
Add load logic (split, word-select on bit 2, sign-extend, MSZ MUX) for lw.
Add store logic (read-modify-write, DWn MUX on bit 2, MSZ MUX) for sw; remember str and ld both = 1.
Derive lb/sb by adding a byte-level slice (bits [1:0] select the byte) into the same MUX structures.

9. Deriving `lb` / `sb`¶

The byte instructions reuse everything. The only changes:

Load (lb): instead of splitting into two 32-bit words, split the selected region down to a single byte using byte-address bits [1:0] (within a word) plus bit 2 (which word). The chosen 8-bit value goes through an 8→64 sign extender, then into the MSZ MUX at 0b00.
Store (sb): read-modify-write at byte granularity — preserve the other 7 bytes of the cell, replace just the target byte selected by bits [2:0], route through the MSZ store MUX at 0b00. As with sw, assert both str and ld.

The selector hierarchy is exactly the color-coded picture from Section 3: bit 2 picks the word, bits [1:0] pick the byte within it.

  64-bit cell selected by A (= byte >> 3)
     |
     +-- bit 2 picks 32-bit word  (lw / sw)
            |
            +-- bits [1:0] pick byte within word  (lb / sb)
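
A behavioral sketch of the byte path may help before building it. The load follows the selector hierarchy literally (bit 2, then bits [1:0]); the store uses a shift-and-mask formulation, which produces the same Din as a bank of mergers and MUXes but is not meant to mirror the circuit structure. Function names are made up for the example.

```python
MASK64 = (1 << 64) - 1

def lb_value(dout: int, ta: int) -> int:
    """lb: pick the word with bit 2, the byte with bits [1:0], sign-extend 8 -> 64."""
    w0 = dout & 0xFFFFFFFF
    w1 = (dout >> 32) & 0xFFFFFFFF
    word = w1 if (ta >> 2) & 1 else w0
    byte = (word >> (8 * (ta & 0b11))) & 0xFF
    return ((byte ^ 0x80) - 0x80) & MASK64  # sign-extend from bit 7

def sb_din(d64cur: int, rd1: int, ta: int) -> int:
    """sb: keep the other 7 bytes of the cell, replace the one chosen by bits [2:0]."""
    shift = 8 * (ta & 0b111)
    mask = 0xFF << shift
    return (d64cur & ~mask & MASK64) | ((rd1 & 0xFF) << shift)

cell = 0x88776655_44332211
print(hex(lb_value(cell, 0)))      # 0x11
print(hex(lb_value(cell, 5)))      # 0x66
print(hex(lb_value(cell, 7)))      # 0xffffffffffffff88
print(hex(sb_din(cell, 0xAB, 2)))  # 0x8877665544ab2211
```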

Key Concepts¶

Concept	Definition	Example
Byte address	An address that counts individual bytes; what registers hold	`sp + 8` is a byte address
Doubleword (DW) address	A byte address divided by 8; indexes a 64-bit RAM cell	`1016 >> 3 = 127`
Address splitter	Splitter taking `TA[9:3]` to produce the 7-bit RAM `A` input	`1016` (byte) → `127` (DW)
Word selector	Byte-address bit 2; picks lower vs. upper 32-bit word in a cell	bit 2 = 1 → upper word
Byte selector	Byte-address bits [1:0]; picks one of 4 bytes in a word	`[1:0] = 0b10` → byte 2
Sign extension	Replicating bit 31 (or bit 7) into the upper bits for signed loads	`0xFFFF8000` → `0xFFFFFFFFFFFF8000`
Read-modify-write	Read a cell, change part of it, write it back in one cycle	`sw` updating one 32-bit half
`MSZ`	Memory-size control: `00`=byte, `10`=word, `11`=dword	`lw` uses `MSZ = 0b10`
`M2R`	Mux control selecting RAM output (not ALU) as write-back data	set for every load
`MLD` / `MST`	RAM `ld` / `str` enables; sub-word stores assert both	`sw`: `MLD=1, MST=1`

Practice Problems¶

Problem 1: Byte → Doubleword address¶

A program executes ld t0, 24(s0) with s0 = 1000. What byte address does the ALU compute, and what 7-bit value appears on the RAM A input?

Solution:

Byte address TA = base + imm = 1000 + 24 = 1024
DW address     = 1024 >> 3   = 128

`128` does not fit in 7 bits. With 7 address bits the RAM has 128 cells (DW addresses `0..127`), so DW address `128` is out of range. The splitter takes only `TA[9:3]`, so bit 10 (worth `1024` in byte terms, i.e. DW `128`) is dropped and the access wraps to `A = 0`. **Takeaway:** 7 address bits give exactly 1024 bytes of memory (byte addresses `0..1023`); a stack initialized to `1024` should grow *downward* (`addi sp, sp, -N`) so accesses land in range. For an in-range example, `ld t0, 16(s0)` with `s0 = 1000` gives `TA = 1016`, `DW = 1016 >> 3 = 127`, so `A = 0b1111111`.

Problem 2: Which word does `lw` select?¶

A cell at DW address 3 holds 0x_CAFEF00D_DEADBEEF. The processor executes lw rd, 4(x0) and then lw rd, 24(x0). For each, give the byte address, the DW address (which cell is read), and the value of bit 2; for any access that reads cell 3, also give the selected 32-bit word and the 64-bit register result.

Solution:

`lw rd, 4(x0)`: `TA = 4`, `DW = 4 >> 3 = 0`, so this reads cell 0, not cell 3. Bit 2 of `4` (`0b100`) = 1 → it selects the **upper** word of *cell 0*, whose contents are not given.

`lw rd, 24(x0)`: `TA = 24`, `DW = 24 >> 3 = 3` → **cell 3**. Bit 2 of `24` (`0b11000`) = 0 → select the **lower** word = `0xDEADBEEF`.

0xDEADBEEF  has bit 31 = 1 (negative), so sign-extend:
result = 0xFFFFFFFF_DEADBEEF

If instead we did `lw rd, 28(x0)`: `DW = 28>>3 = 3`, bit 2 of `28` (`0b11100`) = 1 → upper word `0xCAFEF00D`, bit 31 = 1 → result `0xFFFFFFFF_CAFEF00D`.

Problem 3: Store word read-modify-write¶

Cell at DW address 2 holds 0x_00000000_FFFFFFFF. Execute sw rs2, 16(x0) with rs2 = 0x_0000_0000_8000_0001. Show the assembled Din.

Solution:

TA = 16, DW = 16 >> 3 = 2 (cell 2). bit 2 of 16 (0b10000) = 0 -> replace LOWER word.

D64cur = 0x00000000_FFFFFFFF
  W1 = cur[63:32] = 0x00000000   (preserve)
  W0 = cur[31:0]  = 0xFFFFFFFF   (will be replaced)

Wnew = rs2[31:0] = 0x80000001

assembled = merge(Wnew, W1) = [ W1 | Wnew ] = 0x00000000_80000001
Din = 0x0000000080000001   (only lower word changed)

Note the upper word's `0x00000000` is preserved exactly because we read it back from `Dout` (which requires `ld = 1` during the store).

Problem 4: Why both `str` and `ld` for `sw`?¶

Explain in one or two sentences why sw must assert the RAM ld line in addition to str, while sd only needs str.

Solution:

`sw` only changes 32 of the 64 bits in a cell, so it must **read** the current cell contents (`ld = 1` drives `Dout`) to preserve the *other* 32 bits, then **write** the re-assembled 64-bit value (`str = 1`). `sd` overwrites the entire 64-bit cell, so there is nothing to preserve and no read is needed — `str` alone suffices.

Problem 5: Control-line table¶

Fill in M2R, MLD, MST, and MSZ for lb, sb, and lw.

Solution:

| Instr | `M2R` | `MLD` | `MST` | `MSZ` |
|-------|-------|-------|-------|-------|
| `lb` | 1 | 1 | 0 | 00 |
| `sb` | 0 | 1 | 1 | 00 |
| `lw` | 1 | 1 | 0 | 10 |

`lb` is a load → write memory back to register (`M2R=1`), read enabled (`MLD=1`), no write (`MST=0`), byte size (`00`). `sb` is a sub-word store → read-modify-write so **both** `MLD=1` and `MST=1`, not a register write-back (`M2R=0`). `lw` mirrors `lb` but with word size (`10`).

Problem 6: Spot the bug¶

A student's lw returns correct values for offsets 0, 8, 16, ... but wrong (off-by-one-word) values for offsets 4, 12, 20, .... The address splitter is correct. What is the most likely wiring mistake?

Solution:

The **word-select MUX** is not being driven by **byte-address bit 2** (`TA[2]`); it is effectively stuck selecting the lower word. Offsets that are multiples of 8 have bit 2 = 0 (lower word), so they happen to work; offsets `4, 12, 20` have bit 2 = 1 and need the **upper** word, but get the lower word of the same cell instead. The usual causes are a selector wired to `TA[0]` or `TA[1]` (both always 0 for 4-byte-aligned addresses) or a selector left tied to constant 0. Other slips produce different symptoms: a selector wired to bit 3 would also break offsets `8, 24, ...`, and swapped MUX inputs would break every offset. Fix: drive the MUX select from `TA[2]`, with input 0 = `Dout[31:0]` (lower) and input 1 = `Dout[63:32]` (upper).

Summary¶

Data memory is a 64-bit-wide Digital RAM (data bits = 64, addr bits = 7 → 1024 bytes); keep the RAM component at the top level so you can inspect it while simulating.
Registers hold byte addresses, but the RAM is indexed by doubleword addresses. Shift the ALU's target address right by 3 (take bits [9:3]) with a splitter to get the 7-bit RAM A input.
The low address bits are selectors, not part of the RAM address: bit 2 chooses the 32-bit word (lower vs. upper) within a cell, and bits [1:0] choose the byte within a word.
ld/sd come almost for free with a 64-bit RAM: ALU → splitter → RAM, RD1 → Din, and route Dout back through an M2R/WDsel MUX to the register file.
lw adds load logic after the RAM: split Dout into two 32-bit halves, pick one with bit 2, sign-extend 32 → 64, then select width with the MSZ MUX.
sw adds store logic before the RAM and is a read-modify-write: assert both str and ld, preserve the untouched half, merge in RD1[31:0], choose with bit 2, and feed the MSZ store MUX into Din.
New control lines MLD, MST, MSZ, and M2R extend the decoder spreadsheet; the MSZ encoding (00/10/11 for byte/word/dword) follows the RISC-V funct3 bits to keep decode logic simple.
lb/sb follow the same pattern at byte granularity, reusing the selector hierarchy (bit 2 for word, bits [1:0] for byte) and the same MUX/sign-extender structures.
