Lab: Data Memory — LW and SW¶
Overview¶
This hands-on lab session adds data memory to the single-cycle RISC-V processor we have been building across Lab 10 and Project 6. We use a Digital RAM component configured with 64-bit words so that ld/sd (load/store doubleword) work almost for free. We then wrap the RAM with load logic (after the RAM) and store logic (before the RAM) so the same 64-bit memory can also service lw/sw (word) and, by the same pattern, lb/sb (byte). The central idea is address arithmetic: the ALU produces a 64-bit byte address, but the RAM is indexed by doubleword address, so we shift right by 3, then use the low address bits to pick which sub-word inside a 64-bit cell we are touching. This page walks through the datapath, the memory-layout intuition, the control lines (MLD, MST, MSZ, M2R), and the step-by-step circuit construction with worked examples and common mistakes.
Learning Objectives¶
- Configure the Digital RAM (Separated Ports) component for use as processor data memory and state its capacity.
- Convert a 64-bit byte address (from the ALU) into the doubleword address the RAM expects.
- Identify which low address bits select a word vs. a byte within a 64-bit memory cell.
- Wire the ld/sd datapath: ALU → address splitter → RAM, and RAM Dout → register file write-back MUX.
- Build the load logic for lw: split the 64-bit Dout into two 32-bit halves, select with address bit 2, and sign-extend to 64 bits.
- Build the store logic for sw: read-modify-write a 64-bit cell so only the targeted 32-bit half is updated.
- Add and set the new control lines (MLD, MST, MSZ, M2R) in the instruction decoder spreadsheet.
- Recognize and avoid common wiring/control mistakes (wrong shift amount, misaligned selector bit, forgetting str + ld together for stores).
Prerequisites¶
- A working Project 6 processor through JAL/JALR and conditional branches.
- RISC-V instruction formats, especially I-type (loads) and S-type (stores) (Project 4 / Lab 03).
- Byte addressing, words vs. doublewords, little-endian layout (Project 3).
- Sign extension, shifting, and masking (Project 3, Lab 09).
- Multiplexers, splitters, mergers, and the RAM component in Digital (Lab 09, Project 5).
- The register-file write-back path and WDsel MUX from Lab 10 / Project 6.
1. Where Data Memory Fits¶
So far our processor can fetch instructions, decode them, read/write registers, run the ALU, and change control flow with jumps and branches. The last missing capability is touching memory: loading values from and storing values to RAM. Our test programs use this for the stack (saving ra and locals), but the same memory could back a heap.
RISC-V memory instructions come in matched load/store pairs, organized by data size:
| Mnemonic | Meaning | Size | Format | Action |
|---|---|---|---|---|
| ld | load doubleword | 64-bit | I-type | rd = mem[rs1 + imm] |
| sd | store doubleword | 64-bit | S-type | mem[rs1 + imm] = rs2 |
| lw | load word | 32-bit | I-type | rd = sext(mem32[rs1 + imm]) |
| sw | store word | 32-bit | S-type | mem32[rs1 + imm] = rs2[31:0] |
| lb | load byte | 8-bit | I-type | rd = sext(mem8[rs1 + imm]) |
| sb | store byte | 8-bit | S-type | mem8[rs1 + imm] = rs2[7:0] |
Lab strategy (incremental): get ld/sd working first with a plain 64-bit RAM, then add the wrapper logic for lw/sw. Once lw/sw work, lb/sb follow the exact same pattern with a narrower slice. Do not try to build everything at once.
flowchart LR
A["Register File<br/>RD0 = base addr"] --> B["ALU<br/>TA = base + imm"]
I["ImmDecoder<br/>imm-i / imm-s"] --> B
B --> C["Addr splitter<br/>byte to DW addr"]
C --> D["RAM<br/>64-bit cells"]
R1["RD1 = store value"] --> D
D --> E["Load logic<br/>select + sign-extend"]
E --> F["Write-back MUX<br/>M2R / WDsel"]
F --> A
style D fill:#f9f,stroke:#333,stroke-width:2px
style C fill:#bbf,stroke:#333,stroke-width:2px
2. The Digital RAM Component¶
We use Digital's RAM, Separated Ports component. "Separated ports" means it has distinct inputs for the address, the data to write (Din), and a separate data output (Dout) — exactly what a processor wants.
Configure it as:
| Setting | Value | Why |
|---|---|---|
| Data bits | 64 | One cell holds a full RISC-V doubleword, so ld/sd are trivial. |
| Addr bits | 7 | 2^7 = 128 cells × 8 bytes = 1024 bytes of data memory. |
The component's pins, as drawn in the lab notes:
+-----------------------+
A --/-->| A (addr) Dout |--/--> 64 bits out
7 | | 64
| |
Din --/--> | Din RAM |
64 | |
MST -----> | str |
| |
CLK -----> | clk |
| |
MLD -----> | ld |
+-----------------------+
data bits: 64
addr bits: 7
| Pin | Driven by | Meaning |
|---|---|---|
| A | address splitter output | doubleword address (7 bits) of the cell to access |
| Din | store logic | 64-bit value to write |
| Dout | (output) | 64-bit value read from cell A |
| str | MST control line | when 1, write Din into cell A on the clock edge |
| clk | processor CLK | synchronous write timing |
| ld | MLD control line | when 1, drive Dout with the contents of cell A |
Keep the RAM at the top level. A critical lab tip: even though you will wrap the RAM with extra logic, leave the RAM component itself at the top level of your processor circuit. That is the only way Digital lets you open the RAM during simulation to inspect (and pre-load) its contents while debugging. Put the load logic after it and the store logic before it, but do not bury the RAM inside a sub-circuit.
3. Byte Addresses vs. Doubleword Addresses¶
This is the single most important idea in the lab, and it caused the most confusion in class.
Registers always hold byte addresses. When a program computes sp + 8, that is a byte address. But our RAM is an array of 64-bit (8-byte) cells, and its A input selects a cell, not a byte. So we must convert.
A byte address divided by 8 gives the doubleword (DW) address:
DW address = floor(byte address / 8) = byte address >> 3
Dividing by 8 just drops the low 3 bits. In hardware we do not need a divider — we use a splitter to take bits [9:3] of the 64-bit ALU output and feed those 7 bits to the RAM A input. The low 3 bits [2:0] are the offset inside the cell and are not sent to the RAM address; instead they tell us which sub-word we are touching.
byte addr bit: ... 9 8 7 6 5 4 3 | 2 | 1 0
\________ DW addr ______/ |word|byte|
(-> RAM A, 7 bits) |sel | sel |
The notes drew this as the ALU output TA (64 bits, a byte address) entering a splitter labeled 3-9, whose 7-bit output drives the RAM A input.
+-----+
RD0 (rs1) ----->| A |
base addr | ALU | TA +--------+ A
| B |--/--->| 3 - 9 |--/--> (RAM addr)
imm-i/imm-s --->+-----+ 64 +--------+ 7
byte splitter DW addr
addr
The three address "views"¶
The class drew the same memory three ways at once. The picture below mirrors that "memory layout" diagram: the same storage seen as raw bytes, as 32-bit words, and as 64-bit doublewords.
byte addr word addr dword addr
---------- --------- ----------
...
16 --* (word 4) (dword 2)
15
14 } word 3 ----,
13 } dword 1
12 ----------'
11
10 } word 2 ---,
9 } dword 1
8 -----------'
7
6 } word 1 ---,
5 } dword 0
4 -----------'
3
2 } word 0 ---,
1 } dword 0
0 -----------'
Reading the relationships off the diagram:
| Address kind | Conversion from byte address | Low bits used as offset |
|---|---|---|
| byte | (itself) | (none) |
| word | byte >> 2 | bits [1:0] = byte within word |
| doubleword | byte >> 3 | bits [2:0] = byte within dword |
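The conversions in this table can be checked in software. The sketch below is only an illustration: `split_address` is a hypothetical helper name, not part of the lab, and it assumes the 7-bit RAM address described above.

```python
def split_address(ta: int) -> tuple[int, int, int]:
    """Split a byte address into (RAM cell index, word selector, byte selector)."""
    dw_addr = (ta >> 3) & 0x7F   # bits [9:3]: the 7-bit doubleword address
    word_sel = (ta >> 2) & 0x1   # bit 2: lower (0) or upper (1) 32-bit word
    byte_sel = ta & 0x3          # bits [1:0]: byte within the selected word
    return dw_addr, word_sel, byte_sel


print(split_address(1016))  # (127, 0, 0)
print(split_address(1020))  # (127, 1, 0)
print(split_address(13))    # (1, 1, 1)
```

Byte addresses 1016 and 1020 land in the same cell (127) and differ only in the word selector, which is the relationship the layout diagram shows.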
Selector bits, color-coded as in the notes¶
The handwritten layout used colored boxes to show how the low address bits act as selectors:
- Word selector (green) = byte address bit 2. Within one 64-bit cell there are two 32-bit words: bit 2 = 0 selects the lower word (bytes 0–3), bit 2 = 1 selects the upper word (bytes 4–7).
- Byte selector (yellow) = byte address bits [1:0]. Within one 32-bit word there are four bytes; bits [1:0] pick which.
64-bit cell = [ byte7 byte6 byte5 byte4 | byte3 byte2 byte1 byte0 ]
\________ upper word _____/ \_______ lower word ____/
(bit 2 = 1) (bit 2 = 0)
\__ b[1:0] pick byte __/ \__ b[1:0] pick byte __/
Common mistake. Students shifted by 2 (word divide) instead of 3 (doubleword divide) when feeding the RAM A input. Because the RAM holds 64-bit cells, the address must be a doubleword address: shift right by 3, take bits [9:3].
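To see why the wrong shift amount matters, here is a small sketch comparing the two splitter choices. The function names are made up for illustration; the "wrong" version models a splitter that takes bits [8:2] instead of [9:3].

```python
def ram_index_correct(ta: int) -> int:
    return (ta >> 3) & 0x7F  # bits [9:3]: doubleword address


def ram_index_wrong(ta: int) -> int:
    return (ta >> 2) & 0x7F  # bits [8:2]: the word-divide mistake


# Byte addresses 1016 and 1020 are the two words of the same 64-bit cell.
for ta in (1016, 1020):
    print(ta, ram_index_correct(ta), ram_index_wrong(ta))
# 1016 127 126
# 1020 127 127
```

With the wrong shift, the two halves of one doubleword are sent to two different RAM cells, so a value stored with sd cannot be read back correctly.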
4. Step 1 — Implementing ld / sd¶
Start with the easy case: 64-bit loads and stores. No sub-word logic is needed because the RAM cell is a doubleword.
Datapath¶
RD0 (rs1) --+
base addr | +-----+
+-->| A | TA +-------+ A +---------+ Dout +-----------+
| ALU |--/--->| 3 - 9 |--/---->| RAM |--/--->| WDsel MUX |--> RegFile WD
imm-i ------->| B | 64 +-------+ 7 | 64-bit | 64 +-----------+
(ld, I-type) +-----+ | cells |
imm-s -------> (for sd, B = imm-s) | |
RD1 ----> | Din |
(sd store val) | |
MST ----> | str |
CLK ----> | clk |
MLD ----> | ld |
+---------+
Step-by-step:
- Compute the target address in the ALU. TA = base + imm. For ld the offset is the I-type immediate; for sd it is the S-type immediate. Both reduce to "base register + signed offset", an ALU add.
- Convert byte → DW address. Route TA through a splitter taking bits [9:3] (7 bits) to the RAM A input.
- For sd: connect RD1 (the value of rs2) to Din, and assert MST = 1 (str). The write happens on the clock edge.
- For ld: assert MLD = 1 (ld) so Dout reflects cell A. Route Dout (64 bits) back to the register file write data through the write-back MUX.
Write-back MUX (M2R)¶
Loads must write the memory output into the register file, not the ALU result. The notes show two equivalent options:
- Expand the existing WDsel MUX to add a memory input, or
- Add a new 2-input MUX (M2R, "memory to register") that selects between the ALU result and RAM Dout; its output feeds the existing WDsel MUX.
flowchart LR
ALU["ALU result"] --> M2R{"M2R MUX"}
RAM["RAM Dout"] --> M2R
M2R --> WD["WDsel MUX"]
PC4["PC + 4"] --> WD
WD --> RF["RegFile WD"]
style M2R fill:#bbf,stroke:#333,stroke-width:2px
Worked example: sd then ld on the stack¶
main:
li sp, 1024 # initialize stack pointer (byte address)
li t0, 0xABCD # value to store
sd t0, -8(sp) # mem[sp-8] = t0
ld t1, -8(sp) # t1 = mem[sp-8] -> t1 == 0xABCD
unimp
Trace of the sd t0, -8(sp):
| Step | Value |
|---|---|
| base = sp | 1024 (byte addr) |
| imm-s | -8 |
| TA = 1024 + (-8) | 1016 (byte addr) |
| DW addr = 1016 >> 3 | 127 → RAM A = 0b1111111 |
| Din = RD1 = t0 | 0xABCD |
| MST | 1 (write on clock edge) |
The matching ld recomputes TA = 1016, the same DW addr 127, asserts MLD, and Dout = 0xABCD flows back through M2R/WDsel into t1.
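The same trace can be replayed with a toy behavioral model. This is a sketch under simplifying assumptions: `DataRam` and its method names are hypothetical, and clocking and the MLD/MST lines are abstracted away into plain method calls.

```python
MASK64 = (1 << 64) - 1


class DataRam:
    """Toy model of the 128-cell, 64-bit RAM (doubleword accesses only)."""

    def __init__(self) -> None:
        self.cells = [0] * 128  # 2^7 cells

    def sd(self, base: int, imm: int, value: int) -> None:
        ta = (base + imm) & MASK64                      # ALU add: byte address
        self.cells[(ta >> 3) & 0x7F] = value & MASK64   # splitter [9:3], then write

    def ld(self, base: int, imm: int) -> int:
        ta = (base + imm) & MASK64
        return self.cells[(ta >> 3) & 0x7F]


ram = DataRam()
sp = 1024
t0 = 0xABCD
ram.sd(sp, -8, t0)    # sd t0, -8(sp)
t1 = ram.ld(sp, -8)   # ld t1, -8(sp)
print(hex(t1))        # 0xabcd
```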
5. Step 2 — Load Word (lw): Load Logic After the RAM¶
For lw the RAM still hands us a full 64-bit cell, but the program wants only 32 bits, sign-extended back to 64. The extra circuitry sits after the RAM Dout and is gated by the memory-size control line MSZ.
The four sub-steps¶
- Read the 64-bit cell. Same as ld: compute TA, shift to DW addr, assert MLD. (lw targets are 4-byte aligned, i.e. byte addresses that are multiples of 4.)
- Split Dout into two 32-bit halves. Use a splitter: bits [31:0] = lower word W0, bits [63:32] = upper word W1.
- Select the right half with byte-address bit 2. Feed W0 and W1 into a 2-input MUX whose selector is TA bit 2 (the green "word selector", written 2-2 in the notes). Bit 2 = 0 → lower word; bit 2 = 1 → upper word.
- Sign-extend 32 → 64. The chosen 32-bit word goes through a sign extender to a 64-bit value. This is the lw candidate.
Finally, an outer MSZ MUX selects which width to write back:
- MSZ = 0b00 → the lb value (byte, sign-extended)
- MSZ = 0b10 → the lw value (word, sign-extended)
- MSZ = 0b11 → the ld value (the raw 64-bit Dout)
These encodings match the low two bits of the RISC-V funct3 field (lb = 000, lw = 010, ld = 011), so the control bits fall out naturally.
+-----+ Dout (64)
RAM ---------------->| |----+------------------------------+
+-----+ | |
| split | (ld path: full 64)
v v
[63:32]=W1 [31:0]=W0 +---------------+
| \ / | |
bit 2 | \ / | MSZ MUX |
(word | \ / | 00 = lb val |--> to M2R / WDsel
select) | +------+ 32 +------+ | 10 = lw val |
+-->| WSMUX|--/--->| Sign |->|11 = ld(64) |
+------+ | Ext | | |
+------+ +---------------+
64
flowchart TD
RAM["RAM Dout (64)"] --> SPLIT["Splitter"]
SPLIT --> W0["W0 = Dout[31:0]"]
SPLIT --> W1["W1 = Dout[63:32]"]
W0 --> WS{"Word-select MUX<br/>(sel = TA bit 2)"}
W1 --> WS
WS --> SX["Sign-extend 32 to 64"]
SX --> MSZ{"MSZ MUX"}
RAM --> MSZ
MSZ --> OUT["to M2R / WDsel"]
style WS fill:#bbf,stroke:#333,stroke-width:2px
style SX fill:#bfb,stroke:#333,stroke-width:2px
style MSZ fill:#f9f,stroke:#333,stroke-width:2px
Why sign-extend? lw loads a signed 32-bit value into a 64-bit register. If the word's bit 31 is 1 (negative), the upper 32 bits of the register must all be 1. The sign extender copies bit 31 into bits [63:32]. (lwu, an unsigned variant, would zero-extend instead; it is not required for this lab.)
Worked example: lw selecting the upper word¶
Suppose cell at DW addr 0 contains 0x_FFFF8000_00000007 (upper word 0xFFFF8000, lower word 0x00000007).
| Instruction | TA (byte) | bit 2 | selected word | sign-extended result |
|---|---|---|---|---|
| lw rd, 0(x0) | 0 | 0 | 0x00000007 (lower) | 0x0000000000000007 |
| lw rd, 4(x0) | 4 | 1 | 0xFFFF8000 (upper) | 0xFFFFFFFFFFFF8000 |
The second case shows the sign extension in action: bit 31 of 0xFFFF8000 is 1, so the upper half fills with Fs.
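A minimal software sketch of the lw load logic, using the same cell contents as the example. The helper names `sign_extend` and `load_word` are illustrative only; in the circuit these are a splitter, a 2-input MUX, and a sign extender.

```python
def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 64-bit pattern."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):  # top bit of the field is set
        value |= ((1 << 64) - 1) & ~((1 << bits) - 1)
    return value


def load_word(dout: int, ta: int) -> int:
    w0 = dout & 0xFFFFFFFF                # Dout[31:0], lower word
    w1 = (dout >> 32) & 0xFFFFFFFF        # Dout[63:32], upper word
    word = w1 if (ta >> 2) & 1 else w0    # word-select MUX on TA bit 2
    return sign_extend(word, 32)


cell = 0xFFFF8000_00000007
print(hex(load_word(cell, 0)))  # 0x7
print(hex(load_word(cell, 4)))  # 0xffffffffffff8000
```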
6. Step 3 — Store Word (sw): Store Logic Before the RAM¶
Storing a word is harder than loading one. The RAM can only write a whole 64-bit cell, but sw must change only 32 bits and leave the other 32 untouched. The solution is read-modify-write within a single clock cycle.
Key control trick. For sw (and sb), set both the RAM str and ld lines to 1. With ld = 1 the RAM presents the current cell contents on Dout (so we can read the half we must preserve), and with str = 1 it writes our newly assembled Din on the same clock edge.
The read-modify-write recipe¶
- Read the current cell D64cur from Dout (because MLD/ld is asserted).
- Extract the new word Wnew = bits [31:0] of RD1 (the value of rs2).
- Split D64cur into its two halves W0 = D64cur[31:0] and W1 = D64cur[63:32].
- Build two candidate 64-bit values with mergers:
  - replace lower word: merge(Wnew, W1) → [ W1 | Wnew ]
  - replace upper word: merge(W0, Wnew) → [ Wnew | W0 ]
- Choose with the word-index bit (TA bit 2). bit 2 = 0 → "replace lower"; bit 2 = 1 → "replace upper".
- The MSZ store MUX picks between the sb, sw, and sd assembled values, then drives the RAM Din.
RD1 (store value) --split--> Wnew = RD1[31:0]
|
RAM Dout = D64cur --split--> W0=[31:0] W1=[63:32]
| | |
merge(Wnew, W1) = [ W1 | Wnew ] --+--,
merge(W0, Wnew) = [ Wnew | W0 ] ----+--> DWn MUX --+
(sel = TA bit 2) |
v
+-------------+
sd value (full RD1 64) ----------------------------->| MSZ MUX |--> RAM Din
sb value (assembled) ----------------------------->| 00 sb /10 sw|
| 11 sd |
+-------------+
flowchart TD
RD1["RD1 (store value)"] --> WN["Wnew = RD1[31:0]"]
DOUT["RAM Dout = D64cur"] --> S2["Splitter"]
S2 --> W0["W0 = cur[31:0]"]
S2 --> W1["W1 = cur[63:32]"]
WN --> M1["merge(Wnew, W1)"]
W1 --> M1
W0 --> M2["merge(W0, Wnew)"]
WN --> M2
M1 --> DW{"DWn MUX<br/>(sel = TA bit 2)"}
M2 --> DW
DW --> MSZ{"MSZ store MUX"}
RD1 --> MSZ
MSZ --> DIN["RAM Din"]
style DW fill:#bbf,stroke:#333,stroke-width:2px
style MSZ fill:#f9f,stroke:#333,stroke-width:2px
Worked example: sw into the lower half¶
Cell at DW addr 0 currently holds 0x_DEADBEEF_11112222. Execute sw rs2, 0(x0) where rs2 = 0x_0000_0000_0000_0099.
| Step | Value |
|---|---|
| TA (byte) | 0 → bit 2 = 0 → replace lower word |
| D64cur | 0xDEADBEEF11112222 |
| W1 = cur[63:32] (preserve) | 0xDEADBEEF |
| Wnew = rs2[31:0] | 0x00000099 |
| assembled = merge(Wnew, W1) | 0xDEADBEEF00000099 |
| MSZ selects sw value → Din | 0xDEADBEEF00000099 |
Only the lower 32 bits changed; the upper word 0xDEADBEEF was read back from the cell and re-merged, exactly as intended. Had TA been 4, bit 2 = 1 would have chosen merge(W0, Wnew) = 0x0000009911112222, updating only the upper word.
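The read-modify-write recipe can be mirrored in a few lines of software. This is a sketch only: `store_word` is a hypothetical name, and the two candidates correspond to the two mergers feeding the DWn MUX.

```python
def store_word(d64cur: int, rd1: int, ta: int) -> int:
    """Return the 64-bit Din that sw would present to the RAM."""
    wnew = rd1 & 0xFFFFFFFF               # RD1[31:0]
    w0 = d64cur & 0xFFFFFFFF              # current lower word
    w1 = (d64cur >> 32) & 0xFFFFFFFF      # current upper word
    replace_lower = (w1 << 32) | wnew     # [ W1 | Wnew ]
    replace_upper = (wnew << 32) | w0     # [ Wnew | W0 ]
    return replace_upper if (ta >> 2) & 1 else replace_lower  # TA bit 2


cur = 0xDEADBEEF_11112222
print(hex(store_word(cur, 0x99, 0)))  # 0xdeadbeef00000099
print(hex(store_word(cur, 0x99, 4)))  # 0x9911112222
```

The second result prints without leading zeros; as a full 64-bit value it is 0x0000009911112222.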
Common mistake. Forgetting to assert ld during a store. If ld = 0, Dout is not driven with the current contents, so the "preserve" half is garbage and you clobber the neighboring word. For sw/sb you need both str = 1 and ld = 1.
7. The New Control Lines¶
Adding memory means extending the instruction decoder spreadsheet with new control outputs. As in earlier labs, set the new outputs to their inactive value (0) for every existing instruction so behavior does not change.
| Control line | Width | Purpose | Set when... |
|---|---|---|---|
| MLD | 1 | RAM ld (drive Dout) | any load (ld/lw/lb) and any sub-word store (sw/sb) |
| MST | 1 | RAM str (write on edge) | any store (sd/sw/sb) |
| MSZ | 2 | memory size select (b/w/d) | loads and stores; 00=byte, 10=word, 11=dword |
| M2R | 1 | write-back source = memory | any load |
The MSZ encoding follows the RISC-V funct3 low bits so the decoder needs minimal logic:
| Op | size | MSZ |
|---|---|---|
| lb / sb | byte | 0b00 |
| lw / sw | word | 0b10 |
| ld / sd | dword | 0b11 |
Example decoder rows (control outputs shown; x = don't-care input):
| INUM | Instr | Mnem | Format | opcode | funct3 | M2R | MLD | MST | MSZ |
|---|---|---|---|---|---|---|---|---|---|
| ... | ld | MLDD | I | 0000011 | 011 | 1 | 1 | 0 | 11 |
| ... | lw | MLDW | I | 0000011 | 010 | 1 | 1 | 0 | 10 |
| ... | sd | MSTD | S | 0100011 | 011 | 0 | 0 | 1 | 11 |
| ... | sw | MSTW | S | 0100011 | 010 | 0 | 1 | 1 | 10 |
Note. sw sets MLD = 1 and MST = 1 together (read-modify-write), while sd only needs MST = 1 because it overwrites the whole cell.
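The decoder rows above follow a simple rule that can be written out as a sketch. The function name `memory_controls` is hypothetical, and this is not how the lab's spreadsheet decoder is built; it only restates the table's logic. It assumes the six signed instructions in this lab; unsigned loads such as lbu and lwu are not handled.

```python
LOAD_OPCODE = 0b0000011   # I-type loads
STORE_OPCODE = 0b0100011  # S-type stores


def memory_controls(opcode: int, funct3: int) -> dict:
    """Return the four memory control lines for one instruction."""
    is_load = opcode == LOAD_OPCODE
    is_store = opcode == STORE_OPCODE
    if not (is_load or is_store):
        # Every non-memory instruction keeps the new lines inactive.
        return {"M2R": 0, "MLD": 0, "MST": 0, "MSZ": 0}
    msz = funct3 & 0b11                         # low two funct3 bits
    sub_word_store = is_store and msz != 0b11   # sw/sb need read-modify-write
    return {
        "M2R": int(is_load),
        "MLD": int(is_load or sub_word_store),
        "MST": int(is_store),
        "MSZ": msz,
    }


print(memory_controls(0b0100011, 0b010))  # sw
# {'M2R': 0, 'MLD': 1, 'MST': 1, 'MSZ': 2}
```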
8. Putting It All Together (Top-Level View)¶
The lab notes sketched the top level as three blocks around the RAM: a Store block before Din, the RAM itself, and a Load block after Dout. The orange wiring is the store path (RD1 → store logic → Din), and the green wiring is the load path (Dout → load logic → register file).
flowchart LR
subgraph ADDR["Address path"]
ALU["ALU TA (64, byte addr)"] --> SP["splitter [9:3]"]
end
SP --> A["RAM A (7, DW addr)"]
subgraph STORE["Store logic (before RAM)"]
RD1["RD1 = rs2"] --> SL["read-modify-write<br/>+ MSZ store MUX"]
end
DOUT2["RAM Dout (current)"] --> SL
SL --> DIN["RAM Din"]
A --> RAM["RAM 64-bit cells<br/>str=MST ld=MLD"]
DIN --> RAM
RAM --> DOUT["RAM Dout"]
RAM --> DOUT2
subgraph LOAD["Load logic (after RAM)"]
DOUT --> LL["word select (bit 2)<br/>+ sign-extend + MSZ"]
end
LL --> M2R{"M2R MUX"}
ALU --> M2R
M2R --> WD["WDsel MUX"]
WD --> RF["Register File WD"]
style RAM fill:#f9f,stroke:#333,stroke-width:2px
style SL fill:#fc9,stroke:#333,stroke-width:2px
style LL fill:#bfb,stroke:#333,stroke-width:2px
Build order recap:
- Add the RAM (64-bit data, 7 addr bits) at the top level; wire CLK.
- Add the address splitter (TA[9:3] → A).
- Wire ld/sd: RD1 → Din, MST → str, MLD → ld; add the M2R write-back MUX.
- Test ld/sd end-to-end with a stack program before going further.
- Add load logic (split, word-select on bit 2, sign-extend, MSZ MUX) for lw.
- Add store logic (read-modify-write, DWn MUX on bit 2, MSZ MUX) for sw; remember str and ld both = 1.
- Derive lb/sb by adding a byte-level slice (bits [1:0] select the byte) into the same MUX structures.
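When testing each stage of the build, it can help to have a software reference to compare against. The sketch below is one possible behavioral model, not the circuit itself: `DataMemory`, `WIDTH_BITS`, and the method names are all hypothetical, and it uses shifts and masks where the circuit uses splitters, mergers, and MUXes. It assumes aligned accesses.

```python
MASK64 = (1 << 64) - 1
WIDTH_BITS = {0b00: 8, 0b10: 32, 0b11: 64}  # MSZ encoding -> access width


class DataMemory:
    """Behavioral stand-in for the RAM plus its load and store logic."""

    def __init__(self) -> None:
        self.cells = [0] * 128  # 2^7 cells of 64 bits each

    @staticmethod
    def _shift(ta: int, bits: int) -> int:
        # Bit offset of the accessed field inside the cell; address bits
        # below the access size are ignored (aligned accesses assumed).
        nbytes = bits // 8
        return ((ta & 0x7) & ~(nbytes - 1)) * 8

    def load(self, ta: int, msz: int) -> int:
        bits = WIDTH_BITS[msz]
        dout = self.cells[(ta >> 3) & 0x7F]
        field = (dout >> self._shift(ta, bits)) & ((1 << bits) - 1)
        if field >> (bits - 1):  # sign bit set: fill the upper bits with 1s
            field |= MASK64 & ~((1 << bits) - 1)
        return field

    def store(self, ta: int, rd1: int, msz: int) -> None:
        bits = WIDTH_BITS[msz]
        index = (ta >> 3) & 0x7F
        shift = self._shift(ta, bits)
        mask = ((1 << bits) - 1) << shift
        cur = self.cells[index]  # the "read" part of read-modify-write
        self.cells[index] = (cur & ~mask & MASK64) | ((rd1 << shift) & mask)


mem = DataMemory()
mem.store(0, 0xDEADBEEF_11112222, 0b11)  # sd
mem.store(0, 0x99, 0b10)                 # sw into the lower word
print(hex(mem.load(0, 0b11)))            # 0xdeadbeef00000099
print(hex(mem.load(4, 0b10)))            # 0xffffffffdeadbeef
```

Running the same instruction sequence in Digital and comparing register values against this model is one way to localize a wiring mistake to the load or the store side.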
9. Deriving lb / sb¶
The byte instructions reuse everything. The only changes:
- Load (lb): instead of splitting into two 32-bit words, split the selected region down to a single byte using byte-address bits [1:0] (within a word) plus bit 2 (which word). The chosen 8-bit value goes through an 8→64 sign extender, then into the MSZ MUX at 0b00.
- Store (sb): read-modify-write at byte granularity: preserve the other 7 bytes of the cell, replace just the target byte selected by bits [2:0], route through the MSZ store MUX at 0b00. As with sw, assert both str and ld.
The selector hierarchy is exactly the color-coded picture from Section 3: bit 2 picks the word, bits [1:0] pick the byte within it.
64-bit cell selected by A (= byte >> 3)
|
+-- bit 2 picks 32-bit word (lw / sw)
|
+-- bits [1:0] pick byte within word (lb / sb)
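The selector hierarchy can be sketched in software as well. The helper names `load_byte` and `store_byte` are illustrative and not part of the lab; the load follows the two-level selection (bit 2, then bits [1:0]) and the store replaces one byte and preserves the other seven.

```python
MASK64 = (1 << 64) - 1


def load_byte(dout: int, ta: int) -> int:
    """lb: pick the word with bit 2, the byte with bits [1:0], then sign-extend."""
    word = (dout >> 32) if (ta >> 2) & 1 else dout
    word &= 0xFFFFFFFF
    byte = (word >> (8 * (ta & 0x3))) & 0xFF
    if byte & 0x80:                       # bit 7 set: negative byte
        byte |= 0xFFFFFFFFFFFFFF00
    return byte


def store_byte(d64cur: int, rd1: int, ta: int) -> int:
    """sb: return the Din that replaces one byte and preserves the other seven."""
    shift = 8 * (ta & 0x7)                # bits [2:0] locate the byte in the cell
    mask = 0xFF << shift
    return (d64cur & ~mask & MASK64) | ((rd1 & 0xFF) << shift)


cell = 0x88776655_44332211
print(hex(load_byte(cell, 5)))           # 0x66
print(hex(load_byte(cell, 7)))           # 0xffffffffffffff88
print(hex(store_byte(cell, 0xAB, 2)))    # 0x8877665544ab2211
```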
Key Concepts¶
| Concept | Definition | Example |
|---|---|---|
| Byte address | An address that counts individual bytes; what registers hold | sp + 8 is a byte address |
| Doubleword (DW) address | A byte address divided by 8; indexes a 64-bit RAM cell | 1016 >> 3 = 127 |
| Address splitter | Splitter taking TA[9:3] to produce the 7-bit RAM A input | 1016 (byte) → 127 (DW) |
| Word selector | Byte-address bit 2; picks lower vs. upper 32-bit word in a cell | bit 2 = 1 → upper word |
| Byte selector | Byte-address bits [1:0]; picks one of 4 bytes in a word | [1:0] = 0b10 → byte 2 |
| Sign extension | Replicating bit 31 (or bit 7) into the upper bits for signed loads | 0xFFFF8000 → 0xFFFFFFFFFFFF8000 |
| Read-modify-write | Read a cell, change part of it, write it back in one cycle | sw updating one 32-bit half |
| MSZ | Memory-size control: 00=byte, 10=word, 11=dword | lw uses MSZ = 0b10 |
| M2R | Mux control selecting RAM output (not ALU) as write-back data | set for every load |
| MLD / MST | RAM ld / str enables; sub-word stores assert both | sw: MLD=1, MST=1 |
Practice Problems¶
Problem 1: Byte → Doubleword address¶
A program executes ld t0, 24(s0) with s0 = 1000. What byte address does the ALU compute, and what 7-bit value appears on the RAM A input?
Click to reveal solution
The ALU computes `TA = 1000 + 24 = 1024`, and `1024 >> 3 = 128`. But `128` does not fit in 7 bits (`0..127`). With only 7 address bits the RAM has 128 cells (DW addresses `0..127`), so `128` is out of range. In practice the splitter takes `TA[9:3]`, and bit 10 (worth `1024` in byte terms = DW `128`) is dropped, wrapping to `A = 0`. **Takeaway:** 7 address bits give exactly 1024 bytes of memory; a stack initialized to `1024` should grow *downward* (`addi sp, sp, -N`) so accesses land in range. For an in-range example, `ld t0, 16(s0)` with `s0 = 1000` gives `TA = 1016`, `DW = 1016 >> 3 = 127`, so `A = 0b1111111`.
Problem 2: Which word does lw select?¶
A cell at DW address 3 holds 0x_CAFEF00D_DEADBEEF. The processor executes lw rd, 4(x0)... then lw rd, 24(x0). For each, give the byte address, the value of bit 2, the selected 32-bit word, and the 64-bit register result.
Click to reveal solution
`lw rd, 4(x0)`: `TA = 4`, `DW = 4 >> 3 = 0`, so this access lands in cell 0, not cell 3. Bit 2 of `4` = 1, so it would select the **upper** word of *cell 0*, whose contents are not given.
`lw rd, 24(x0)`: `TA = 24`, `DW = 24 >> 3 = 3` → **cell 3**. Bit 2 of `24` (`0b11000`) = `0` → select the **lower** word = `0xDEADBEEF`. Bit 31 = 1, so the sign-extended result is `0xFFFFFFFF_DEADBEEF`.
If instead we did `lw rd, 28(x0)`: `DW = 28 >> 3 = 3`, bit 2 of `28` (`0b11100`) = 1 → upper word `0xCAFEF00D`, bit 31 = 1 → result `0xFFFFFFFF_CAFEF00D`.
Problem 3: Store word read-modify-write¶
Cell at DW address 2 holds 0x_00000000_FFFFFFFF. Execute sw rs2, 16(x0) with rs2 = 0x_0000_0000_8000_0001. Show the assembled Din.
Click to reveal solution
TA = 16, DW = 16 >> 3 = 2 (cell 2). bit 2 of 16 (0b10000) = 0 -> replace LOWER word.
D64cur = 0x00000000_FFFFFFFF
W1 = cur[63:32] = 0x00000000 (preserve)
W0 = cur[31:0] = 0xFFFFFFFF (will be replaced)
Wnew = rs2[31:0] = 0x80000001
assembled = merge(Wnew, W1) = [ W1 | Wnew ] = 0x00000000_80000001
Din = 0x0000000080000001 (only lower word changed)
Problem 4: Why both str and ld for sw?¶
Explain in one or two sentences why sw must assert the RAM ld line in addition to str, while sd only needs str.
Click to reveal solution
`sw` only changes 32 of the 64 bits in a cell, so it must **read** the current cell contents (`ld = 1` drives `Dout`) to preserve the *other* 32 bits, then **write** the re-assembled 64-bit value (`str = 1`). `sd` overwrites the entire 64-bit cell, so there is nothing to preserve and no read is needed; `str` alone suffices.
Problem 5: Control-line table¶
Fill in M2R, MLD, MST, and MSZ for lb, sb, and lw.
Click to reveal solution
| Instr | `M2R` | `MLD` | `MST` | `MSZ` |
|-------|-------|-------|-------|-------|
| `lb` | 1 | 1 | 0 | 00 |
| `sb` | 0 | 1 | 1 | 00 |
| `lw` | 1 | 1 | 0 | 10 |

`lb` is a load → write memory back to register (`M2R=1`), read enabled (`MLD=1`), no write (`MST=0`), byte size (`00`). `sb` is a sub-word store → read-modify-write so **both** `MLD=1` and `MST=1`, not a register write-back (`M2R=0`). `lw` mirrors `lb` but with word size (`10`).
Problem 6: Spot the bug¶
A student's lw returns correct values for offsets 0, 8, 16, ... but wrong (off-by-one-word) values for offsets 4, 12, 20, .... The address splitter is correct. What is the most likely wiring mistake?
Click to reveal solution
The **word-select MUX** is not being driven by **byte-address bit 2** (`TA[2]`). Offsets that are multiples of 8 all have bit 2 = 0 (lower word), so they work even when the selector is stuck at 0; offsets `4, 12, 20` have bit 2 = 1 and need the **upper** word. If the selector is tied to a constant 0, left unconnected, or wired to a bit that is always 0 for aligned word addresses (such as `TA[0]` or `TA[1]`), the MUX always returns the lower word and exactly these cases break. Swapped MUX inputs or a selector taken from bit 3 would also break some multiple-of-8 offsets, so they do not match this symptom. Fix: drive the MUX select from `TA[2]`, with input 0 = `Dout[31:0]` (lower) and input 1 = `Dout[63:32]` (upper).
Further Reading¶
- Course guide: Processor Design Part 3 — Data Memory (see the course Guides page)
- RISC-V Unprivileged ISA Specification — load/store instruction encodings
- Digital simulator documentation — RAM components, splitters, and mergers
- Patterson & Hennessy, Computer Organization and Design (RISC-V Edition) — Chapter 4, the single-cycle datapath and memory access stage
- Project 6 specification: /assignments/project06/
- Source notes (PDF): /notes/CS315-01 2025-11-12 Lab Data Memory LW_SW.pdf
Summary¶
- Data memory is a 64-bit-wide Digital RAM (data bits = 64, addr bits = 7 → 1024 bytes); keep the RAM component at the top level so you can inspect it while simulating.
- Registers hold byte addresses, but the RAM is indexed by doubleword addresses. Shift the ALU's target address right by 3 (take bits [9:3]) with a splitter to get the 7-bit RAM A input.
- The low address bits are selectors, not part of the RAM address: bit 2 chooses the 32-bit word (lower vs. upper) within a cell, and bits [1:0] choose the byte within a word.
- ld/sd come almost for free with a 64-bit RAM: ALU → splitter → RAM, RD1 → Din, and route Dout back through an M2R/WDsel MUX to the register file.
- lw adds load logic after the RAM: split Dout into two 32-bit halves, pick one with bit 2, sign-extend 32 → 64, then select width with the MSZ MUX.
- sw adds store logic before the RAM and is a read-modify-write: assert both str and ld, preserve the untouched half, merge in RD1[31:0], choose with bit 2, and feed the MSZ store MUX into Din.
- New control lines MLD, MST, MSZ, and M2R extend the decoder spreadsheet; the MSZ encoding (00/10/11 for byte/word/dword) follows the RISC-V funct3 bits to keep decode logic simple.
- lb/sb follow the same pattern at byte granularity, reusing the selector hierarchy (bit 2 for word, bits [1:0] for byte) and the same MUX/sign-extender structures.