struct foo_st { char c; int a; int64_t b; };
int main(void) {
printf("c at %zu\n", offsetof(struct foo_st, c)); // 0
printf("a at %zu\n", offsetof(struct foo_st, a)); // 4
printf("b at %zu\n", offsetof(struct foo_st, b)); // 8
printf("size = %zu\n", sizeof(struct foo_st)); // 16
}
```
Always verify struct offsets with offsetof before writing assembly. Wrong offsets = wrong data loaded silently.
---
## All Pointers Are 8 Bytes
On RV64, every pointer is a 64-bit value — **8 bytes** — regardless of what it points to.
```c
int *ip; // 8 bytes
char *cp; // 8 bytes
struct foo_st *fp; // 8 bytes
```
- Always load/store pointers with `ld` / `sd`
- A `next` pointer in a linked list node is **always** 8 bytes
---
## What Is a Linked List?
An **array** stores elements contiguously. A **linked list** stores each element in a separate node connected by pointers.
flowchart LR
head(["head"]) --> N0["x | next"]
N0 --> N1["x | next"]
N1 --> N2["x | next"]
N2 --> NULL["NULL (0)"]
- `head` points to the first node
- Each `next` pointer points to the following node
- Last node's `next` = `NULL` (integer `0`) — marks the end
- Can only traverse **forward**
---
## Singly vs Doubly Linked Lists
flowchart LR
NA["NULL"] -.-> A["x | prev/next"]
A -- next --> B["x | prev/next"]
B -- prev --> A
B -- next --> C["x | prev/next"]
C -- prev --> B
C -- next --> NB["NULL"]
| Operation | Singly | Doubly |
|-----------|--------|--------|
| Forward traversal | O(n) | O(n) |
| Backward traversal | Not possible | O(n) |
| Remove given node | O(n) | O(1) |
| Memory per node | 1 pointer (8 B) | 2 pointers (16 B) |
---
## A Linked-List Node in C
```c
struct node_st {
struct node_st *next_p; // offset 0 (pointer, 8 bytes)
int value; // offset 8 (int, 4 bytes)
};
// total size: 16 bytes (4 bytes tail padding)
```
Putting `next_p` first is convenient:
- `ld next, (node)` — no offset needed to advance
- `lw val, 8(node)` — value at offset 8
---
## Building a List on the Stack
```c
struct node_st n0, n1, n2, n3;
n0.value = 11; n0.next_p = &n1;
n1.value = 22; n1.next_p = &n2;
n2.value = 33; n2.next_p = &n3;
n3.value = 44; n3.next_p = NULL;
struct node_st *head = &n0;
```
```text
+-------------+
| 44 NULL | n3 (value=44, next_p=NULL)
+-------------+
+--> | 33 .---> n3 | n2
| +-------------+
+--> | 22 .---> n2 | n1
| +-------------+
head --> | 11 .---> n1 | n0
+-------------+
```
---
## Traversal Loop in C
```c
int count(struct node_st *p) {
int n = 0;
while (p != NULL) {
n = n + 1;
p = p->next_p; // advance: load next pointer
}
return n;
}
```
Three parts every traversal loop needs:
1. **Termination test**: `p != NULL`
2. **Body**: process current node's fields
3. **Advance**: `p = p->next_p` — load the next pointer
---
## Count: RISC-V Assembly
```asm
# int count(struct node_st *p)
# a0 = p (pointer to current node)
# node_st layout: next_p @ 0, value @ 8
count:
li t0, 0 # t0 = n = 0
count_loop:
beq a0, zero, count_done # if p == NULL, stop
addi t0, t0, 1 # n = n + 1
ld a0, 0(a0) # p = p->next_p (advance)
j count_loop
count_done:
mv a0, t0 # return value in a0
ret
```
- `beq a0, zero` tests for `NULL` (NULL = integer 0 = `zero` register)
- `ld a0, 0(a0)` loads `next_p` at offset 0 and overwrites `a0`
---
## Load Sizes Must Match Field Types
In `count` and all list traversals:
| Field | C type | Size | Instruction |
|-------|--------|------|-------------|
| `next_p` | pointer | 8 bytes | `ld` |
| `value` | `int` | 4 bytes | `lw` |
Using lw to load a pointer, or ld to load an int, is a classic bug. The hardware loads whatever you say — it won't warn you.
---
## findmaxll: Maximum Value (Project 3)
```asm
# int findmaxll(struct node_st *p)
# next_p @ 0, value @ 8
findmaxll:
lw t1, 8(a0) # max = p->value (seed)
findmax_loop:
beq a0, zero, findmax_done
lw t2, 8(a0) # t2 = p->value
ble t2, t1, findmax_skip # if value <= max, skip
mv t1, t2 # max = value
findmax_skip:
ld a0, 0(a0) # p = p->next_p
j findmax_loop
findmax_done:
mv a0, t1 # return max
ret
```
`lw` for `value` (signed `int`), `ld` for `next_p` (pointer).
---
## Traversal Flowchart
flowchart TD
A["p = head"] --> B{"p == NULL?"}
B -- yes --> E["return max"]
B -- no --> C["lw t2, 8(a0) — read value"]
C --> D["update max if needed"]
D --> F["ld a0, 0(a0) — advance p"]
F --> B
---
## findmaxllp: Printing During Traversal
`findmaxllp` calls `printf` inside the loop — making it a **non-leaf function**.
Requirements beyond `findmaxll`:
- Build a **stack frame** and save `ra`
- Save current node pointer and running max in **callee-saved registers** (`s0`, `s1`, ...) so they survive the `printf` call
- Restore saved registers and free the frame before `ret`
The traversal logic is identical — only calling-convention bookkeeping differs.
---
## Non-Leaf Function Frame
```asm
findmaxllp:
addi sp, sp, -32
sd ra, 24(sp) # save return address
sd s0, 16(sp) # save s0 (will hold p)
sd s1, 8(sp) # save s1 (will hold max)
# ... traversal with printf calls using s0, s1 ...
ld ra, 24(sp)
ld s0, 16(sp)
ld s1, 8(sp)
addi sp, sp, 32
ret
```
Caller-saved registers (t0–t6, a0–a7) are clobbered by printf. Use callee-saved registers (s0–s11) to hold values across calls.
---
## Debugging with GDB
```bash
gdb ./findmaxll
(gdb) break findmaxll
(gdb) run 1 2 3 4 99 5
(gdb) info registers a0 # address of current node
(gdb) x/2gx $a0 # dump next_p (offset 0) and value (offset 8)
(gdb) stepi # single-step instructions
```
What to watch:
- `a0` jumps by **16 bytes** per iteration (padded `node_st` size) if nodes are in an array
- `x/2gx $a0` shows two 8-byte words: `next_p` then `value` (in low 4 bytes)
- When `ld a0, 0(a0)` brings in `0`, the next `beq` exits the loop
---
## Spot the Bug
Node layout: `value @ 0`, `next_p @ 8`
```asm
loop:
beq a0, zero, done
lw t0, 0(a0) # read value — OK
ld a0, 0(a0) # advance? <- BUG
j loop
```
**Problem**: `ld a0, 0(a0)` loads `value` (offset 0) as a pointer instead of `next_p` (offset 8).
```asm
ld a0, 8(a0) # correct: next_p is at offset 8
```
The hardware loads whatever bytes you specify — wrong offset = wrong data.
---
## Practice: Sum All Values
Translate to RISC-V (`next_p @ 0`, `value @ 8`):
```c
int sum(struct node_st *p) {
int total = 0;
while (p != NULL) {
total = total + p->value;
p = p->next_p;
}
return total;
}
```
Key choices: `lw` for `value` (signed `int`), `ld` for `next_p` (pointer), `beq a0, zero` for NULL test.
---
## Practice: Sum — Solution
```asm
# int sum(struct node_st *p)
# a0 = p ; next_p @ 0, value @ 8
sum:
li t0, 0 # total = 0
sum_loop:
beq a0, zero, sum_done # p == NULL -> stop
lw t1, 8(a0) # t1 = p->value (int -> lw)
add t0, t0, t1 # total += value
ld a0, 0(a0) # p = p->next_p (pointer -> ld)
j sum_loop
sum_done:
mv a0, t0 # return total
ret
```
---
## Key Concepts
| Concept | Key Point |
|---------|-----------|
| Load/store sizes | `ld/sd`=8B, `lw/sw`=4B, `lh/sh`=2B, `lb/sb`=1B |
| Sign extension | `lb`/`lh`/`lw` fill high bits from sign bit |
| Zero extension | `lbu`/`lhu`/`lwu` fill high bits with zeros |
| No unsigned stores | Stores copy low N bits; nothing to extend |
| Struct offset | Byte distance from struct start to field |
| Padding | Filler bytes for natural alignment |
| All pointers = 8B | Always `ld`/`sd`, never `lw`/`sw` |
| NULL terminator | Pointer value `0`; test with `beq reg, zero` |
| Traversal pattern | Test NULL → process fields → `ld a0, 0(a0)` |
---
## Summary
1. **Four load/store sizes** match C types: `ld/sd` (8B), `lw/sw` (4B), `lh/sh` (2B), `lb/sb` (1B)
2. **Signed loads sign-extend; unsigned loads (`u` suffix) zero-extend** — match the declared C type or you will read garbage
3. **Stores have no unsigned variant** — they just write the low N bits
4. **Struct fields use byte offsets**; the compiler inserts padding for alignment — always verify with `offsetof` or `gcc -S`
5. **All pointers are 8 bytes** on RV64 — always load/store with `ld`/`sd`
6. **Linked list traversal** = loop: test `beq a0, zero` for NULL, process node fields, advance with `ld a0, 0(a0)`
7. **Non-leaf traversal** (e.g., `findmaxllp`) requires saving `ra` and using callee-saved registers across `printf` calls