# RISC-V Assembly Part 2: Arguments, Arrays, Conditionals, and Loops

## CS 315 Computer Architecture

---

## Learning Objectives

- Classify instructions into **data processing**, **control**, and **memory**
- Understand how arrays live in memory and how to address elements
- Use `lw` with `offset(base)` syntax to read array elements
- Pass scalars and array base addresses as function arguments
- Translate `if`/`else` into conditional branches and jumps
- Translate `for`/`while` loops into a guard branch and back-edge jump
- Choose the correct branch instruction for a given comparison

---

## Three Categories of Instructions

| Category | Purpose | Examples |
|----------|---------|----------|
| **Data processing** | Compute from registers | `add`, `addi`, `sub`, `mul`, `slli` |
| **Control** | Change next instruction | `j`, `beq`, `bne`, `blt`, `bge`, `ret` |
| **Memory** | Move data registers ↔ memory | `lw`, `sw`, `ld`, `sd` |

The processor computes only on **registers**. Data lives in **memory**. Memory instructions bridge the gap.
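As a rough illustration of how the three categories cooperate, the C sketch below increments a value that lives in memory. The function name `increment_in_memory` and the local variable `t0` (standing in for a register) are hypothetical, and the instructions named in the comments are only one plausible translation.

```c
#include <stdint.h>

/* Hypothetical helper: adds 1 to the int that p points to. */
void increment_in_memory(int32_t *p) {
    int32_t t0 = *p;   /* memory instruction: load, roughly lw t0, 0(a0) */
    t0 = t0 + 1;       /* data processing on a register, roughly addi t0, t0, 1 */
    *p = t0;           /* memory instruction: store, roughly sw t0, 0(a0) */
}
```

Even a one-line C statement such as `*p += 1` needs all three steps, because the addition itself can only happen in a register.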
---

## The Load → Compute → Store Rhythm
flowchart LR A["Memory"] -->|"load"| B["Registers"] B -->|"data processing"| B B -->|"store"| A
1. **Load** values from memory into registers
2. **Compute** a new value using data-processing instructions
3. **Store** result back to memory if needed
4. **Control** instructions decide whether to repeat or branch

---

## Arrays in Memory

An `int arr[3] = {1, 2, 3}` occupies **12 contiguous bytes** (4 bytes per `int`).

```text
Address      Value   Element
base + 0  ->   1     arr[0]
base + 4  ->   2     arr[1]
base + 8  ->   3     arr[2]
```

**Key idea:** An array argument is just its **base address**, a pointer to the first element. Adding a byte offset reaches any element.
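The base-plus-byte-offset idea can be checked from C. The sketch below is a minimal illustration, assuming 4-byte elements (`int32_t`); the helper name `element_address` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes the address of element i by hand,
   using byte arithmetic instead of C's built-in pointer scaling. */
const int32_t *element_address(const int32_t *base, size_t i) {
    size_t byte_offset = i * sizeof(int32_t);          /* i * 4 */
    const unsigned char *bytes = (const unsigned char *)base;
    return (const int32_t *)(bytes + byte_offset);     /* base + offset */
}
```

For any in-bounds index, `element_address(arr, i)` should equal `&arr[i]`. C normally hides the multiplication by the element size; assembly makes it explicit.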
---

## Element Address Formula

```text
address of arr[i] = base + (i * sizeof(int))
                  = base + (i * 4)
```

| Element | Index `i` | Byte offset | Address |
|---------|-----------|-------------|---------|
| `arr[0]` | 0 | 0 | `base + 0` |
| `arr[1]` | 1 | 4 | `base + 4` |
| `arr[2]` | 2 | 8 | `base + 8` |
| `arr[3]` | 3 | 12 | `base + 12` |

For `long` or pointer arrays the stride is **8** bytes instead of 4.

---

## Load Word: `lw`

Reads a 32-bit value from memory into a register:

```text
lw t0, (a0)      # t0 = *a0     (dereference a0)
lw t0, 0(a0)     # t0 = arr[0]  (base + 0)
lw t1, 4(a0)     # t1 = arr[1]  (base + 4)
lw t2, 8(a0)     # t2 = arr[2]  (base + 8)
```

Syntax: `lw dst, offset(base)`, where the effective address is `base + offset`.

This is **register-indirect addressing**: `a0` holds an address, not the data itself.
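One way to picture `lw dst, offset(base)` is as a function that takes an address and a byte offset and returns the 32-bit value found there. The sketch below is a hypothetical C model (the name `load_word` is not part of any library), not how the hardware is implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of `lw dst, offset(base)`:
   returns the 32-bit value stored at byte address base + offset. */
int32_t load_word(const void *base, ptrdiff_t offset) {
    int32_t value;
    /* effective address = base + offset, measured in bytes */
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With `int32_t arr[3] = {1, 2, 3}`, the call `load_word(arr, 4)` plays the role of `lw t1, 4(a0)` and yields `arr[1]`.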
---

## Load vs Store Operand Order

```text
# Load: DESTINATION register comes first
lw t0, (a0)      # t0 = memory[a0]   (t0 receives)

# Store: SOURCE register comes first
sw t0, (a0)      # memory[a0] = t0   (t0 provides)
```

Watch the operand order! Swapping `lw` and `sw` operands is a frequent bug.
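To make the direction of a store concrete, here is a hypothetical C model of `sw src, offset(base)`. The name `store_word` and its parameter order (source first, mirroring the assembly operand order) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of `sw src, offset(base)`:
   writes the 32-bit value src to byte address base + offset.
   The source comes first, as in the assembly syntax. */
void store_word(int32_t src, void *base, ptrdiff_t offset) {
    memcpy((unsigned char *)base + offset, &src, sizeof src);
}
```

Calling `store_word(7, arr, 4)` changes `arr[1]` and leaves the value 7 untouched in its "register"; a load moves data the other way.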
---

## Memory Instruction Sizes

| Instruction | Bytes | C type | Operation |
|-------------|-------|--------|-----------|
| `lw rd, off(rs)` | 4 | `int` | load 32-bit word |
| `sw rs2, off(rs1)` | 4 | `int` | store 32-bit word |
| `ld rd, off(rs)` | 8 | `long`, pointer | load 64-bit doubleword |
| `sd rs2, off(rs1)` | 8 | `long`, pointer | store 64-bit doubleword |

For Lab 3 (`int` arrays), use **`lw`** and **`sw`**.

---

## Argument and Return Convention

- **Arguments**: `a0`, `a1`, `a2`, ..., `a7` (in order)
- **Return value**: `a0`

```text
# int add3(int a, int b, int c)
# a0 = a, a1 = b, a2 = c
add3:
    add a0, a0, a1     # a0 = a + b
    add a0, a0, a2     # a0 = (a+b) + c
    ret                # return value is in a0
```

For Lab 3: use only `a` registers (args/return) and `t` registers (temporaries). No stack management needed for leaf functions.
---

## Scalar vs Array Arguments

**Three scalars**: each value arrives in its own register, no memory access:

```text
# int add3(int a, int b, int c): a0, a1, a2
add3:
    add a0, a0, a1
    add a0, a0, a2
    ret
```

**One array**: only the base address arrives; you must `lw` each element:

```text
# int sum3(int *arr): a0 = base address
sum3:
    lw  t0, 0(a0)      # t0 = arr[0]
    lw  t1, 4(a0)      # t1 = arr[1]
    lw  t2, 8(a0)      # t2 = arr[2]
    add t0, t0, t1
    add a0, t0, t2     # return value in a0
    ret
```

---

## Indexed Array Access (Variable Index)

When the index `i` is in a register (e.g., inside a loop):

```text
# t2 = arr[i]  where a0 = base, t0 = i

# Option A: multiply
li   t1, 4
mul  t1, t0, t1      # t1 = i * 4
add  t1, a0, t1      # t1 = &arr[i]
lw   t2, (t1)

# Option B: shift (preferred: one instruction instead of two)
slli t1, t0, 2       # t1 = i * 4 (shift left by 2)
add  t1, a0, t1      # t1 = &arr[i]
lw   t2, (t1)
```

| `slli` shift | Multiplies by | Element size |
|--------------|---------------|--------------|
| `slli x, y, 1` | 2 | 2-byte `short` |
| `slli x, y, 2` | 4 | 4-byte `int` |
| `slli x, y, 3` | 8 | 8-byte `long`/pointer |

---

## Jumps and Labels

A **label** names a code location. A **jump** rewrites the program counter.

```text
main:
    add  t0, zero, zero   # t0 = 0
    j    next             # unconditional jump: skips the two adds
    add  ...              # (skipped)
    add  ...              # (skipped)
next:
    addi t0, zero, 1      # execution resumes here
```

- `j label`: unconditional, always transfers control
- `b__ rs1, rs2, label`: conditional, transfers control only if the comparison holds

---

## Branch Instructions

| Instruction | Branch taken when | Meaning |
|-------------|-------------------|---------|
| `beq rs1, rs2, label` | `rs1 == rs2` | equal |
| `bne rs1, rs2, label` | `rs1 != rs2` | not equal |
| `blt rs1, rs2, label` | `rs1 < rs2` | less than (signed) |
| `bge rs1, rs2, label` | `rs1 >= rs2` | greater or equal (signed) |
| `ble rs1, rs2, label` | `rs1 <= rs2` | less than or equal (signed) |
| `bgt rs1, rs2, label` | `rs1 > rs2` | greater than (signed) |

`blt`, `bge`, `ble`, `bgt` use **signed** comparison. Use `bltu`/`bgeu` for unsigned. For Lab 3 signed `int` data, use the signed variants.
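The difference between signed and unsigned comparison shows up as soon as a negative number is involved. The sketch below models the two conditions in C; the function names are hypothetical, and 32-bit operands are an assumption chosen to match `int` data (real registers may be wider).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical models of branch conditions: true means "branch taken". */

/* blt: compares the operands as signed integers. */
bool blt_taken(int32_t rs1, int32_t rs2) {
    return rs1 < rs2;
}

/* bltu: compares the same bit patterns as unsigned integers. */
bool bltu_taken(uint32_t rs1, uint32_t rs2) {
    return rs1 < rs2;
}
```

With `rs1 = -1` and `rs2 = 1`, the signed comparison is true, but the unsigned one is false because the bit pattern of -1 reads as the largest unsigned value.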
---

## Translating If/Else: The Key Trick

**Branch on the OPPOSITE condition** to skip the then-block.

```c
if (val > 0) {
    r = 1;
} else {
    r = 0;
}
```

```text
# a0 = val, t1 = r
    ble  a0, zero, else   # if val <= 0, go to else (opposite!)
    addi t1, zero, 1      # then-block: r = 1
    j    done             # skip the else-block
else:
    addi t1, zero, 0      # else-block: r = 0
done:
```

Structure: the branch-on-false skips the then-block → the then-block ends with `j done` → the else-block falls through to `done:`.

---

## If/Else Control Flow
flowchart TD A["Evaluate condition (val > 0?)"] -->|"false: ble taken"| E["else block: r = 0"] A -->|"true: fall through"| T["then block: r = 1"] T --> J["j done"] J --> D["done:"] E --> D
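The same control flow can be written in C with `goto`, which maps almost line for line onto the branch-and-jump version. This is a sketch for study only; the function name `positive_flag` and the label `else_block` are hypothetical.

```c
/* Hypothetical function: returns 1 if val > 0, else 0,
   written with goto to mirror the assembly structure. */
int positive_flag(int val) {
    int r;
    if (val <= 0) goto else_block;   /* ble a0, zero, else (opposite!) */
    r = 1;                           /* then-block */
    goto done;                       /* j done: skip the else-block */
else_block:
    r = 0;                           /* else-block */
done:
    return r;
}
```

Deleting the `goto done` line reproduces the classic bug: the then-block falls through into the else-block and `r` always ends up 0.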
---

## Condition Inversion Table

| C condition | Branch to `else` when false |
|-------------|------------------------------|
| `val > 0` | `ble a0, zero, else` |
| `val >= 0` | `blt a0, zero, else` |
| `a == b` | `bne a0, a1, else` |
| `a != b` | `beq a0, a1, else` |
| `a < b` | `bge a0, a1, else` |
| `a <= b` | `bgt a0, a1, else` |

---

## If Without Else

```c
if (x == 0) {
    y = 1;
}
```

```text
# a0 = x, t0 = y
    bne a0, zero, done   # if x != 0, skip body (opposite!)
    li  t0, 1            # body: y = 1
done:
```

No `j` needed: when the body finishes, it simply falls through to `done:`.

---

## Translating Loops

A loop is an `if` whose body **jumps back up** to re-test the condition.

```c
int loopsum(int n) {
    int i = 0, sum = 0;
    for (i = 0; i < n; i++) {
        sum = sum + i;
    }
    return sum;
}
```

```text
loopsum:
    li   t0, 0           # i = 0
    li   t1, 0           # sum = 0
loop:
    bge  t0, a0, done    # if i >= n, exit (opposite: loop while i < n)
    add  t1, t1, t0      # sum = sum + i
    addi t0, t0, 1       # i++
    j    loop            # back-edge: re-test condition
done:
    mv   a0, t1          # return sum
    ret
```

---

## Anatomy of the Loop

| `for` clause | C | Assembly |
|--------------|---|----------|
| Initialization | `i = 0` | `li t0, 0` (before `loop:`) |
| Condition | `i < n` | `bge t0, a0, done` (inverted) |
| Body | `sum += i` | `add t1, t1, t0` |
| Update | `i++` | `addi t0, t0, 1` |
| Repeat | (implicit) | `j loop` |

The guard branches on the **opposite** condition: the loop **continues** while `i < n`, so the assembly **exits** when `i >= n`.
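Rewriting the loop in C with `goto` makes the guard and the back-edge visible as separate statements. The sketch below is one possible rendering; the name `loopsum_goto` is hypothetical and the comments show the assembly line each statement corresponds to.

```c
/* Hypothetical goto version of loopsum: sums 0 + 1 + ... + (n - 1). */
int loopsum_goto(int n) {
    int i = 0;                 /* li t0, 0 */
    int sum = 0;               /* li t1, 0 */
loop:
    if (i >= n) goto done;     /* bge t0, a0, done (inverted guard) */
    sum = sum + i;             /* add t1, t1, t0 */
    i = i + 1;                 /* addi t0, t0, 1 */
    goto loop;                 /* j loop (back-edge) */
done:
    return sum;                /* mv a0, t1 ; ret */
}
```

Because the guard is tested before the body, a call with `n <= 0` skips the body entirely and returns 0, just as the `for` loop does.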
---

## Loop Control Flow
flowchart TD Init["i = 0, sum = 0"] --> Guard{"i < n?"} Guard -->|"yes: fall through"| Body["sum = sum + i"] Body --> Update["i = i + 1"] Update --> Guard Guard -->|"no: bge taken"| Done["return sum"]
---

## Pseudo-Instructions

Convenience mnemonics the assembler expands to real instructions:

| Pseudo-instruction | Expands to | Meaning |
|--------------------|------------|---------|
| `li t0, 0` | `addi t0, zero, 0` | load immediate |
| `mv a0, t1` | `addi a0, t1, 0` | copy register |
| `j loop` | `jal zero, loop` | unconditional jump |
| `ret` | `jalr zero, ra, 0` | return from function |

These make code far more readable while still mapping to the base ISA.

---

## Tracing loopsum(4) by Hand

| Iteration | `t0` (i) at guard | `i >= 4`? | `t1` (sum) after body |
|-----------|-------------------|-----------|------------------------|
| 1 | 0 | no | 0 |
| 2 | 1 | no | 1 |
| 3 | 2 | no | 3 |
| 4 | 3 | no | 6 |
| exit | 4 | yes → done | 6 (returned in `a0`) |

Verify with `gdb`: set a breakpoint at `loopsum`, then use `stepi` and `info registers t0 t1 a0` to watch each step.
---

## Putting It Together: findmax

Combines array argument + loop + inner conditional:

```text
findmax:                  # a0 = arr, a1 = n
    lw   t1, (a0)         # max = arr[0]
    li   t0, 1            # i = 1
loop:
    bge  t0, a1, done     # exit if i >= n
    slli t2, t0, 2        # t2 = i * 4
    add  t2, a0, t2       # t2 = &arr[i]
    lw   t2, (t2)         # t2 = arr[i]
    ble  t2, t1, skip     # if arr[i] <= max, skip
    mv   t1, t2           # max = arr[i]
skip:
    addi t0, t0, 1        # i++
    j    loop
done:
    mv   a0, t1           # return max
    ret
```

---

## findmax Data Flow
flowchart TD Start["max = arr[0], i = 1"] --> Guard{"i >= n?"} Guard -->|"yes"| Ret["return max"] Guard -->|"no"| Load["t2 = arr[i]"] Load --> Cmp{"arr[i] > max?"} Cmp -->|"yes"| Update["max = arr[i]"] Cmp -->|"no"| Inc["i++"] Update --> Inc Inc --> Guard
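For comparison, here is a C function with the same behavior as the assembly `findmax`. The source does not show the original C, so this is a reconstruction under assumptions: the name `findmax_c` is hypothetical, and like the assembly it assumes `n >= 1` because it reads `arr[0]` unconditionally.

```c
/* Hypothetical C counterpart of findmax.
   Assumes n >= 1: arr[0] is read before the loop guard is tested. */
int findmax_c(const int *arr, int n) {
    int max = arr[0];                 /* lw t1, (a0) */
    for (int i = 1; i < n; i++) {     /* guard: bge t0, a1, done */
        if (arr[i] > max) {           /* inverted: ble t2, t1, skip */
            max = arr[i];             /* mv t1, t2 */
        }
    }
    return max;                       /* mv a0, t1 ; ret */
}
```

A quick check such as `int a[] = {3, -1, 9, 4}; findmax_c(a, 4)` should give 9, which is a handy reference value when testing the assembly version.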
---

## Key Concepts Reference

| Concept | Key Point |
|---------|-----------|
| `lw t0, off(a0)` | Load `int` at `a0 + off` into `t0` |
| Array arg | Passed as base address in `a0` |
| `slli t1, t0, 2` | Multiply index by 4 (`int` stride) |
| `a0`–`a7` | Argument registers |
| `a0` | Return value register |
| `j label` | Unconditional jump |
| `bge t0, a0, done` | Branch if `t0 >= a0` (signed) |
| Condition inversion | Branch on opposite to skip block |
| Back-edge `j loop` | Makes an `if` into a loop |

---

## Common Bugs to Avoid

- **Wrong load/store order**: `lw` has destination first; `sw` has source first
- **Missing `j done`**: then-block falls through into else-block
- **Not inverting the condition**: branching to `body` when the condition is true has no effect, because the body is the next instruction anyway and runs whether or not the branch is taken
- **Wrong stride**: using offset 1 instead of 4 for `int` arrays
- **Forgetting `mv a0, result` before `ret`**: return value must be in `a0`

---

## Summary

1. **Three instruction categories**: data processing, control, memory; computation follows load → compute → store
2. **Arrays are contiguous bytes**: `arr[i]` is at `base + i*4` for `int`; use `lw t0, off(a0)`
3. **Arguments in `a0`–`a7`, return in `a0`**: array passed as base address
4. **Indexed access**: `slli t1, t0, 2` then `add` then `lw`; the shift replaces a multiply
5. **If/else**: branch on the *opposite* condition to skip the then-block; end then-block with `j done`
6. **Loops**: initialize → guard branch (inverted) → body → update → `j` back to guard
7. **Verify by hand or `gdb`**: trace registers through each iteration to confirm correctness