# Lab 02: Getting Started with RISC-V Assembly

## CS 315 Computer Architecture

---

## Learning Objectives

- Describe the **machine code execution model**: processor, registers, PC, IW, memory
- Explain how C source becomes assembly, then object code, then a linked executable
- Read and write basic RISC-V instructions: `add`, `addi`, `mul`
- Apply the RISC-V **calling convention**: args in `a0`–`a7`, return value in `a0`
- Debug assembly with **GDB**: breakpoints, `stepi`, `info registers`
- Keep a Git repo clean and run the autograder

---

## The Machine: Two Big Boxes

flowchart LR subgraph PROC["PROCESSOR"] direction TB REGS["Registers\nx0 .. x31"] PC["PC\n(Program Counter)"] IW["IW\n(Instruction Word)"] end subgraph MEM["MEMORY"] direction TB STACK["STACK"] DATA["DATA"] CODE["CODE\nmain()\nadd t0, t1, t2"] end PC -->|address of next instruction| CODE CODE -->|fetch 32-bit instruction| IW IW -->|decode + execute| REGS

---

## Processor vs. Memory

**PROCESSOR**

- **Registers** `x0`–`x31`: fast storage; arithmetic only happens here
- **PC**: address of the _next_ instruction to fetch
- **IW**: the current 32-bit instruction being decoded

**MEMORY** (one address space)

- **STACK**: function frames, local vars
- **DATA**: globals
- **CODE**: the machine instructions

A processor can **only compute on register values**. Data in memory must be loaded into a register first.

---

## The Fetch–Decode–Execute Cycle

flowchart TD A["FETCH\nIW = memory[PC]\n(read 32-bit instruction)"] --> B["DECODE\nfigure out the operation\nand operands"] B --> C["EXECUTE\nupdate register(s)\nand/or memory"] C --> D["UPDATE PC\nPC = PC + 4"] D --> A

---

## Why PC = PC + 4?

Each base RISC-V instruction is exactly **4 bytes (32 bits)**. Memory is **byte-addressable**, so the next instruction in sequence is always 4 bytes ahead.

```text
Address    Instruction
---------  -------------------
0x1000     add  t0, t1, t2    <- PC starts here
0x1004     addi a0, a0, 1
0x1008     mul  a0, a0, a1
0x100C     ret
```

`PC = PC + 4` is the **default**. Branches and jumps override it; that is how loops, if/else, and function calls work.
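
The loop and the default PC update can be sketched as a toy interpreter in C. This is only an illustration under stated assumptions: the names `cpu_t` and `run` are hypothetical, the code array starts at address 0 rather than `0x1000`, and only `add`, `addi`, and `mul` are modeled, so any other instruction word (such as `ret`) simply stops the toy machine. The field positions and opcodes follow the standard RISC-V R-type and I-type encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy processor state; the names cpu_t and run are hypothetical. */
typedef struct {
    uint64_t regs[32]; /* x0..x31 (RV64: 64 bits each) */
    uint64_t pc;       /* byte address of the next instruction */
    uint32_t iw;       /* instruction word being decoded */
} cpu_t;

/* Runs until the PC leaves the code array or an unmodeled instruction is fetched. */
void run(cpu_t *cpu, const uint32_t *code, size_t n_instr) {
    while (cpu->pc / 4 < n_instr) {
        /* FETCH: code[] is word-indexed, the PC is a byte address */
        cpu->iw = code[cpu->pc / 4];

        /* DECODE: pull the fields out of the 32-bit word */
        uint32_t opcode = cpu->iw & 0x7f;
        uint32_t rd     = (cpu->iw >> 7) & 0x1f;
        uint32_t funct3 = (cpu->iw >> 12) & 0x7;
        uint32_t rs1    = (cpu->iw >> 15) & 0x1f;
        uint32_t rs2    = (cpu->iw >> 20) & 0x1f;
        uint32_t funct7 = cpu->iw >> 25;
        int64_t  imm    = (int64_t)(cpu->iw >> 20); /* 12-bit I-type immediate */
        if (imm & 0x800) imm -= 0x1000;             /* sign-extend it */

        /* EXECUTE: compute on register values only */
        uint64_t result;
        if (opcode == 0x33 && funct3 == 0 && funct7 == 0x00) {
            result = cpu->regs[rs1] + cpu->regs[rs2];  /* add */
        } else if (opcode == 0x33 && funct3 == 0 && funct7 == 0x01) {
            result = cpu->regs[rs1] * cpu->regs[rs2];  /* mul */
        } else if (opcode == 0x13 && funct3 == 0) {
            result = cpu->regs[rs1] + (uint64_t)imm;   /* addi */
        } else {
            return; /* anything else stops this toy machine */
        }
        if (rd != 0) cpu->regs[rd] = result; /* writes to x0 are ignored */

        /* UPDATE PC: the default is the next sequential instruction */
        cpu->pc += 4;
    }
}
```

As a usage example, the four instructions in the listing encode to `0x007302b3`, `0x00150513`, `0x02b50533`, and `0x00008067`. Running them with `t1 = 2`, `t2 = 3`, `a0 = 4`, `a1 = 5` leaves `t0 = 5` and `a0 = 25`, and the PC stops at byte offset 12, where the unmodeled `ret` sits. A real processor would handle `ret` by loading the PC from `ra` instead of adding 4.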

---

## From C to Machine Code

flowchart LR C["add4_c.c\n(C source)"] -->|"gcc"| OC["add4_c.o"] S["add4_s.s\n(assembly)"] -->|"as"| OS["add4_s.o"] M["add4.c\n(main)"] -->|gcc| OM["add4.o"] OC -->|link| EXE["add4\n(executable)"] OS -->|link| EXE OM -->|link| EXE EXE -->|load + run| RUN["machine code in memory"]

---

## Compiling vs. Assembling vs. Linking

| Step | Input | Tool | Output |
|------|-------|------|--------|
| **Compile** | `.c` C source | `gcc -c` | `.o` object code |
| **Assemble** | `.s` assembly | `as` | `.o` object code |
| **Link** | one or more `.o` | `gcc` / `ld` | executable |

```bash
as -o add4_s.o add4_s.s                  # assemble
gcc -o add4 add4.c add4_c.c add4_s.o     # compile the C files and link
# or let gcc do it all:
gcc -o add4 add4.c add4_c.c add4_s.s
```

Assembly is a **human-readable form of machine code**.

---

## Lab02 Starter: Three-File Pattern

```text
add4.c
add4_c.c
add4_s.s
```

| File | Role |
|------|------|
| `add4.c` | `main`: parses args, calls both versions, prints results |
| `add4_c.c` | **C reference implementation** (given, do not modify) |
| `add4_s.s` | **Your RISC-V assembly** implementation |

```text
$ ./add4 1 2 3 4
C: 10
Asm: 10
```

**Goal**: make the two results match.

---

## How main Calls Both Versions

```c
// add4.c (sketch)
#include <stdio.h>   // printf
#include <stdlib.h>  // atoi

int add4_c(int a, int b, int c, int d);  // prototype for C version
int add4_s(int a, int b, int c, int d);  // prototype for asm version

int main(int argc, char *argv[]) {
    int a = atoi(argv[1]), b = atoi(argv[2]);
    int c = atoi(argv[3]), d = atoi(argv[4]);
    printf("C: %d\n", add4_c(a, b, c, d));
    printf("Asm: %d\n", add4_s(a, b, c, d));
    return 0;
}
```

The **linker** matches the function name used in the call to the label in your `.s` file, so the names must match exactly.

---

## Anatomy of a RISC-V Instruction

```text
add  t0, t1, t2      # t0 = t1 + t2
     ^   ^   ^
     |   |   +--- src2 (second source register)
     |   +------- src1 (first source register)
     +----------- dst  (destination register)
```

Read it as an assignment: **dst = src1 OP src2**

| Part | Example | Meaning |
|------|---------|---------|
| Mnemonic | `add`, `mul`, `addi` | The operation |
| Destination | `t0` | Register that gets the result |
| Source 1 | `t1` | First input |
| Source 2 | `t2` or immediate | Second input or constant |

---

## Immediate Instructions

Instead of a second register, some instructions take a **constant** (immediate). The mnemonic ends in `i` by convention:

```text
addi t0, t1, 10      # t0 = t1 + 10
li   t0, 9           # pseudo: t0 = 9
addi t0, zero, 9     # what li expands to: 0 + 9
```

`zero` (`x0`) is **hardwired to 0**; writes are ignored.

---

## RISC-V Registers (ABI Names)

| Hardware | ABI Name | Role |
|----------|----------|------|
| x0 | `zero` | Always 0 (writes ignored) |
| x1 | `ra` | Return address |
| x2 | `sp` | Stack pointer |
| x5–x7, x28–x31 | `t0`–`t6` | Temporaries (caller-saved) |
| x10–x17 | `a0`–`a7` | Arguments / return value |
| x8–x9, x18–x27 | `s0`–`s11` | Saved registers (callee-saved) |

RV64: each register is **64 bits (8 bytes)** wide.

---

## Calling Convention for Lab02

```text
a0 = first argument  (also the return value)
a1 = second argument
a2 = third argument
a3 = fourth argument
```

**Lab02 functions are all leaf functions**: they call no other function. You only need `a0`–`a3` and temporaries. No stack allocation, no saving `ra`, no `s` registers.
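
The register-passing contract above can be mimicked in plain C to make it concrete. This is a minimal sketch, not how the hardware or the compiler works internally: `arg_regs_t`, `place_args`, and `add4_model` are hypothetical names, and the struct merely stands in for the argument registers `a0`–`a7`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the argument registers a0..a7. */
typedef struct {
    int64_t a[8];
} arg_regs_t;

/* Caller side: the first four arguments go into a0..a3, in order. */
void place_args(arg_regs_t *r, int64_t a, int64_t b, int64_t c, int64_t d) {
    r->a[0] = a;
    r->a[1] = b;
    r->a[2] = c;
    r->a[3] = d;
}

/* Callee side: a leaf function touches only the argument registers
   and leaves its result in a0. No stack, no ra, no s registers. */
void add4_model(arg_regs_t *r) {
    r->a[0] = r->a[0] + r->a[1];
    r->a[0] = r->a[0] + r->a[2];
    r->a[0] = r->a[0] + r->a[3];
}
```

After `place_args(&r, 1, 2, 3, 4)` and `add4_model(&r)`, the caller reads the return value from `r.a[0]`, which holds 10. The same slot carries the first argument on the way in and the result on the way out, which is exactly the dual role of `a0`.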

---

## Your First Assembly Function

Three required parts: `.global`, a matching label, and `ret`:

```text
.global add2_s          # make symbol visible to the linker

add2_s:                 # label = function entry point
    add a0, a0, a1      # a0 = a0 + a1 (result in a0)
    ret                 # return to caller (jumps to ra)
```

1. Caller puts `a` in `a0`, `b` in `a1`, then jumps to `add2_s`
2. `add a0, a0, a1` computes the sum
3. `ret` sets PC = ra, returning to `main`

---

## Worked Example: mul2

Given to you as a complete example:

```text
.global mul2_s

mul2_s:
    mul a0, a0, a1      # a0 = a0 * a1
    ret
```

Study this pattern. `add4` and `mul4` are extensions of it: same structure, more operands.

---

## Implementing add4

```c
// C reference (given)
int add4_c(int a, int b, int c, int d) {
    return a + b + c + d;
}
```

Chain additions, reusing `a0` as the running total:

```text
.global add4_s

add4_s:
    add a0, a0, a1      # a0 = a + b
    add a0, a0, a2      # a0 = (a+b) + c
    add a0, a0, a3      # a0 = (a+b+c) + d
    ret
```

```text
$ ./add4 1 2 3 4
C: 10
Asm: 10
```

---

## Implementing mul4

```c
// C reference (given)
int mul4_c(int a, int b, int c, int d) {
    return a * b * c * d;
}
```

Same pattern with `mul`:

```text
.global mul4_s

mul4_s:
    mul a0, a0, a1      # a0 = a * b
    mul a0, a0, a2      # a0 = (a*b) * c
    mul a0, a0, a3      # a0 = (a*b*c) * d
    ret
```

```text
$ ./mul4 1 2 3 4
C: 24
Asm: 24
```

---

## Why .global Matters

Without `.global add4_s` the linker cannot see the label:

```text
$ make
/usr/bin/ld: add4.o: undefined reference to `add4_s`
```

`.global` **exports** the symbol so the linker can match the call in `add4.o` to the code in `add4_s.o`.

```text
.global add4_s          # without this: linker error

add4_s:
    ...
    ret
```

---

## GDB Setup (BeagleV)

One-time setup at `~/.config/gdb/gdbinit`:

```bash
mkdir -p ~/.config/gdb
cat > ~/.config/gdb/gdbinit
set auto-load safe-path /
set debuginfod enabled off
tui new-layout asm {-horizontal src 1 regs 1} 2 status 0 cmd 1
tui enable
layout asm
```

Press **Ctrl-D** to save.

Three panes: disassembly | **registers** | command line. The registers view lets you watch values change instruction-by-instruction.

---

## Essential GDB Commands

| Command | Short | What it does |
|---------|-------|--------------|
| `break add4_s` | `b add4_s` | Breakpoint at function label |
| `run 1 2 3 4` | `r 1 2 3 4` | Run with arguments |
| `stepi` | `si` | Step one instruction |
| `info registers a0` | `i r a0` | Show register value |
| `print $a0` | `p $a0` | Print register |
| `continue` | `c` | Run to next breakpoint |
| `finish` | `fin` | Step out of current function |
| `quit` | `q` | Exit GDB |

---

## GDB Debugging Session: add4

```text
(gdb) break add4_s
(gdb) run 1 2 3 4
Breakpoint 1, add4_s () at add4_s.s:3
(gdb) info registers a0 a1 a2 a3
a0    0x1    1
a1    0x2    2
a2    0x3    3
a3    0x4    4
(gdb) stepi          # add a0, a0, a1
(gdb) print $a0      # => 3
(gdb) stepi          # add a0, a0, a2
(gdb) print $a0      # => 6
(gdb) stepi          # add a0, a0, a3
(gdb) print $a0      # => 10
```

Watch `a0` accumulate: it is the fastest way to verify arithmetic.

---

## Development Workflow

flowchart TD A["Edit add4_s.s / mul4_s.s"] --> B["make"] B --> C{"C and Asm\nresults match?"} C -->|No| D["gdb ./add4\nstep + inspect a0"] D --> A C -->|Yes| E["make clean"] E --> F["git status\n(only sources?)"] F --> G["git commit -a"] G --> H["git push"] H --> I["Run autograder\nvs tests repo"]

---

## Git Hygiene

Always `make clean` **before** committing:

```bash
make clean              # remove executables and .o files
git status              # confirm only .c / .s / Makefile listed
git rm --cached add4    # untrack a build product if already committed
git commit -a -m "Lab02: add4 and mul4 in RISC-V assembly"
git push
```

Never commit build artifacts (executables, `.o` files). The autograder builds from source.

---

## The Autograder

- Pull test cases: `git pull` in the **tests** repo `https://github.com/USF-CS315-F25/tests`
- Autograder: `https://github.com/phpeterson-usf/autograder`
- Your lab score = what the autograder reports

**Common issue**: autograder can't find test directory → use tab completion to verify the path exists → re-clone the tests repo if needed → check your GitHub email registration

---

## Key Concepts Reference

| Concept | Definition |
|---------|------------|
| **Register** | Fast storage in the processor; arithmetic only here |
| **PC** | Address of the next instruction to fetch |
| **IW** | Current 32-bit instruction being decoded |
| **Fetch–decode–execute** | The processor's core loop |
| **Three-operand form** | `op dst, src1, src2` means `dst = src1 OP src2` |
| **Calling convention** | Args in `a0`–`a7`, return value in `a0` |
| **Leaf function** | Calls no other function; no stack needed |
| **`.global`** | Exports a label to the linker |

---

## Lab02 Requirements Checklist

1. Write RISC-V assembly for `add4` and `mul4` (C versions are given)
2. Build with `make` (Makefile is provided)
3. Run `make clean` before committing
4. No `.o` files or executables in the repo
5. Push to GitHub and verify with the autograder

Reference material: Lab02 spec at `/assignments/lab02/`, RISC-V ABI guide at `/guides/riscv/`, GDB guide at `/guides/gdb-usage/`

---

## Summary

1. **Machine code execution model**: processor (registers, PC, IW) + memory (stack, data, code); arithmetic only on register values
2. **Fetch–decode–execute loop**: read IW from `memory[PC]`, execute, then `PC = PC + 4`
3. **All code becomes machine code**: C compiled, assembly assembled, both linked into one executable
4. **Three-file starter pattern**: `X.c` main + `X_c.c` reference + `X_s.s` your assembly
5. **Three-operand instructions**: `op dst, src1, src2`; args arrive in `a0`–`a3`; return in `a0`
6. **`add4` and `mul4`**: chain `add`/`mul` into `a0`, then `ret`; need `.global`, label, `ret`
7. **GDB**: set up `gdbinit` for TUI, use `break`/`run`/`stepi`/`print $a0`
8. **Clean commits**: `make clean` before every commit; verify with the autograder