Lab 02: Getting Started with RISC-V Assembly¶
Overview¶
This was a hands-on lab session that introduced RISC-V assembly language programming through the machine code execution model: how a processor fetches, decodes, and executes instructions from memory using registers and the program counter. We walked through the structure of the given Lab02 starter code (a C main, a C implementation, and an assembly implementation linked together by a Makefile), set up GDB for instruction-level debugging, and worked through implementing simple arithmetic functions (add4, mul4) in RISC-V assembly. The session emphasized the workflow of writing assembly, building with make, debugging with GDB, keeping the Git repo clean, and running the autograder against the tests repository.
Learning Objectives¶
- Describe the machine code execution model: processor, registers, PC, instruction word, and memory
- Explain how C source becomes assembly, then object code, then a linked executable
- Identify the role of the program counter (PC) and why `PC = PC + 4` for normal instruction flow
- Read and write basic three-operand RISC-V instructions (`add`, `addi`, `mul`)
- Use the RISC-V calling convention: arguments in `a0`–`a7`, return value in `a0`
- Set up and configure GDB for assembly-level debugging on the BeagleV machines
- Step through assembly with GDB: breakpoints, stepping, examining registers, and `return`
- Keep a Git repo clean with `make clean`, `git status`, `git rm`, and run the autograder
Prerequisites¶
- Completed Lab01 (Hello World, Makefiles, Git/GitHub, the autograder)
- Completed Project01 (RISC-V dev environment, C basics, number systems)
- Access to a RISC-V dev environment (BeagleV machine or `qemu-system-riscv64` VM)
- Basic C programming: functions, arguments, return values
- Familiarity with the shell and `git` from the command line
1. The Machine Code Execution Model¶
Everything we write — C, assembly, anything — ultimately runs as machine code: binary instructions stored in memory. To reason about assembly, you need a mental model of how a processor executes those instructions. This is the central diagram from today's lab.
flowchart LR
subgraph PROC["PROCESSOR"]
direction TB
REGS["Registers<br/>x0 .. x31"]
PC["PC<br/>(Program Counter)"]
IW["IW<br/>(Instruction Word)"]
end
subgraph MEM["MEMORY"]
direction TB
STACK["STACK"]
DATA["DATA"]
CODE["CODE<br/>main()<br/>add t0, t1, t2"]
end
PC -->|address of next instruction| CODE
CODE -->|fetch 32-bit instruction| IW
IW -->|decode + execute| REGS
style PROC fill:#e8f0ff,stroke:#333,stroke-width:2px
style MEM fill:#fff0e8,stroke:#333,stroke-width:2px
style CODE fill:#f9f,stroke:#333,stroke-width:1px
The two big boxes: Processor and Memory¶
The handwritten notes drew the machine as two large boxes connected by arrows.
PROCESSOR contains:
- Registers: a small set of fast storage locations named `x0` through `x31`. The processor can only do arithmetic on values that are in registers.
- PC (Program Counter): holds the memory address of the next instruction to execute.
- IW (Instruction Word): holds the current 32-bit instruction that was just fetched from memory and is being decoded/executed.
MEMORY is one large address space, conventionally divided into regions:
- STACK: grows downward; holds function call frames, saved registers, and local variables.
- DATA: globals and other statically allocated data.
- CODE: the machine code instructions, including `main()` and the functions it calls. In the notes, the code region contained `main()` and an instruction `add t0, t1, t2`, with an arrow showing that each instruction is 32 bits wide.
Why the separation matters¶
A processor can only operate directly on register values. Memory holds both code and data. If a program needs to operate on data that lives in memory, it must first load that data into a register, compute, and then often store the result back to memory. This load/compute/store cycle is the heart of assembly programming.
2. The Fetch–Decode–Execute Cycle¶
The processor runs a simple loop forever:
flowchart TD
A["FETCH<br/>IW = memory[PC]<br/>(read 32-bit instruction)"] --> B["DECODE<br/>figure out the operation<br/>and operands"]
B --> C["EXECUTE<br/>update register(s)<br/>and/or memory"]
C --> D["UPDATE PC<br/>PC = PC + 4"]
D --> A
style A fill:#e8f0ff,stroke:#333
style D fill:#ffe8e8,stroke:#333
- Fetch: Read the 32-bit instruction at the address in `PC` into the instruction word `IW`.
- Decode: Determine what operation the bits encode and which registers/values are involved.
- Execute: Perform the operation. This usually updates one or more registers, but can also read or write memory.
- Update PC: Advance to the next instruction.
Why PC = PC + 4¶
The notes underlined PC = PC + 4 in red for emphasis. Each base RISC-V instruction is exactly 4 bytes (32 bits), and memory is byte-addressable. To move from one instruction to the next, the processor adds 4 to the program counter.
Address Instruction
--------- -------------------
0x1000 add t0, t1, t2 <- PC = 0x1000
0x1004 addi a0, a0, 1 <- after PC = PC + 4
0x1008 mul a0, a0, a1 <- after another PC = PC + 4
0x100C ret
PC = PC + 4 is the default. Control instructions (branches and jumps) override this default by setting the PC to a different address — that is how loops, if/else, function calls, and ret work. We will explore those later; in Lab02 the functions are straight-line code.
3. From C to Assembly to Machine Code¶
A key idea today: all code eventually runs as machine code. There are two paths to get there, and they meet at the object file (.o).
flowchart LR
C["add4_c.c<br/>(C source)"] -->|"gcc (compile + assemble)"| OC["add4_c.o"]
S["add4_s.s<br/>(assembly source)"] -->|"as (assemble)"| OS["add4_s.o"]
M["add4.c<br/>(main)"] -->|gcc| OM["add4.o"]
OC -->|link| EXE["add4<br/>(executable)"]
OS -->|link| EXE
OM -->|link| EXE
EXE -->|load + run| RUN["machine code in memory"]
style EXE fill:#f9f,stroke:#333,stroke-width:2px
Compiling vs. assembling¶
| Step | Input | Tool | Output |
|---|---|---|---|
| Compile | `.c` C source | `gcc` | `.o` object code (compiles + assembles) |
| Assemble | `.s` assembly source | `as` | `.o` object code |
| Link | one or more `.o` files | `gcc` (or `ld`) | executable |
A C compiler actually generates assembly internally, then assembles it to object code. So gcc can do everything in one step. The point is that assembly is a human-readable form of machine code: an assembler (as) translates assembly mnemonics like add into the exact binary the processor decodes.
Building by hand (what the Makefile automates)¶
# Assemble the assembly implementation to object code
as -o add4_s.o add4_s.s
# Compile the C files and link everything together
gcc -o add4 add4.c add4_c.c add4_s.o
# Or let gcc both assemble and link in one command:
gcc -o add4 add4.c add4_c.c add4_s.s
In Lab02 the provided Makefile runs these steps for you, so you normally just type make.
4. The Lab02 Starter Structure¶
You are given a Makefile, C files, and assembly files (.s). This same three-file pattern is used for most of our assembly programming. The point of the pattern is to compare a C implementation against your assembly implementation of the same function, so you can check correctness immediately.
For the example program add2 you are given:
| File | Role |
|---|---|
| `add2.c` | The main program: parses arguments, calls both versions, prints results |
| `add2_c.c` | The C implementation of the function (the reference answer) |
| `add2_s.s` | The assembly implementation (where you write RISC-V) |
When you build and run it, the add2 program calls the C version and the assembly version with the same arguments and prints both results. The goal is to make the two results match. You are also given a full implementation of mul2 to study as a worked example.
What main does (conceptually)¶
// add2.c (sketch of the given main)
#include <stdio.h>
#include <stdlib.h>
int add2_c(int a, int b); // prototype for the C version
int add2_s(int a, int b); // prototype for the assembly version
int main(int argc, char *argv[]) {
int a = atoi(argv[1]);
int b = atoi(argv[2]);
printf("C: %d\n", add2_c(a, b));
printf("Asm: %d\n", add2_s(a, b));
return 0;
}
The two prototypes are critical. They tell the C compiler the names and types of the functions implemented elsewhere. The linker then matches these names to the symbols in add2_c.o and add2_s.o. This is why naming conventions matter: the label in your assembly file must exactly match the name used in main, and it must be made visible with .global.
5. Anatomy of a RISC-V Instruction¶
Most RISC-V instructions have three operands: one destination (target) and two sources.
add t0, t1, t2 # t0 = t1 + t2
^ ^ ^
| | +--- src2 (second source register)
| +------- src1 (first source register)
+----------- dst (destination register)
This is exactly the instruction drawn in the code region of the execution-model diagram: add t0, t1, t2. Read it as an assignment: destination = source1 OP source2.
Instruction parts¶
| Part | Example | Meaning |
|---|---|---|
| Mnemonic | `add`, `mul`, `addi` | The operation name |
| Destination | `t0` | Register that receives the result |
| Source 1 | `t1` | First input register |
| Source 2 | `t2` (or an immediate) | Second input register or constant |
| Comment | `# t0 = t1 + t2` | Everything after `#` is ignored |
Immediates¶
Some instructions take a constant (an immediate) instead of a second register. By convention the mnemonic ends in i:
addi t0, t1, 10 # t0 = t1 + 10 (immediate constant)
li t0, 9 # t0 = 9 (pseudo-instruction)
addi t0, zero, 9 # t0 = 0 + 9 = 9 (what li expands to)
For a small constant such as 9, li t0, 9 and addi t0, zero, 9 do the same thing: the assembler expands the li pseudo-instruction into that addi. The zero register (x0) is hardwired to 0, which makes it handy for loading constants and copying values.
6. Registers and the Calling Convention¶
A RISC-V processor has 32 registers (x0–x31) plus the PC. In RV64 each register is 64 bits (8 bytes) wide. We almost always use the ABI names, which describe each register's conventional role.
| Hardware | ABI Name | Role |
|---|---|---|
| `x0` | `zero` | Always 0 (writes ignored) |
| `x1` | `ra` | Return address |
| `x2` | `sp` | Stack pointer |
| `x5`–`x7`, `x28`–`x31` | `t0`–`t6` | Temporaries (caller-saved) |
| `x10`–`x17` | `a0`–`a7` | Function arguments / return value (caller-saved) |
| `x8`–`x9`, `x18`–`x27` | `s0`–`s11` | Saved registers (callee-saved) |
The rules you need for Lab02¶
- Arguments are passed in `a0`, `a1`, `a2`, `a3`, ... (up to `a7`).
- The return value is placed in `a0`.
- Temporaries `t0`–`t6` are free scratch space inside a leaf function; there is no need to save them.
- A leaf function (one that does not call any other function) does not need to touch the stack.
Lab02 functions are all leaf functions
add2, mul2, add4, and mul4 only do arithmetic and return. They make no further function calls, so you only need argument registers (a0–a3) and temporaries (t0–t6). You do not need to allocate stack space, save ra, or save any s registers. That work comes later in Project02.
Mapping the function signature to registers¶
For int add4(int a, int b, int c, int d):
a0 = a (first argument)
a1 = b (second argument)
a2 = c (third argument)
a3 = d (fourth argument)
a0 = return value (written before ret)
7. Writing Your First Assembly Function¶
A basic assembly function needs three things: a .global directive (so the linker can find it), a label matching the function name, and a ret at the end.
.global add2_s # make the symbol visible to the linker
add2_s: # label = function entry point
add a0, a0, a1 # a0 = a0 + a1 (return value in a0)
ret # return to caller (jumps to address in ra)
Walk through it against the execution model:
- The caller (`main`) puts `a` in `a0` and `b` in `a1`, then calls `add2_s` (which sets `ra` to the return address and sets `PC` to the `add2_s` label).
- `add a0, a0, a1` computes the sum and leaves it in `a0`.
- `ret` sets `PC` back to the address in `ra`, returning control to `main`. The result is already in `a0`, where `main` expects the return value.
Worked example: the given mul2¶
You are given mul2 as a complete example to study. Its structure is identical to add2_s, with mul in place of add. Study this pattern: add4 and mul4 are extensions of it.
8. Lab02 Requirements: add4 and mul4¶
The lab asks you to write RISC-V implementations of two arithmetic functions. You are given the C implementations; you write the assembly.
add4 — add four 32-bit integers¶
// add4_c.c (given reference implementation)
int add4_c(int a, int b, int c, int d) {
return a + b + c + d;
}
Your assembly accumulates the four arguments. Because add is a three-operand instruction, chain the additions, reusing a0 as the running total:
.global add4_s
add4_s:
add a0, a0, a1 # a0 = a + b
add a0, a0, a2 # a0 = (a + b) + c
add a0, a0, a3 # a0 = (a + b + c) + d
ret # return a0
mul4 — multiply four 32-bit integers¶
// mul4_c.c (given reference implementation)
int mul4_c(int a, int b, int c, int d) {
return a * b * c * d;
}
.global mul4_s
mul4_s:
mul a0, a0, a1 # a0 = a * b
mul a0, a0, a2 # a0 = (a * b) * c
mul a0, a0, a3 # a0 = (a * b * c) * d
ret # return a0
Requirement checklist¶
- Write RISC-V assembly implementations for the given arithmetic problems (you are given the C versions).
- The executable must be compiled with a Makefile (run `make`).
- Before committing, run `make clean` so you do not add build products (executables, `.o` files).
- The labs are graded with the autograder against the tests repository.
9. GDB Setup for Assembly Debugging¶
On the BeagleV machines you can install a GDB init file to get a nice text UI for assembly debugging. The file goes at ~/.config/gdb/gdbinit. Create any missing directories with mkdir -p.
$ cd
$ mkdir -p ~/.config/gdb
$ cd .config/gdb
$ cat > gdbinit
set auto-load safe-path /
set debuginfod enabled off
tui new-layout asm {-horizontal src 1 regs 1} 2 status 0 cmd 1
tui enable
layout asm
^d
^d means press CTRL-d to close the file you are typing into cat.
Common setup mistakes
- No `~/.config` directory yet: `mkdir -p ~/.config/gdb` creates the whole path at once.
- Typo `-2E`/`-2e`: the instructor noted to avoid a `-2E` flag/typo in configuration; copy the `gdbinit` exactly as shown.
- Wrong file name: it must be `gdbinit` (no dot) inside `~/.config/gdb`, not `.gdbinit` in that folder.
This layout gives you three panes: the source/disassembly view, the registers view, and the command line. The registers view is the most valuable part for assembly work — you watch values change as you step.
10. Debugging Assembly with GDB¶
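GDB lets you stop the fetch–decode–execute cycle at any instruction and inspect the machine state. The core workflow is to set a breakpoint, run the program, step one instruction at a time, and inspect the registers after each step.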
Essential commands¶
| Command | Short | What it does |
|---|---|---|
| `break add4_s` | `b add4_s` | Set a breakpoint at the function label |
| `run 1 2 3 4` | `r 1 2 3 4` | Run with command-line arguments |
| `stepi` | `si` | Step one instruction (into calls) |
| `nexti` | `ni` | Step one instruction (over calls) |
| `info registers` | `i r` | Show all registers |
| `info registers a0` | `i r a0` | Show one register |
| `print $a0` | `p $a0` | Print a register's value |
| `continue` | `c` | Run until the next breakpoint |
| `finish` | `fin` | Run until the current function returns (step out) |
| `return` | (none) | Pop the current frame immediately, skipping the rest of the function |
| `quit` | `q` | Exit GDB |
A typical debugging session for add4¶
(gdb) break add4_s
Breakpoint 1 at 0x...: file add4_s.s
(gdb) run 1 2 3 4
Breakpoint 1, add4_s () at add4_s.s:3
(gdb) info registers a0 a1 a2 a3
a0 0x1 1
a1 0x2 2
a2 0x3 3
a3 0x4 4
(gdb) stepi # execute: add a0, a0, a1
(gdb) print $a0 # a0 should now be 3
$1 = 3
(gdb) stepi # execute: add a0, a0, a2
(gdb) print $a0 # a0 should now be 6
$2 = 6
(gdb) stepi # execute: add a0, a0, a3
(gdb) print $a0 # a0 should now be 10
$3 = 10
(gdb) continue
Use breakpoints strategically (set them at the function you care about), step instruction-by-instruction with stepi, and use finish to run to the end of the function once you have seen what you need. (return also leaves the function, but it abandons the remaining instructions instead of executing them.) Watching a0 accumulate is the fastest way to confirm your arithmetic is correct and your registers are right.
11. Git Hygiene and Running the Autograder¶
A recurring theme today: keep your repository clean. Do not commit build artifacts (executables, .o files). Always make clean first.
make clean # delete executables and .o build products
git status # verify only source files are listed
git rm --cached add4 # if a build product was already tracked, untrack it
git commit -a -m "Lab02: add4 and mul4 in assembly"
git push
Recommended workflow¶
flowchart TD
A["Edit add4_s.s / mul4_s.s"] --> B["make"]
B --> C{"C and Asm<br/>results match?"}
C -->|No| D["gdb ./add4<br/>step + inspect a0"]
D --> A
C -->|Yes| E["make clean"]
E --> F["git status<br/>(only sources?)"]
F --> G["git commit -a"]
G --> H["git push"]
H --> I["Run autograder<br/>vs tests repo"]
style C fill:#fff0c0,stroke:#333
style I fill:#d0f0d0,stroke:#333
The autograder and tests repo¶
- Get the test cases by running `git pull` in the tests repository (https://github.com/USF-CS315-F25/tests).
- The autograder lives at https://github.com/phpeterson-usf/autograder.
- Your lab receives the score the autograder reports.
Tests repo access issues
If the autograder cannot find the test directory: verify the test directory path (use tab completion to confirm it exists), re-add/re-clone the tests repo if needed, and make sure your GitHub email registration matches so the repository is visible to you. If you are still stuck, bring it to office hours.
Key Concepts¶
| Concept | Definition | Example |
|---|---|---|
| Machine code | Binary instructions the processor executes directly | A 32-bit encoding of add t0, t1, t2 |
| Register | Fast storage inside the processor; the only place arithmetic happens | a0, t0, x5 |
| PC (program counter) | Address of the next instruction to fetch | PC = PC + 4 for sequential code |
| IW (instruction word) | The current 32-bit instruction being decoded/executed | Fetched from the CODE region |
| Fetch–decode–execute | The processor's core loop | Read IW from memory[PC], run it, advance PC |
| Assembling | Translating `.s` assembly to `.o` object code | `as -o add4_s.o add4_s.s` |
| Compiling | Translating `.c` C to `.o` object code | `gcc -c add4_c.c` |
| Linking | Combining `.o` files into one executable | `gcc -o add4 add4.c add4_c.c add4_s.o` |
| Three-operand form | `op dst, src1, src2` meaning `dst = src1 OP src2` | `add a0, a0, a1` |
| Calling convention | Args in `a0`–`a7`, return value in `a0` | `a0=a, a1=b, ...` return in `a0` |
| Leaf function | A function that calls no other function | `add4_s` (no stack needed) |
| `.global` | Directive exposing a label to the linker | `.global add4_s` |
Practice Problems¶
Problem 1: PC arithmetic¶
A function's first instruction is at address 0x2000. Each instruction is one base RISC-V instruction. What is the address of the fourth instruction, assuming no branches or jumps?
Solution: Each base RISC-V instruction is **4 bytes**, and sequential execution does `PC = PC + 4`. The fourth instruction is at **`0x200C`** (which is `0x2000 + 12`, i.e. `0x2000 + 3*4`).
Problem 2: Translate C to assembly¶
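Write a RISC-V assembly implementation add3_s of a C function that takes three int arguments and returns their sum (a + b + c).
Solution: Arguments arrive in `a0`, `a1`, `a2`; the return value goes in `a0`. After the `.global add3_s` directive and the `add3_s:` label, the body is `add a0, a0, a1`, then `add a0, a0, a2`, then `ret`. Each `add` is three-operand (`dst, src1, src2`), and reusing `a0` as the running total avoids needing any temporaries. Because it is a leaf function, no stack work is required.
Problem 3: Fix the bug¶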
A student wrote a mul4_s whose body is mul a0, a0, a1, then mul a0, a0, a2, then ret, but ./mul4 1 2 3 4 prints Asm: 6 instead of 24. Find and fix the bug.
Solution: The function only multiplies **three** of the four arguments (`a0 * a1 * a2 = 1*2*3 = 6`). It never multiplies in the fourth argument `a3`. Add the missing instruction `mul a0, a0, a3` before `ret`. Now `1*2*3*4 = 24`. This is a classic "forgot the last operand" bug, and GDB makes it obvious: step through and you will see `a0` stop changing at `6` instead of reaching `24`.
Problem 4: Read the registers¶
You stop in GDB at the start of add4_s and info registers shows a0=5, a1=10, a2=2, a3=1. After three stepi commands through the three add instructions in the worked example, what is in a0?
Solution: `a0` is **18**. The function returns `5 + 10 + 2 + 1 = 18`, which matches the C reference. Confirm with `print $a0` after the third `stepi`.
Problem 5: Why does the .global matter?¶
You write add4_s but forget the .global add4_s line. The build fails at the link step with an "undefined reference to add4_s" error. Explain why, and how .global fixes it.
Solution: The `main` program (`add4.c`) declares a prototype `int add4_s(int, int, int, int);` and calls it. When the compiler produces `add4.o`, the call site references the symbol `add4_s` but does not contain its code. The **linker** is responsible for resolving that reference by finding `add4_s` in another object file. Without `.global add4_s`, the label `add4_s` in your assembly file is **local** to that object file, so the linker cannot see it and reports "undefined reference." The `.global add4_s` directive exports the symbol so the linker can match the call in `add4.o` to the code in `add4_s.o`.
Problem 6: Build the right way¶
You finished add4 and mul4. List the exact commands, in order, to build, verify cleanliness, and commit without adding build products.
Solution:
make # build add4 and mul4
./add4 1 2 3 4 # quick check: C and Asm both print 10
./mul4 1 2 3 4 # quick check: C and Asm both print 24
make clean # remove executables and .o files
git status # confirm only .c / .s / Makefile are listed
git add -A # (or add specific source files)
git commit -m "Lab02: implement add4 and mul4 in RISC-V assembly"
git push
Further Reading¶
- Lab02 spec: /assignments/lab02/
- RISC-V references and ABI: /guides/riscv/
- GDB usage and tutorials: /guides/gdb-usage/
- Course key concepts: /guides/key-concepts/
- Original handwritten notes: /notes/CS315-01 2025-09-03 Lab Lab02.pdf
- Tests repository: https://github.com/USF-CS315-F25/tests
- Autograder: https://github.com/phpeterson-usf/autograder
- Beej's Quick Guide to GDB: https://beej.us/guide/bggdb/
Summary¶
- The machine code execution model has two parts: a processor (registers `x0`–`x31`, `PC`, instruction word `IW`) and memory (stack, data, and code). The processor can only compute on register values.
- The processor runs a fetch–decode–execute loop: read the 32-bit instruction at `memory[PC]` into `IW`, decode and execute it (updating registers and/or memory), then advance with `PC = PC + 4` for normal sequential code.
- All code becomes machine code. C is compiled (`gcc`) and assembly is assembled (`as`) into object files (`.o`), which are then linked into one executable. Assembly is just a human-readable form of machine code.
- The Lab02 starter uses a three-file pattern (`X.c` main, `X_c.c` C version, `X_s.s` assembly version) so you can directly compare your assembly result against the C reference; they must print the same value.
- Most RISC-V instructions are three-operand: `op dst, src1, src2` means `dst = src1 OP src2`. Arguments arrive in `a0`–`a7`, and the return value goes in `a0`.
- `add4` and `mul4` are leaf functions: chain `add`/`mul` into `a0` and `ret`. No stack work is needed, but each function still needs a `.global` directive, a matching label, and `ret`.
- GDB is the tool for assembly debugging. Set up `~/.config/gdb/gdbinit` for the TUI layout, then use `break`, `run`, `stepi`, and `info registers` / `print $a0` to watch values change one instruction at a time.
- Keep the repo clean and verify with the autograder. Always `make clean` before committing so build products are never tracked, then `git pull` the tests repo and run the autograder for your score.