` | Standard system directories |
| `"header.h"` | Your project directory first |
---
## Building with Multiple Files
```bash
gcc -o numinfo numinfo.c numhelpers.c
gcc -o numconv numconv.c numhelpers.c
```
`numhelpers.c` appears in both commands — each executable needs the definitions linked in.
---
## The C Build Pipeline
flowchart LR
A["source.c"] -->|"preprocess"| B["expanded C"]
B -->|"compile"| C["assembly"]
C -->|"assemble"| D["source.o\nobject code"]
D --> L["Linker"]
S["startup code\nlibc"] --> L
L --> E["executable"]
1. **Preprocess** — expand `#include` / `#define`
2. **Compile** — C to assembly
3. **Assemble** — assembly to machine code (`.o`)
4. **Link** — combine objects + startup code
---
## File Naming Conventions
| Suffix | Meaning | Example |
|--------|---------|---------|
| `.c` | C source | `add2_c.c` |
| `.s` | Assembly source | `add2_s.s` |
| `.o` | Object code (not yet linked) | `add2_s.o` |
| (none) | Linked executable | `add2` |
`main` is not the first code to run: the C runtime startup code sets up the stack and `argc`/`argv`, then calls `main`.
---
## What Is Assembly Language?
Assembly language is a human-readable form of machine code.
flowchart LR
A["Assembly\nadd t0, t1, t2"] -->|"assembler (as)"| B["Machine code\n0x007302B3"]
B -->|"loaded + executed"| C["Processor"]
- (Almost) one-to-one: one assembly instruction = one machine instruction
- An **assembler** (`as`) translates assembly, just as a **compiler** (`gcc`) translates C
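
To make the one-to-one mapping concrete, here is a minimal sketch of what the assembler does for `add`: it packs the register numbers into fixed bit fields of a 32-bit word. The field layout is the standard RISC-V R-type format; the function names `encode_rtype` and `encode_add` are illustrative, not part of any real tool.

```c
#include <stdint.h>

/* Pack an R-type instruction word from its fields.
   Layout: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0] */
uint32_t encode_rtype(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                      uint32_t funct3, uint32_t rd, uint32_t opcode)
{
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* add rd, rs1, rs2 uses opcode 0x33 with funct3 = 0 and funct7 = 0.
   Arguments are register numbers (x0..x31), hypothetical helper. */
uint32_t encode_add(uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return encode_rtype(0x00, rs2, rs1, 0x0, rd, 0x33);
}
```

With `t0` = `x5`, `t1` = `x6`, and `t2` = `x7`, `encode_add(5, 6, 7)` yields `0x007302B3`. Swapping the register numbers changes the word, which is why operand order matters.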
---
## The Processor's Three Core Elements
| Element | Description |
|---------|-------------|
| **Registers** | Small, fast storage inside the CPU; arithmetic happens here |
| **Memory** | Holds code and data; large but slower than registers |
| **Instructions** | Operations the CPU knows how to perform |
**Load/store architecture**: to compute on data in memory, you must load it into a register first, operate, then store it back.
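
The load, operate, store pattern can be sketched in C by spelling out the steps a compiler would otherwise hide. This is an illustrative model only: the function name is hypothetical, and the local variable `reg` stands in for a CPU register.

```c
#include <stdint.h>

/* Hypothetical helper: increment a counter that lives in memory,
   written as the three steps a load/store machine performs. */
void increment_in_memory(int64_t *counter)
{
    int64_t reg = *counter;  /* load:    memory -> register        */
    reg = reg + 1;           /* operate: arithmetic on a register  */
    *counter = reg;          /* store:   register -> memory        */
}
```

In C this is just `(*counter)++`, but on a load/store architecture no single arithmetic instruction can touch memory directly, so the three steps are always present in the generated code.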
---
## RISC-V
- **Open-standard** ISA — no licensing fees, anyone can implement it
- The name combines "RISC" (Reduced Instruction Set Computer) with the Roman numeral V; pronounced "risk-five"
- Simple, clean, modular design — ideal for learning
- This course uses **RV64IM**: the 64-bit base integer ISA (`I`) plus the multiply/divide extension (`M`)
---
## Anatomy of an Instruction
```text
operands
/--------\
add t0, t1, t2
| | | |
| | | +-- source register (src2)
| | +------ source register (src1)
| +---------- destination register (dst)
+---------------- mnemonic (instruction name)
```
`add t0, t1, t2` means **`t0 = t1 + t2`**
Destination comes **first** — mirrors an assignment statement.
---
## Comments and Syntax
```text
add t0, t1, t2 # t0 = t1 + t2
```
- Comments begin with `#` and run to end of line
- Operands separated by commas
- Order matters: getting sources and destination wrong is a silent bug
The assembler encodes exactly what you wrote — it won't warn you if you swapped operands.
---
## The RISC-V Register File
RV64: 32 registers, each 64 bits (8 bytes) wide.
```text
x0 x1 x2 ... x31
+----+----+----+ +----+
| 64 | 64 | 64 | ... | 64 | bits each
+----+----+----+ +----+
```
Every register has two names:
- **Numeric**: `x0`, `x1`, ..., `x31`
- **ABI name**: `zero`, `ra`, `sp`, `a0`, `t0`, ...
ABI names describe the register's **conventional role** — prefer them in code.
---
## Register Table (Partial)
| Register | ABI Name | Conventional Use |
|----------|----------|------------------|
| `x0` | `zero` | Hardwired constant 0 |
| `x1` | `ra` | Return address |
| `x2` | `sp` | Stack pointer |
| `x5`–`x7` | `t0`–`t2` | Temporaries |
| `x10`–`x11` | `a0`–`a1` | Arguments / return values |
| `x12`–`x17` | `a2`–`a7` | Arguments |
| `x18`–`x27` | `s2`–`s11` | Saved registers |
| `x28`–`x31` | `t3`–`t6` | Temporaries |
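
The numeric-to-ABI mapping is a fixed table, so it can be sketched as a simple lookup. The entries not shown in the partial table above (`gp`, `tp`, `s0`, `s1`) are filled in from the standard RISC-V calling convention; the function names `abi_name` and `reg_number` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* ABI names indexed by register number x0..x31. */
static const char *const abi_names[32] = {
    "zero", "ra", "sp",  "gp",  "tp", "t0", "t1", "t2",
    "s0",   "s1", "a0",  "a1",  "a2", "a3", "a4", "a5",
    "a6",   "a7", "s2",  "s3",  "s4", "s5", "s6", "s7",
    "s8",   "s9", "s10", "s11", "t3", "t4", "t5", "t6"
};

/* Return the ABI name for register number x, or NULL if out of range. */
const char *abi_name(unsigned x)
{
    return x < 32 ? abi_names[x] : NULL;
}

/* Return the register number for an ABI name, or -1 if unknown. */
int reg_number(const char *name)
{
    for (int i = 0; i < 32; i++)
        if (strcmp(abi_names[i], name) == 0)
            return i;
    return -1;
}
```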
---
## Registers You'll Use Most
| Register(s) | Role |
|-------------|------|
| `a0`–`a7` | Function arguments; `a0` also holds return value |
| `t0`–`t6` | Temporary / scratch registers |
| `zero` | Always reads as 0; writes discarded |
---
## The `zero` Register
`x0` is hardwired to zero — reads always return 0, writes are silently discarded.
```text
addi a0, zero, 5 # a0 = 0 + 5 -> load constant 5
add a0, a1, zero # a0 = a1 + 0 -> copy a1 into a0
sub a0, zero, a1 # a0 = 0 - a1 -> negate a1
```
A hardwired zero eliminates the need for separate "load constant," "move," and "negate" instructions — they all fall out of arithmetic with zero.
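
A small C model can show how the three idioms fall out of a register file whose `x0` ignores writes. This is a sketch for intuition, not how hardware is built; the `RegFile` type and the `exec_*` names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the 32 integer registers. */
typedef struct { uint64_t x[32]; } RegFile;

/* x0 always reads as 0, whatever is stored in the array. */
uint64_t reg_read(const RegFile *rf, unsigned r)
{
    return (r & 31) == 0 ? 0 : rf->x[r & 31];
}

/* Writes to x0 are silently discarded. */
void reg_write(RegFile *rf, unsigned r, uint64_t v)
{
    if ((r & 31) != 0)
        rf->x[r & 31] = v;
}

void exec_addi(RegFile *rf, unsigned rd, unsigned rs1, int64_t imm)
{
    reg_write(rf, rd, reg_read(rf, rs1) + (uint64_t)imm);
}

void exec_add(RegFile *rf, unsigned rd, unsigned rs1, unsigned rs2)
{
    reg_write(rf, rd, reg_read(rf, rs1) + reg_read(rf, rs2));
}

void exec_sub(RegFile *rf, unsigned rd, unsigned rs1, unsigned rs2)
{
    reg_write(rf, rd, reg_read(rf, rs1) - reg_read(rf, rs2));
}
```

Here `exec_addi(&rf, 10, 0, 5)` models `addi a0, zero, 5`, `exec_add(&rf, 10, 11, 0)` models the copy, and `exec_sub(&rf, 10, 0, 11)` models the negate, all using the same three generic operations.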
---
## `add` vs `addi`
**`add` — register + register**
```text
add t0, t1, t2 # t0 = t1 + t2
```
**`addi` — register + immediate (constant)**
```text
addi t0, t1, 9 # t0 = t1 + 9
```
The `i` suffix means **immediate**: a constant encoded inside the instruction word.
The `addi` immediate is a **12-bit signed** value: range −2048 to 2047.
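
The range follows from two's complement: 12 bits give 2^12 = 4096 values, split into −2048 through 2047. A minimal sketch of the range check and of sign-extending a 12-bit field, with hypothetical function names:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v can be encoded as a 12-bit signed immediate. */
bool fits_imm12(int64_t v)
{
    return v >= -2048 && v <= 2047;
}

/* Interpret the low 12 bits of field as a signed value:
   if bit 11 is set, the value is negative. */
int64_t sign_extend12(uint32_t field)
{
    field &= 0xFFF;
    return (field & 0x800) ? (int64_t)field - 4096 : (int64_t)field;
}
```

For example, the field `0xFFF` decodes to −1 and `0x800` decodes to −2048, the most negative immediate.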
---
## Loading a Constant
```text
addi t0, zero, 9 # t0 = 0 + 9 = 9
li t0, 9 # same thing — pseudo-instruction
```
`li` is **assembler sugar**: for a constant that fits in 12 bits it expands to `addi rd, zero, imm`; larger constants take more than one instruction.
| Form | Meaning |
|------|---------|
| `add rd, rs1, rs2` | `rd = rs1 + rs2` |
| `addi rd, rs1, imm` | `rd = rs1 + imm` |
| `addi rd, zero, imm` | `rd = imm` (load constant) |
| `li rd, imm` | `rd = imm` (pseudo-instruction) |
There is no `subi` instruction. Subtract a constant by adding a negative immediate: `addi t0, t0, -1`
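
For constants that do not fit in 12 bits, assemblers commonly build the value from an upper 20-bit part (loaded with `lui`, an instruction not covered in these slides) plus a signed 12-bit `addi` part. The sketch below is an assumption about that common scheme for 32-bit constants, not a description of any particular assembler; the `LiSplit` type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type: value == (hi20 << 12) + lo12 (mod 2^32). */
typedef struct {
    uint32_t hi20;  /* upper part, 0..0xFFFFF        */
    int32_t  lo12;  /* signed lower part, -2048..2047 */
} LiSplit;

LiSplit split_li(int32_t value)
{
    LiSplit s;
    uint32_t u = (uint32_t)value;

    /* Low 12 bits, interpreted as signed because addi sign-extends. */
    int32_t lo = (int32_t)(u & 0xFFF);
    if (lo >= 0x800)
        lo -= 0x1000;
    s.lo12 = lo;

    /* Upper part compensates when lo12 is negative. */
    s.hi20 = (u - (uint32_t)lo) >> 12;
    return s;
}

/* Rebuild the constant from the two parts (wraps modulo 2^32). */
int32_t recombine(LiSplit s)
{
    return (int32_t)((s.hi20 << 12) + (uint32_t)s.lo12);
}
```

When `hi20` is 0 the whole constant fits in the `addi` alone, which is the small-constant case shown in the table above.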
---
## 32-bit Values in 64-bit Registers
RV64 registers are 64 bits wide, but C `int` is 32 bits — that's fine.
```text
63 32 31 0
+------------------+------------------+
| upper 32 bits | lower 32 bits | <- 32-bit int lives here
+------------------+------------------+
```
- 32-bit values occupy the **low 32 bits**
- `add` / `addi` work on the full 64-bit width
- Use word instructions (`addw`, `lw`) when 32-bit overflow/sign behavior matters
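
The difference between the full-width and word forms shows up at the 32-bit boundary. Below is a rough C model of the two behaviors, assuming the standard RV64 semantics where `addw` adds the low 32 bits and sign-extends the 32-bit result; the `model_*` names are hypothetical.

```c
#include <stdint.h>

/* Model of add: full 64-bit addition, wrapping modulo 2^64. */
uint64_t model_add(uint64_t a, uint64_t b)
{
    return a + b;
}

/* Model of addw: add the low 32 bits, then sign-extend
   the 32-bit result to fill the 64-bit register. */
uint64_t model_addw(uint64_t a, uint64_t b)
{
    uint32_t low = (uint32_t)a + (uint32_t)b;
    return (uint64_t)(int64_t)(int32_t)low;
}
```

For small values the two agree (`3 + 4` is 7 either way). For `0x7FFFFFFF + 1`, `model_add` gives `0x0000000080000000` while `model_addw` gives `0xFFFFFFFF80000000`, which is what a wrapped 32-bit `int` looks like when sign-extended.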
---
## A First Assembly Function
C reference:
```c
int add2_c(int a, int b) { return a + b; }
```
RISC-V assembly:
```asm
.global add2_s
# a0 - int a (first argument)
# a1 - int b (second argument)
# Return value goes in a0
add2_s:
add a0, a0, a1 # a0 = a + b
ret
```
- `.global add2_s` — exports the label so C code can call it
- `add2_s:` — label marking the function entry point
- `ret` — return to caller
---
## The Calling Convention
sequenceDiagram
participant C as C main()
participant ASM as add2_s
C->>ASM: a0=3, a1=4, call add2_s
Note over ASM: add a0, a0, a1 -> a0=7
ASM-->>C: return (a0=7)
Note over C: prints "Asm: 7"
- **Arguments** arrive in `a0`, `a1`, `a2`, ... in order
- **Return value** must be in `a0` when `ret` executes
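
The exchange in the diagram can be modeled in C by treating the argument registers as an array that caller and callee share. This is only a sketch for intuition: the `ArgRegs` type and both function names are hypothetical, and a real call also involves `ra` and the stack pointer.

```c
#include <stdint.h>

/* Hypothetical model of the argument registers a0..a7. */
typedef struct { int64_t a[8]; } ArgRegs;

/* Callee: models "add a0, a0, a1" followed by ret.
   It reads its arguments from a0/a1 and leaves the result in a0. */
void model_add2_s(ArgRegs *r)
{
    r->a[0] = r->a[0] + r->a[1];
}

/* Caller: places arguments, "calls", then reads a0. */
int call_add2(int a, int b)
{
    ArgRegs r = {{0}};
    r.a[0] = a;            /* first argument  -> a0 */
    r.a[1] = b;            /* second argument -> a1 */
    model_add2_s(&r);      /* call add2_s           */
    return (int)r.a[0];    /* return value is in a0 */
}
```

Note that the callee overwrites `a0`: the first argument and the return value share the same register, so the original value of `a` is gone after the call.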
---
## Mixing C and Assembly
```c
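// add2.c: main driver
#include <stdio.h>
#include <stdlib.h>

int add2_c(int a, int b);   // C version (defined in add2_c.c)
int add2_s(int a, int b);   // assembly version (defined in add2_s.s)
int main(int argc, char *argv[]) {
    if (argc != 3) {        // need exactly two numbers on the command line
        fprintf(stderr, "usage: %s a b\n", argv[0]);
        return 1;
    }
    int a = atoi(argv[1]), b = atoi(argv[2]);
    printf("C: %d\n", add2_c(a, b));
    printf("Asm: %d\n", add2_s(a, b));
    return 0;
}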
```
Build:
```bash
gcc -o add2 add2.c add2_c.c add2_s.s
```
```text
$ ./add2 3 4
C: 7
Asm: 7
```
---
## add3_s: Chaining Two Adds
```c
int add3_c(int a, int b, int c) { return a + b + c; }
```
```asm
.global add3_s
# a0=a, a1=b, a2=c
add3_s:
add a0, a0, a1 # a0 = a + b
add a0, a0, a2 # a0 = (a + b) + c
ret
```
`add` takes only two source registers — chain additions to accumulate the result.
---
## Common Idioms Summary
| Instruction | Meaning |
|-------------|---------|
| `add rd, rs1, rs2` | `rd = rs1 + rs2` |
| `addi rd, rs1, imm` | `rd = rs1 + imm` |
| `li rd, imm` | `rd = imm` (pseudo) |
| `add rd, rs1, zero` | `rd = rs1` (copy; the `mv` pseudo-instruction expands to the equivalent `addi rd, rs1, 0`) |
| `sub rd, zero, rs1` | `rd = -rs1` (negate) |
| `addi rd, rd, -1` | `rd = rd - 1` (no subi!) |
---
## Key Terms
| Term | Definition |
|------|------------|
| **Mnemonic** | Instruction name (`add`, `addi`, `ret`) |
| **Operand** | Register or constant an instruction acts on |
| **Immediate** | Constant encoded inside an instruction |
| **ABI name** | Conventional register name by role (`a0`, `t0`) |
| **Label** | Named code location (`add2_s:`) |
| **`.global`** | Directive exporting a label to the linker |
| **Pseudo-instruction** | Assembler shorthand expanding to real instructions |
---
## Summary
1. **Shared helpers** go in `numhelpers.c` + `numhelpers.h`; use include guards and prototypes
2. **Build pipeline**: preprocess → compile → assemble → link; `main` is not the first code to run
3. **Assembly** is human-readable machine code; the assembler (`as`) translates it (almost) one-to-one
4. **Instruction format**: `mnemonic destination, src1, src2` — destination always first
5. **32 registers, 64 bits each**; use ABI names (`a0`, `t0`, `zero`)
6. **`zero` register** enables load-constant, copy, and negate without extra instructions
7. **`addi`** adds a 12-bit signed immediate; `li rd, imm` is sugar for `addi rd, zero, imm` when `imm` fits in that range
8. **Simple function**: `.global` label, arguments in `a0`/`a1`/..., return value in `a0`, end with `ret`