Lab: Processor JAL and JALR¶
Overview¶
This is a hands-on lab session that takes our single-cycle RISC-V processor from "can execute a straight line of arithmetic" to "can call a function and return from it." Up to this point the processor only ran I-type and R-type instructions, so the program counter simply advanced to PC + 4 every cycle and never did anything else. Function calls require control flow: the ability to jump somewhere else and, crucially, to remember where to come back to. We implement the two RISC-V instructions that make this possible: jal (jump and link, the call primitive) and jalr (jump and link register, the ret primitive). The work is partly conceptual (what these instructions mean at the register/PC level) and partly schematic engineering in Digital (the new MUXes and datapath wires, plus the extra control lines coming out of the instruction decoder). By the end you will be able to run a program that does jal first_s ... ret inside your own processor circuit, which is exactly what Lab 10 Part 2 asks for.
Learning Objectives¶
- Describe precisely what `jal` and `jalr` do to the destination register and the PC
- Explain why a return address (`PC + 4`) must be saved in a link register before a jump
- Decode the J-type and I-type instruction formats well enough to compute jump targets by hand
- Recognize `call`, `j`, and `ret` as pseudo-instructions built on `jal`/`jalr`
- Add the `PCsel`, `WDsel`, and `ALUSrcA` MUXes and the widened `ALUSrcB` MUX to the datapath
- Extend the instruction-decoder spreadsheet with new control outputs and keep existing rows correct
- Trace a `jal`/`ret` program cycle-by-cycle through the processor and predict every register and PC value
- Diagnose the most common JAL/JALR wiring and decoder bugs
Prerequisites¶
- A working Lab 09 / Part 1 processor that executes `addi`, `add`, and `li` (the `first_s` program)
- The Digital schematic editor installed and able to open and simulate your `.dig` circuits
- RISC-V instruction formats from Lab 03 (R-type, I-type, J-type bit layouts)
- The ABI calling conventions: `ra` is the return-address register, `a0`-`a7` hold arguments and results
- Comfort with MUXes, ROMs, priority encoders, and comparators from Lab 09
- The decoder-spreadsheet methodology from Processor Guide Part 2
1. Why We Need Jumps at All¶
Every program our processor has run so far is a straight line. The PC starts at the first instruction and increments by 4 each cycle:
flowchart LR
A["PC = 0<br/>li a0, 1"] --> B["PC = 4<br/>li a1, 2"]
B --> C["PC = 8<br/>add a2, a0, a1"]
C --> D["PC = 12<br/>unimp (halt)"]
This is fine for arithmetic, but real programs are organized into functions, and calling a function means two things the straight-line model cannot do:
- Transfer control to an instruction that is not `PC + 4` (jump to the function body).
- Remember where we came from so that when the function finishes, control can return to the instruction after the call.
The second requirement is the subtle one. A jump that forgets where it came from is a one-way trip. The trick that makes functions work is the link: at the moment of the jump, we save the address of the next sequential instruction (PC + 4) into a register. That saved address is the return address. RISC-V's convention is to keep it in register ra (which is x1).
flowchart TD
M0["main: PC=0 li a0,1"] --> M1["PC=4 li a1,2"]
M1 --> M2["PC=8 jal first_s"]
M2 -->|"ra = PC+4 = 12<br/>PC = address of first_s"| F0["first_s: add a0,a0,a1"]
F0 --> F1["ret"]
F1 -->|"PC = ra = 12"| M3["PC=12 unimp (halt)"]
The two labeled transitions are exactly what jal and ret (which is jalr) implement. jal does the call-and-remember, ret does the return.
2. JAL: Jump And Link (the call primitive)¶
The handwritten lab notes summarize jal in three lines, which we will expand. jal is a J-type instruction. It does two things in a single cycle:
jal rd, imm # rd is the link/destination register, imm is a PC-relative offset
Step 1 (link): rd <- PC + 4 # save return address into rd
Step 2 (jump): PC <- PC + imm # jump PC-relative by the J-type immediate
Two important points:
- The destination register `rd` receives `PC + 4`, the address of the instruction after the `jal`. This is the "link."
- The PC is updated to `PC + imm`. The immediate is PC-relative: the jump target is computed by adding a signed offset to the current PC, not by loading an absolute address.
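The two steps can be sketched in a few lines of Python. This is a software model for checking hand calculations, not the lab's circuit; the function name `jal_step` and the list-based register file are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # keep values in 32 bits, like the hardware registers

def jal_step(pc, imm, rd, regs):
    """Model one jal: link PC + 4 into rd, then return the new PC."""
    if rd != 0:                       # x0 is hardwired to zero, so skip the write
        regs[rd] = (pc + 4) & MASK32  # step 1: link
    return (pc + imm) & MASK32        # step 2: PC-relative jump

regs = [0] * 32
next_pc = jal_step(pc=8, imm=8, rd=1, regs=regs)  # jal ra, +8 at address 8
print(next_pc, regs[1])  # 16 12
```

The link value depends only on the address of the `jal` itself, while the target depends on both the address and the offset.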
The call and j pseudo-instructions¶
You rarely write raw jal. The assembler gives you friendlier names that expand to jal:
| You write | Assembler emits | Meaning |
|---|---|---|
| `call first_s` | `jal ra, first_s` | Save return addr in `ra`, jump to `first_s` (GNU toolchains actually emit an `auipc`/`jalr` pair, which the linker may relax to this `jal`) |
| `jal first_s` | `jal ra, first_s` | Same as `call` (`rd` defaults to `ra`) |
| `j label` | `jal zero, label` | Plain jump; throw the link away into `x0` |
The j (plain jump) case is the key insight into why a single instruction covers both calls and gotos: if you do not need the return address, you link into x0 (the zero register), which discards the write. So j is just jal that does not care about the link.
J-type encoding¶
jal uses the J-type format. The instruction stores 20 immediate bits, imm[20:1], scrambled across the instruction word. They must be reassembled, given an implicit 0 as bit 0 (so every offset is a multiple of 2), and sign-extended from bit 20:
J-type: | imm[20] | imm[10:1] | imm[11] | imm[19:12] | rd | opcode |
bits: | 31 | 30:21 | 20 | 19:12 | 11:7 | 6:0 |
opcode for jal = 0b1101111
imm = sign_extend( imm[20]<<20 | imm[19:12]<<12 | imm[11]<<11 | imm[10:1]<<1 )
In hardware your ImmDecoder already produces this imm-J value from the instruction word. The processor never reassembles bits by hand at runtime; the decoder does it combinationally.
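To check an encoding by hand, the reassembly formula above can be written as a small Python function. This is a software sketch of what the ImmDecoder does combinationally; the name `decode_jal` is an assumption for illustration, and the two test words are `jal ra, +8` and `jal zero, -4`.

```python
def decode_jal(word):
    """Split a 32-bit jal word into (rd, signed byte offset)."""
    assert word & 0x7F == 0b1101111, "not a jal opcode"
    rd = (word >> 7) & 0x1F
    imm = (
        ((word >> 31) & 0x1) << 20      # imm[20]
        | ((word >> 12) & 0xFF) << 12   # imm[19:12]
        | ((word >> 20) & 0x1) << 11    # imm[11]
        | ((word >> 21) & 0x3FF) << 1   # imm[10:1]; bit 0 is implicitly 0
    )
    if imm & (1 << 20):                 # sign-extend from bit 20
        imm -= 1 << 21
    return rd, imm

print(decode_jal(0x008000EF))  # (1, 8)
print(decode_jal(0xFFDFF06F))  # (0, -4)
```

The second word links into `x0` and jumps backward, which is the shape of a `j` to an earlier label.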
3. JALR: Jump And Link Register (the ret primitive)¶
jalr is the register-based cousin of jal. The handwritten notes describe it as "jump to register value and link." It is an I-type instruction:
jalr rd, rs1, imm # rd is link reg, rs1 holds a base address, imm is a small offset
Step 1 (link): rd <- PC + 4 # same link behavior as jal
Step 2 (jump): PC <- rs1 + imm # jump to a REGISTER value plus immediate
The difference from jal is where the target comes from:
| Instruction | Target computation | Addressing style |
|---|---|---|
| `jal` | `PC = PC + imm` | PC-relative (offset is fixed) |
| `jalr` | `PC = rs1 + imm` | Register-relative (computed) |
Because the target is in a register, jalr can jump to an address that was computed at runtime, which is exactly what a return needs: the return address was stored in ra by an earlier jal.
The ret pseudo-instruction¶
ret is the most common use of jalr. The handwritten notes call out the two special values that make it a return:
- `rd` is `x0` (the notes' "always 0 for ret"). We do not want to overwrite the link this time, so we discard `PC + 4` into the zero register.
- `imm` is 0 (the notes' "imm is 0 for ret"). We jump exactly to the address in `ra`, with no offset.
So ret reduces to PC <- ra + 0 = ra, and that is how a function returns to its caller. The ra it reads was placed there by the call/jal that invoked the function.
flowchart LR
subgraph CALL["jal ra, first_s"]
direction TB
A1["ra <- PC+4"] --> A2["PC <- PC+imm"]
end
subgraph RET["ret = jalr x0, ra, 0"]
direction TB
B1["x0 <- PC+4 (discarded)"] --> B2["PC <- ra + 0"]
end
CALL --> RET
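The return half of the diagram can be modeled the same way as the call. The sketch below is a software model under the same assumptions as the text (`PC <- rs1 + imm`); the name `jalr_step` and the list-based register file are illustrative, not part of the lab.

```python
MASK32 = 0xFFFFFFFF

def jalr_step(pc, rs1, imm, rd, regs):
    """Model one jalr: link PC + 4 into rd, return rs1 + imm as the new PC."""
    # Read rs1 before the link write, so jalr with rd == rs1 still works.
    # (The ISA spec also clears bit 0 of the target; omitted here to match
    # the simplified model used in this lab.)
    target = (regs[rs1] + imm) & MASK32
    if rd != 0:                       # writes to x0 are discarded
        regs[rd] = (pc + 4) & MASK32
    return target

regs = [0] * 32
regs[1] = 12                          # ra, as left behind by the earlier jal
next_pc = jalr_step(pc=20, rs1=1, imm=0, rd=0, regs=regs)  # ret
print(next_pc, regs[0], regs[1])  # 12 0 12
```

With `rd = 0` and `imm = 0`, nothing in the register file changes and the PC becomes exactly `ra`.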
4. JAL vs JALR Side by Side¶
It is worth memorizing this comparison; almost every JAL/JALR bug comes from mixing up which field drives the PC.
| Property | `jal` | `jalr` |
|---|---|---|
| Format | J-type | I-type |
| opcode | `0b1101111` | `0b1100111` |
| Link (write to `rd`) | `rd = PC + 4` | `rd = PC + 4` |
| PC update | `PC = PC + imm` | `PC = rs1 + imm` |
| Immediate source | `imm-J` (20-bit) | `imm-I` (12-bit) |
| Uses a source reg? | No | Yes (`rs1`) |
| Common alias | `call`, `j` | `ret` |
| Target known at... | assemble time | run time |
The shared behavior is the link: both write PC + 4 into rd. The difference is purely in how the PC target is formed: a fixed PC-relative offset for jal, a runtime register value plus offset for jalr.
5. The Lab 10 Programs¶
The lab is incremental. Part 1 (already done) runs the straight-line program; Part 2 adds the function call.
Part 1 — straight line (no jumps):
Part 2 — a real function call and return:
main:
li a0, 1
li a1, 2
jal first_s # ra = PC+4, jump into first_s
unimp # control returns HERE after ret
first_s:
add a0, a0, a1 # a0 = 1 + 2 = 3
ret # jalr x0, ra, 0 -> PC = ra
A few things to notice that bite people when preparing programs for the processor (these come straight from the project guidance):
- Use `jal`, not `call`. Both perform the same call, but `call` may assemble to an `auipc`/`jalr` pair rather than a single `jal`; writing `jal` explicitly guarantees the instruction your decoder recognizes.
- There must be a `main`. Execution starts at the top of instruction memory.
- `unimp` halts. It is your end-of-program marker, and after `ret` we land back on the `unimp` that follows the `jal`.
- `li` is a pseudo-instruction. `li a0, 1` assembles to `addi a0, zero, 1`, which your Part 1 decoder already handles.
Building the .hex and checking the encoding¶
Assemble the program and disassemble it so you can see the exact instruction words your ROM will hold and confirm the jump offsets:
# assemble the .s to an object file
riscv64-unknown-elf-as -o lab10-part2.o lab10-part2.s
# disassemble to see opcodes, fields, and the jal/ret expansion
riscv64-unknown-elf-objdump -d lab10-part2.o
# generate the .hex you load into the Digital ROM (course tool)
python3 makerom3.py lab10-part2.o > lab10-part2.hex
A representative objdump excerpt (addresses and exact encodings depend on your assembler) shows how the pseudo-instructions expand:
0: 00100513 li a0,1 # addi a0,zero,1
4: 00200593 li a1,2 # addi a1,zero,2
8: 008000ef jal ra,10 # jal ra, first_s (offset +8)
c: c0001073 unimp
10: 00b50533 add a0,a0,a1 # first_s
14: 00008067 ret # jalr zero,0(ra)
Notice that ret disassembles as jalr zero,0(ra) — rd = zero, rs1 = ra, imm = 0, exactly the two special values from Section 3.
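If you want to confirm those fields without trusting the disassembler, an I-type word can be split by hand. The helper below is a hedged sketch (the name `split_itype` is an assumption); it uses the standard I-type bit positions and the two encodings from the excerpt above.

```python
def split_itype(word):
    """Break a 32-bit I-type instruction word into its named fields."""
    imm = (word >> 20) & 0xFFF
    if imm & 0x800:                   # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": imm,
    }

ret_fields = split_itype(0x00008067)  # ret = jalr zero, 0(ra)
li_fields = split_itype(0x00100513)   # li a0, 1 = addi a0, zero, 1
print(ret_fields)
print(li_fields)
```

For the `ret` word this gives opcode `0b1100111`, `rd = 0`, `rs1 = 1` (`ra`), and `imm = 0`; for `li a0, 1` it gives `rd = 10` (`a0`), `rs1 = 0`, and `imm = 1`.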
6. New Datapath Components¶
Recall the Part 1 datapath: instruction memory feeds the decoders, the register file feeds the ALU, the ALU result writes back, and the PC always advances to PC + 4. To support jal/jalr we add three new MUXes and widen one existing MUX. The Part 2 guide lists exactly these additions.
| New/changed MUX | Chooses between | Controlled by | Why we need it |
|---|---|---|---|
| `PCsel` | `PC + 4` vs. branch/jump target | `PCsel` | A jump must override the default `PC + 4` |
| `WDsel` | ALU result vs. `PC + 4` | `WDsel` | The link writes `PC + 4` (not the ALU) into `rd` |
| `ALUSrcA` | register `RD0` vs. PC | `ALUSrcA` | `jal` target is `PC + imm`, so the ALU's A input is PC |
| `ALUSrcB` (wider) | `RD1` vs. `imm-I` vs. `imm-J` | `ALUSrcB` | Different jumps use different immediates |
The intuition for each:
- `PCsel`: Until now the next PC was hardwired to `PC + 4`. A jump produces a target address that must win instead. `PCsel` selects between the sequential `PC + 4` and the computed target. For `jal`/`jalr` it always selects the target; for ordinary instructions it selects `PC + 4`.
- `WDsel` (write-data select): For `add`/`addi` the register file's write data comes from the ALU. For the link step of `jal`/`jalr`, the write data is `PC + 4`. `WDsel` picks which one reaches the register file's `WD` input.
- `ALUSrcA`: The ALU normally takes `RD0` (the value of `rs1`) on its A input. To compute `jal`'s target `PC + imm`, we instead feed `PC` into A. `ALUSrcA` chooses register-vs-PC for the ALU's first operand.
- Widened `ALUSrcB`: In Part 1 this was a single bit (register vs. immediate). Now it needs to choose among `RD1`, the I-type immediate, and the J-type immediate, so it grows to a 2-bit selector.
How each instruction computes its target in the ALU¶
The clean realization is that the same ALU computes every jump target; only the operands change:
jal: ALU.A = PC, ALU.B = imm-J, ALUOp = add -> target = PC + imm-J
jalr: ALU.A = RD0, ALU.B = imm-I, ALUOp = add -> target = rs1 + imm
add: ALU.A = RD0, ALU.B = RD1, ALUOp = add -> rd = rs1 + rs2
addi: ALU.A = RD0, ALU.B = imm-I, ALUOp = add -> rd = rs1 + imm
So the ALU does one add in all four cases. The MUXes in front of it (ALUSrcA, ALUSrcB) and behind it (PCsel, WDsel) are what specialize the behavior.
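The four cases above can be modeled as two operand MUXes feeding one adder. The sketch below is illustrative only: the function name `alu_result` is an assumption, and the select encodings (`ALUSrcA` 0 = RD0, 1 = PC; `ALUSrcB` 00 = RD1, 01 = imm-I, 10 = imm-J) follow the decoder table in Section 7.

```python
def alu_result(alu_src_a, alu_src_b, pc, rd0, rd1, imm_i, imm_j):
    """Two operand MUXes in front of a single 32-bit adder."""
    a = pc if alu_src_a else rd0            # ALUSrcA MUX
    b = (rd1, imm_i, imm_j)[alu_src_b]      # ALUSrcB MUX (2-bit select)
    return (a + b) & 0xFFFFFFFF             # ALUOp = add in every case

# Operand values taken from the Part 2 program.
jal_target = alu_result(1, 0b10, pc=8, rd0=0, rd1=0, imm_i=0, imm_j=8)
jalr_target = alu_result(0, 0b01, pc=20, rd0=12, rd1=0, imm_i=0, imm_j=0)
add_value = alu_result(0, 0b00, pc=16, rd0=1, rd1=2, imm_i=0, imm_j=0)
addi_value = alu_result(0, 0b01, pc=0, rd0=0, rd1=0, imm_i=1, imm_j=0)
print(jal_target, jalr_target, add_value, addi_value)  # 16 12 3 1
```

Only the select values differ between the calls; the addition itself is identical.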
Datapath sketch¶
+--------+
PC ---->| |
| ALUSrcA|---A--->+-----+
RD0 (rs1)--->| MUX | | |
+--------+ | ALU |---R--+--> WDsel MUX --> RegFile WD
| | | ^
RD1 (rs2)--->+--------+ ----B>| | | |
imm-I ------>| ALUSrcB| +-----+ PC+4 -------+ (link value)
imm-J ------>| MUX |
+--------+
PC+4 ---->+--------+
| PCsel |---> next PC (into PC register)
ALU target ->| MUX |
+--------+
The PC + 4 value is itself produced by a small dedicated adder (the same one that drove the PC in Part 1). It now fans out to two places: the default PCsel input and the WDsel link input. Nothing else about that adder changes.
7. Extending the Instruction Decoder¶
The InstDecoder is the spreadsheet-driven control unit from Part 2. Each supported instruction gets a row; the row's Inputs (opcode, funct3, funct7) identify the instruction, and the row's Control outputs drive the datapath. To add jal and jalr we (1) add two rows and (2) add columns for the new control lines, then (3) backfill the new columns for the existing rows.
The cardinal rule from the guide: when you add a new control output, set it to its inert value (usually 0) for every existing instruction. MUXes are wired so input 0 is the original Part 1 behavior, so writing 0 in the new columns leaves addi/add working exactly as before.
A decoder table for the four instructions in the Part 2 program looks like this:
| INUM | Instr | Fmt | opcode | funct3 | funct7 | RFW | ALUOp | ALUSrcB | ALUSrcA | WDsel | PCsel |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | addi | I | 0010011 | 000 | xxxxxxx | 1 | 000 | 01 | 0 | 0 | 0 |
| 1 | add | R | 0110011 | 000 | 0000000 | 1 | 000 | 00 | 0 | 0 | 0 |
| 2 | jal | J | 1101111 | xxx | xxxxxxx | 1 | 000 | 10 | 1 | 1 | 1 |
| 3 | jalr | I | 1100111 | 000 | xxxxxxx | 1 | 000 | 01 | 0 | 1 | 1 |
Reading the new rows:
- `jal` (INUM 2): `RFW=1` so we write the link into `rd`. `ALUSrcA=1` selects PC into the ALU's A input. `ALUSrcB=10` selects `imm-J`. `WDsel=1` so the write data is `PC + 4` (the link), not the ALU result. `PCsel=1` so the PC takes the jump target.
- `jalr` (INUM 3): `ALUSrcA=0` keeps `RD0` (= `rs1` = `ra` for a `ret`) on the ALU's A input. `ALUSrcB=01` selects `imm-I` (which is 0 for `ret`). `WDsel=1` writes the link; `PCsel=1` takes the jump. For `ret` the destination is `x0`, so the write is harmlessly discarded even though `RFW=1`.
- `addi`, `add` (INUM 0, 1): the three new columns (`ALUSrcA`, `WDsel`, `PCsel`) are all `0`, preserving Part 1 behavior. Note `ALUSrcB` widened to two bits, so the old "1" becomes "01" and the old "0" becomes "00".
From spreadsheet to circuit¶
The decoder circuit pattern is unchanged from Part 2; you just have more of everything:
flowchart LR
IW["Instruction Word"] --> SP["splitters:<br/>opcode, funct3, funct7"]
SP --> CMP["comparators:<br/>one per instruction"]
CMP --> PE["priority encoder<br/>-> INUM"]
PE --> ROM["control ROM<br/>indexed by INUM"]
ROM --> SPL["splitter:<br/>break ROM word into<br/>control lines"]
SPL --> CTL["RFW, ALUOp, ALUSrcB,<br/>ALUSrcA, WDsel, PCsel"]
Two concrete changes versus Part 1:
- More comparators / wider priority encoder. You add comparators that match `opcode == 1101111` (jal) and `opcode == 1100111` (jalr), and feed them into the priority encoder so they map to INUM 2 and 3.
- Wider ROM word. Each ROM entry must now hold all the control bits. With `RFW`(1) + `ALUOp`(3) + `ALUSrcB`(2) + `ALUSrcA`(1) + `WDsel`(1) + `PCsel`(1) you need a 9-bit ROM word. The output splitter must extract each field at the correct bit positions, and the bit order must match the spreadsheet's "Output bits" column exactly.
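Packing the decoder rows into ROM words is mechanical, so it is easy to check in software. The sketch below assumes a bit order with `RFW` in bit 0 and `PCsel` in bit 8; that order is an assumption for illustration only, and your spreadsheet's "Output bits" column is the authority. The names `FIELDS`, `ROWS`, `pack`, and `unpack` are hypothetical.

```python
# (name, width) from least-significant bit upward. ASSUMED order: check it
# against your own spreadsheet before trusting the resulting words.
FIELDS = [("RFW", 1), ("ALUOp", 3), ("ALUSrcB", 2),
          ("ALUSrcA", 1), ("WDsel", 1), ("PCsel", 1)]

# Control values copied from the decoder table, indexed by INUM.
ROWS = [
    {"RFW": 1, "ALUOp": 0, "ALUSrcB": 0b01, "ALUSrcA": 0, "WDsel": 0, "PCsel": 0},  # addi
    {"RFW": 1, "ALUOp": 0, "ALUSrcB": 0b00, "ALUSrcA": 0, "WDsel": 0, "PCsel": 0},  # add
    {"RFW": 1, "ALUOp": 0, "ALUSrcB": 0b10, "ALUSrcA": 1, "WDsel": 1, "PCsel": 1},  # jal
    {"RFW": 1, "ALUOp": 0, "ALUSrcB": 0b01, "ALUSrcA": 0, "WDsel": 1, "PCsel": 1},  # jalr
]

def pack(row):
    """Concatenate the control fields into one ROM word."""
    word, shift = 0, 0
    for name, width in FIELDS:
        assert 0 <= row[name] < (1 << width), f"{name} does not fit"
        word |= row[name] << shift
        shift += width
    return word

def unpack(word):
    """Inverse of pack: what the output splitter does in the circuit."""
    row, shift = {}, 0
    for name, width in FIELDS:
        row[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return row

rom = [pack(row) for row in ROWS]
print([hex(w) for w in rom])  # ['0x11', '0x1', '0x1e1', '0x191']
```

Every word fits in 9 bits, and `unpack(pack(row))` returning the original row is a quick check that the splitter positions and the ROM layout agree.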
8. Cycle-by-Cycle Trace of the Part 2 Program¶
Let us execute the Part 2 program one clock cycle at a time. Assume instruction memory is laid out at byte addresses 0, 4, 8, ... and unimp sits at address 12 (0xc), first_s at 16 (0x10).
addr instruction
0: addi a0, zero, 1 (li a0,1)
4: addi a1, zero, 2 (li a1,2)
8: jal ra, 16 (jal first_s ; offset = 16-8 = +8)
12: unimp
16: add a0, a0, a1 (first_s)
20: jalr zero, ra, 0 (ret)
| Cycle | PC | Instruction | What happens | a0 | a1 | ra | next PC |
|---|---|---|---|---|---|---|---|
| 1 | 0 | `addi a0,zero,1` | ALU: 0+1 → a0; WDsel=ALU; PCsel=PC+4 | 1 | – | – | 4 |
| 2 | 4 | `addi a1,zero,2` | ALU: 0+2 → a1 | 1 | 2 | – | 8 |
| 3 | 8 | `jal ra,16` | link: ra ← PC+4 = 12; ALU: PC+imm = 8+8 = 16; PCsel=target | 1 | 2 | 12 | 16 |
| 4 | 16 | `add a0,a0,a1` | ALU: 1+2 → a0 | 3 | 2 | 12 | 20 |
| 5 | 20 | `jalr zero,ra,0` | link: x0 ← PC+4 (discarded); ALU: ra+0 = 12+0 = 12; PCsel=target | 3 | 2 | 12 | 12 |
| 6 | 12 | `unimp` | halt | 3 | 2 | 12 | – |
Trace the two control-flow cycles carefully:
- Cycle 3 (`jal`): `ALUSrcA=1` puts PC (8) on the ALU A input; `ALUSrcB=10` puts `imm-J` (8) on B; the ALU adds to get 16. Simultaneously `WDsel=1` routes `PC + 4` (12) to the register file write port, and `RFW=1` writes it into `ra`. `PCsel=1` makes the next PC the ALU target, 16.
- Cycle 5 (`ret` = `jalr`): `ALUSrcA=0` keeps `ra` (12) on the ALU A input; `ALUSrcB=01` puts `imm-I` (0) on B; the ALU adds to get 12. `WDsel=1` would write `PC + 4` into the destination, but the destination is `x0`, so nothing observable changes. `PCsel=1` makes the next PC the ALU target, 12, which is the `unimp` right after the `jal`. The function has returned.
The program ends with a0 = 3, the sum computed inside first_s, and the PC correctly back at the unimp after the call.
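The whole trace can be reproduced with a small table-driven model, which is handy for predicting values before you single-step the circuit. This is a sketch under stated assumptions: instructions are stored pre-decoded as tuples (so the immediate is already the right one for the format), the control values come from the Section 7 table, and every name here (`CONTROL`, `PROGRAM`, `run`) is hypothetical.

```python
# Control rows: (RFW, ALUSrcB, ALUSrcA, WDsel, PCsel), from the decoder table.
CONTROL = {
    "addi": (1, 0b01, 0, 0, 0),
    "add":  (1, 0b00, 0, 0, 0),
    "jal":  (1, 0b10, 1, 1, 1),
    "jalr": (1, 0b01, 0, 1, 1),
}

ZERO, RA, A0, A1 = 0, 1, 10, 11  # ABI register numbers

# Pre-decoded program: address -> (mnemonic, rd, rs1, rs2, imm).
PROGRAM = {
    0:  ("addi", A0, ZERO, 0, 1),   # li a0, 1
    4:  ("addi", A1, ZERO, 0, 2),   # li a1, 2
    8:  ("jal", RA, 0, 0, 8),       # jal first_s (offset +8)
    12: ("unimp", 0, 0, 0, 0),
    16: ("add", A0, A0, A1, 0),     # first_s
    20: ("jalr", ZERO, RA, 0, 0),   # ret
}

def run(program, control=CONTROL, max_cycles=50):
    """Execute until unimp; return (registers, list of (pc, op, next_pc))."""
    regs, pc, trace = [0] * 32, 0, []
    for _ in range(max_cycles):
        op, rd, rs1, rs2, imm = program[pc]
        if op == "unimp":                      # halt marker
            trace.append((pc, op, None))
            break
        rfw, src_b, src_a, wdsel, pcsel = control[op]
        a = pc if src_a else regs[rs1]         # ALUSrcA MUX
        b = regs[rs2] if src_b == 0 else imm   # ALUSrcB MUX (imm pre-decoded)
        alu = (a + b) & 0xFFFFFFFF
        if rfw and rd != 0:                    # x0 ignores writes
            regs[rd] = pc + 4 if wdsel else alu  # WDsel MUX
        next_pc = alu if pcsel else pc + 4     # PCsel MUX
        trace.append((pc, op, next_pc))
        pc = next_pc
    return regs, trace

regs, trace = run(PROGRAM)
print([pc for pc, _, _ in trace])     # [0, 4, 8, 16, 20, 12]
print(regs[A0], regs[A1], regs[RA])   # 3 2 12
```

The PC sequence and final register values match the table above, one list entry per cycle.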
9. Common Mistakes¶
These are the failure modes that show up most often in the lab and in the autograder.
[1] PC update uses the wrong source.
Symptom: jal jumps to the right place but ret runs off into the weeds,
or vice-versa.
Cause: For jal the ALU A input must be PC; for jalr it must be RD0 (rs1).
A stuck ALUSrcA makes both jumps use the same A source.
Fix: ALUSrcA = 1 only for jal; = 0 for jalr (and everything else).
[2] Link writes the ALU result instead of PC+4.
Symptom: ret returns to the jump *target* address (a loop) instead of
the instruction after the call.
Cause: WDsel left at 0, so rd gets the ALU output (the target) not PC+4.
Fix: WDsel = 1 for both jal and jalr.
[3] PCsel never selects the target.
Symptom: jal "executes" but the PC still just advances by 4; the function
body is never entered.
Cause: PCsel = 0 for the jump rows, or the PCsel MUX inputs are swapped.
Fix: PCsel = 1 for jal and jalr; verify input 0 = PC+4, input 1 = target.
[4] Wrong immediate fed to the ALU.
Symptom: jal lands at a wrong (but plausible) address.
Cause: ALUSrcB selects imm-I for jal (should be imm-J) or vice-versa.
Fix: jal -> imm-J (ALUSrcB=10); jalr -> imm-I (ALUSrcB=01).
[5] Forgot to backfill new control columns for old instructions.
Symptom: addi/add suddenly break after you add the new MUXes.
Cause: ALUSrcA/WDsel/PCsel are "x" or 1 for the old rows, so an old
instruction accidentally jumps or links.
Fix: Set every new control output to 0 for addi and add.
[6] ROM word too narrow or split at the wrong bit positions.
Symptom: several control lines are wrong at once for every instruction.
Cause: ROM data width does not match the number of control bits, or the
output splitter's bit order disagrees with the spreadsheet.
Fix: ROM data bits = total control bits; match the splitter order to
the "Output bits" column exactly (low bit = leftmost column).
A good rule for #1, #2, and #4: when something jumps "almost right," the immediate or A-source is wrong; when a return goes wrong, suspect WDsel (you linked the wrong value).
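Mistake [2] is worth seeing in numbers. The sketch below replays just the `jal` and `ret` of the Part 2 program with `WDsel` either correct or stuck at 0; the function name `jal_then_ret` is an assumption for illustration.

```python
def jal_then_ret(wdsel_for_jal, jal_pc=8, imm_j=8):
    """Return (ra after jal, PC after the later ret) for a given WDsel value."""
    target = jal_pc + imm_j                  # ALU result during the jal cycle
    link = jal_pc + 4                        # output of the PC + 4 adder
    ra = link if wdsel_for_jal else target   # WDsel MUX feeding the register file
    pc_after_ret = ra + 0                    # ret = jalr x0, ra, 0
    return ra, pc_after_ret

good = jal_then_ret(wdsel_for_jal=1)
bad = jal_then_ret(wdsel_for_jal=0)
print(good)  # (12, 12): returns to the unimp after the call
print(bad)   # (16, 16): returns to first_s itself, so the program loops
```

In the broken case `ra` holds the function's own entry address, so the failure only becomes visible at the `ret`, two cycles after the bug actually occurred.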
10. Incremental Development and Debugging¶
The lab explicitly asks for an incremental approach with two committed top-level circuits, lab10-part1.dig and lab10-part2.dig. Here is the recommended workflow, which mirrors the 5-step pattern for adding any instruction.
flowchart TD
A["1. Pick instruction(s):<br/>jal then jalr"] --> B["2. Add components:<br/>PCsel, WDsel, ALUSrcA MUXes"]
B --> C["3. Extend datapath:<br/>wire PC, PC+4, imm-J"]
C --> D["4. Extend decoder:<br/>new rows + columns"]
D --> E["5. Test in Digital:<br/>single-step, watch dashboard"]
E -->|"bug"| C
E -->|"passes"| F["commit lab10-part2.dig"]
Practical debugging tips from the guides:
- Paste the `.dig` test into your processor so you can run the autograder's test directly in Digital and watch where it diverges.
- Single-step with `objdump` open. Keep the disassembly beside the simulation so you can match each executed instruction to its expected effect.
- Add probes/tunnels to the dashboard for `PC`, `PC+4`, `iw`, `RS1`, `imm-I`, `imm-J`, the ALU result, and `PCsel`. When a jump misbehaves, these instantly show whether the immediate, the A-source, or the select line is at fault.
- Use `EN` to control start. Press play, pick `PROG`, then toggle `EN` to 1 so the program starts cleanly with the PC at 0.
- Watch `ra`. After the `jal` cycle, `ra` must equal the address of the `unimp` after the call. If it does not, you have a `WDsel`/link bug before you ever reach the `ret`.
Key Concepts¶
| Concept | Definition | Example |
|---|---|---|
| Link register | Register that holds the return address saved by a call | `ra` (`x1`) after `jal` |
| Return address | Address of the instruction after a call (`PC + 4`) | `ra = 12` after `jal` at PC 8 |
| JAL | J-type jump-and-link; `rd = PC+4`, `PC = PC + imm` | `jal ra, first_s` |
| JALR | I-type jump-and-link-register; `rd = PC+4`, `PC = rs1 + imm` | `jalr x0, ra, 0` (`ret`) |
| PC-relative | Target computed by adding offset to current PC | `jal`'s `PC + imm` |
| Register-relative | Target computed from a register value | `jalr`'s `rs1 + imm` |
| PCsel | MUX choosing `PC+4` vs. jump target as next PC | 1 for `jal`/`jalr` |
| WDsel | MUX choosing ALU result vs. `PC+4` as write data | 1 writes the link |
| ALUSrcA | MUX choosing `RD0` vs. PC for ALU A input | 1 for `jal` (`PC + imm`) |
| ret | `jalr x0, ra, 0`: discard link, jump to `ra` | function return |
Practice Problems¶
Problem 1: What does ret actually do?¶
ret is a pseudo-instruction. Write the full jalr it expands to, and explain why each of its operands has the value it does.
Solution:
`ret` expands to `jalr x0, ra, 0`.
- `rd = x0`: A return must not change the link register. Writing `PC + 4` into `x0` discards it, because `x0` always reads as zero. (The handwritten note: "always 0 for ret".)
- `rs1 = ra`: The return address was saved in `ra` by the `jal`/`call` that invoked the function. The PC update is `PC = rs1 + imm`, and we want to jump back to that saved address.
- `imm = 0`: We want to land *exactly* on the return address with no offset. (The handwritten note: "imm is 0 for ret".)
Net effect: `PC = ra + 0 = ra`. Control returns to the instruction after the call.
Problem 2: Compute a JAL target by hand¶
A jal ra, label instruction sits at byte address 0x40. Its J-type immediate decodes to imm = 0x20 (32 decimal). What value is written to ra, and what is the next PC?
Solution:
`jal` does two things:
- **Link:** `ra = PC + 4 = 0x40 + 4 = 0x44` (68 decimal).
- **Jump:** `PC = PC + imm = 0x40 + 0x20 = 0x60` (96 decimal).
So `ra = 0x44` and the next instruction executed is at `0x60`. Note the target is computed from the *current* PC (`0x40`), not from `PC + 4`.
Problem 3: Trace the registers¶
Hand-execute this program and give the final values of a0, a1, and ra. Assume main starts at address 0 and instructions are 4 bytes apart.
main:
li a0, 4
li a1, 5
jal first_s
unimp
first_s:
add a0, a0, a1
add a0, a0, a1
ret
Solution:
Addresses: `li a0` @0, `li a1` @4, `jal` @8, `unimp` @12, `first_s add` @16, second `add` @20, `ret` @24.
| Step | Instruction | Effect | a0 | a1 | ra |
|------|-------------|--------|----|----|----|
| 1 | `li a0,4` | a0 = 4 | 4 | – | – |
| 2 | `li a1,5` | a1 = 5 | 4 | 5 | – |
| 3 | `jal first_s` @8 | ra = 8+4 = 12; PC → 16 | 4 | 5 | 12 |
| 4 | `add a0,a0,a1` | a0 = 4+5 = 9 | 9 | 5 | 12 |
| 5 | `add a0,a0,a1` | a0 = 9+5 = 14 | 14 | 5 | 12 |
| 6 | `ret` | PC → ra = 12 | 14 | 5 | 12 |
| 7 | `unimp` @12 | halt | 14 | 5 | 12 |
Final: `a0 = 14`, `a1 = 5`, `ra = 12`.
Problem 4: Fill in the decoder row¶
You are adding jalr to your decoder. Given the control columns RFW | ALUOp | ALUSrcB | ALUSrcA | WDsel | PCsel, fill in the values for jalr and explain ALUSrcA and WDsel.
Solution:
- `RFW=1`, `WDsel=1`: write the link (`PC + 4`) into `rd`. (For `ret`, `rd = x0` so the write is discarded, but the control line is still 1.)
- `ALUOp=000`: the ALU adds to form `rs1 + imm`.
- `ALUSrcB=01`: select the I-type immediate (it is 0 for `ret`).
- `ALUSrcA=0`: keep `RD0` (the value of `rs1`, e.g. `ra`) on the ALU A input, **not** PC. This is the key difference from `jal`, which uses `ALUSrcA=1`.
- `PCsel=1`: take the computed target as the next PC.
Problem 5: Diagnose the bug¶
A student's processor runs the Part 2 program but loops forever: after first_s runs, ret jumps back to first_s instead of to the unimp. Which control line is most likely wrong, and why?
Solution:
The likely culprit is **`WDsel`** for the `jal` row (a link bug), not anything in the `ret` row.
If `jal`'s `WDsel = 0`, then during the `jal` cycle the register file's write data is the **ALU result** (the jump target, `first_s`'s address) instead of `PC + 4`. So `ra` ends up holding the address of `first_s`, not the address of the `unimp` after the call.
Later, `ret` faithfully does `PC = ra + 0`. But `ra` is wrong (it points at `first_s`), so the processor re-enters the function and loops.
Diagnosis tip: after the `jal` cycle, look at `ra` on the dashboard. It should equal the address of the instruction after the `jal` (the `unimp`). If it equals the function entry address, fix `WDsel` for `jal`.
Problem 6: Why can one instruction implement both call and j?¶
Explain how jal serves as both the function-call primitive (call) and the unconditional-jump primitive (j), even though one needs a return address and the other does not.
Solution:
`jal rd, label` always does two things: it writes `PC + 4` into `rd` and sets `PC = PC + imm`. The flexibility is entirely in the choice of `rd`:
- **`call label` = `jal ra, label`**: link into `ra`, so the return address is preserved and a later `ret` can come back.
- **`j label` = `jal zero, label`**: link into `x0`. Since `x0` always reads as zero, writing `PC + 4` to it discards the return address. The result is a pure jump with no usable link.
So the same datapath (same `WDsel`, same `PCsel`, same ALU `PC + imm`) implements both. The only difference is the destination register field in the instruction word, which the register-write logic already handles (a write to `x0` is a no-op). No extra hardware is needed to distinguish a call from a goto.
Further Reading¶
- Lab 10 assignment spec: /assignments/lab10/
- Processor design guide (decoders, datapath, JAL/JALR): /guides/processor-part-2/
- Processor guide Part 1 (PC, instruction memory, ALU, register file): /guides/processor-part-1/
- Processor guide Part 3 (branches, data memory, dashboard): /guides/processor-part-3/
- RISC-V assembly reference: /guides/riscv/
- Course key concepts (instruction formats, pseudo-instructions): /guides/key-concepts/
- RISC-V Unprivileged ISA Specification
- Original handwritten notes: /notes/CS315-01 2025-11-05 Lab Processor JAL JALR.pdf
Summary¶
- Functions need control flow plus a return address. A jump that forgets where it came from is a one-way trip; the link (saving `PC + 4`) is what makes a return possible.
- `jal` (jump and link) is the call primitive. It is J-type: `rd = PC + 4` and `PC = PC + imm` (PC-relative). `call` and `j` are pseudo-instructions built on it.
- `jalr` (jump and link register) is the return primitive. It is I-type: `rd = PC + 4` and `PC = rs1 + imm` (register-relative). `ret` is `jalr x0, ra, 0`: discard the link, jump to `ra`.
- Both instructions link; they differ only in how the PC target is formed. `jal` adds the immediate to PC; `jalr` adds the immediate to a register. The same ALU `add` computes both targets.
- The datapath grows by three MUXes. `PCsel` chooses `PC+4` vs. target, `WDsel` chooses ALU result vs. `PC+4` for the link, and `ALUSrcA` chooses `RD0` vs. `PC` for the ALU A input; `ALUSrcB` widens to also select `imm-J`.
- The decoder spreadsheet adds two rows and three columns. Set the new control outputs to 0 for the existing `addi`/`add` rows so Part 1 keeps working, and widen the control ROM to hold every control bit.
- Most bugs are control-line mix-ups. A bad `ALUSrcA` or `ALUSrcB` makes a jump land in the wrong place; a bad `WDsel` makes `ret` return to the wrong address; a bad `PCsel` makes the jump not happen at all.
- Develop incrementally and single-step with the dashboard. Commit `lab10-part1.dig` and `lab10-part2.dig` separately, keep `objdump` open beside the simulation, and watch `ra`, the PC, and the immediates to localize any failure fast.