RISC-V Assembly Part 2: Arguments, Arrays, Conditionals, and Loops

Overview

This lecture builds on our first pass at RISC-V assembly by organizing the instruction set into three categories — data processing, control, and memory — and then using them to express the programming constructs we already know from C. We learn how arrays live in memory and how to reach individual elements with load instructions, how to pass arrays and scalars as function arguments, and how to translate if/else statements and for/while loops into the conditional branches and unconditional jumps that the processor actually executes. These patterns are the core of Lab 3 and Project 2.

Learning Objectives

  • Classify RISC-V instructions into data processing, control, and memory categories
  • Explain how an array is laid out in memory and how a base address plus an offset reaches an element
  • Use the load word (lw) instruction with the offset(base) addressing syntax to read array elements
  • Pass scalar values and array base addresses as function arguments in the a registers and return a result in a0
  • Translate a C if/then/else statement into assembly using a conditional branch, a label, and an unconditional jump
  • Translate C for and while loops into assembly using a loop label, a guard branch, and a back-edge jump
  • Choose the correct branch instruction (beq, bne, blt, bge, ble, bgt) for a given comparison
  • Trace assembly execution by hand and in a debugger to verify a computed result

Prerequisites

  • RISC-V Assembly Part 1: registers (x0–x31 and ABI names), the fetch–decode–execute cycle, the program counter, and basic data-processing instructions
  • C fundamentals: arrays, pointers, if/else, for/while, and function arguments and return values (Project 1)
  • Comfort reading three-operand instructions of the form op dst, src1, src2
  • Familiarity with the dev environment (RISC-V VM, gcc, as, gdb)

1. The Three Categories of Instructions

Every RISC-V instruction you will use in this course falls into one of three families. Keeping these categories in mind makes a long instruction list feel small, because each new mnemonic is just a variation on a category you already understand.

| Category | Purpose | Examples |
|----------|---------|----------|
| Data processing | Compute a new value from register values | add, addi, sub, mul, div, slli, and, or |
| Control | Change which instruction runs next | j, beq, bne, blt, bge, ble, ret |
| Memory | Move data between registers and memory | lw, sw, ld, sd, lb, sb |

The processor can only compute on values that live in registers. Data, however, lives in memory. So a typical computation has a recurring rhythm:

flowchart LR
    A[Memory] -->|load| B[Registers]
    B -->|data processing| B
    B -->|store| A

    style B fill:#f9f,stroke:#333,stroke-width:2px
  1. Load values from memory into registers (memory instruction).
  2. Compute a new value from those registers (data processing instruction).
  3. Store the result back to memory if needed (memory instruction).
  4. Control instructions decide whether to repeat, branch, or move on.

Today we focus on the memory category (specifically array access) and the control category (conditionals and loops), tying everything together with the arguments and return value conventions that let one piece of code call another.


2. Arrays in Memory

What an Array Is

In C, an array is a contiguous block of memory holding elements of the same type. Consider:

int arr[3] = {1, 2, 3};

This reserves room for three int values. On our 64-bit RISC-V machine an int is 4 bytes (a word), so arr occupies 12 bytes laid out one element after another. Memory is byte-addressable: every byte has its own address, and a 4-byte word spans four consecutive addresses.

The name arr evaluates to the address of the first element — its base address. That is the key idea that connects C and assembly: an array argument is really just a pointer (an address) to the start of the data.

Processor                                   Memory
+--------------------+                      +-----------+
|  Registers         |                      |    ...    |
|  +--------------+  |                      +-----------+
|  | a0  (base)   |--+--+       arr + 8 ->  |     3     |  arr[2]
|  | a1           |  |  |                   +-----------+
|  | a2           |  |  |       arr + 4 ->  |     2     |  arr[1]
|  +--------------+  |  |                   +-----------+
+--------------------+  +-----> arr + 0 ->  |     1     |  arr[0]
                                            +-----------+

The instructor's diagram showed exactly this: the base address arr is held in a register (a0), and following that address into memory reaches the first element, 1. Adding 4 reaches 2, adding 8 reaches 3.

Element Addresses

To find the address of element i in an int array, multiply the index by the element size and add it to the base:

address of arr[i] = base + (i * sizeof(int))
                  = base + (i * 4)
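
| Element | Index i | Byte offset (i * 4) | Address |
|---------|---------|---------------------|---------|
| arr[0] | 0 | 0 | base + 0 |
| arr[1] | 1 | 4 | base + 4 |
| arr[2] | 2 | 8 | base + 8 |
| arr[3] | 3 | 12 | base + 12 |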

Because int is 4 bytes, consecutive elements are 4 bytes apart. For an array of long or pointers (8 bytes each) the stride would be 8.
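
The address formula above can be checked from the C side. The following is a minimal sketch, not code from the lecture: the helper names elem_offset and elem_at are hypothetical, and the byte-pointer cast is only there to make the arithmetic explicit.

```c
#include <stddef.h>

/* Byte offset of element i in an int array: i * sizeof(int). */
size_t elem_offset(size_t i) {
    return i * sizeof(int);
}

/* Read arr[i] the way the hardware sees it: start at the base address,
 * step forward by the byte offset, then read an int from that address. */
int elem_at(const int *base, size_t i) {
    const unsigned char *addr = (const unsigned char *)base + elem_offset(i);
    return *(const int *)addr;
}
```

For int arr[3] = {1, 2, 3}, calling elem_at(arr, 2) steps 8 bytes past the base (assuming a 4-byte int) and reads the same value as arr[2].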


3. Memory Instructions: Load and Store

Memory instructions are the only way to move data between registers and memory. They come in matched pairs: load (memory → register) and store (register → memory).

Load Word

The instruction taught in class is lw, load word, which reads a 32-bit value from memory into a register:

lw   t0, (a0)        # t0 = *a0  -- load the word at address a0 into t0
     ^    ^
     |    +-- addr: register holding the memory address
     +------- dest: register that receives the loaded value

Reading the operands aloud: "load the word (32-bit value) found at the address in a0 into the destination register t0." In C this is exactly a pointer dereference, t0 = *a0;.

The parentheses mean "use the value in this register as a memory address." This is register-indirect addressing — a0 is not the data, it is a pointer to the data.

The offset(base) Addressing Syntax

You can add a constant offset to the base address directly in the instruction:

lw   t0, 0(a0)       # t0 = word at address a0 + 0   -> arr[0]
lw   t1, 4(a0)       # t1 = word at address a0 + 4   -> arr[1]
lw   t2, 8(a0)       # t2 = word at address a0 + 8   -> arr[2]

The effective address is computed as base + offset. Writing (a0) is shorthand for 0(a0). This makes reading the first few fixed elements of an array very clean.

Load vs Store Operand Order

Stores write a register value back to memory. The store-word instruction is sw. Watch the operand order — it is a frequent source of bugs:

# Load: the DESTINATION register comes first
lw   t0, (a0)        # t0 = memory[a0]   (t0 receives)

# Store: the SOURCE register comes first
sw   t0, (a0)        # memory[a0] = t0   (t0 provides)

Loads follow the usual "destination first" convention (like add rd, rs1, rs2). Stores put the value being written first and the address second.
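
One way to keep the operand order straight is to model the two instructions as C functions whose parameter order matches the assembly. This is only an illustrative sketch: load_word and store_word are hypothetical names, and memcpy stands in for the 4-byte memory access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of: lw rd, offset(base)
 * The return value plays the role of rd, the destination register. */
int32_t load_word(const void *base, ptrdiff_t offset) {
    int32_t value;
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}

/* Model of: sw rs2, offset(base)
 * The value being written comes first, the address second. */
void store_word(int32_t value, void *base, ptrdiff_t offset) {
    memcpy((unsigned char *)base + offset, &value, sizeof value);
}
```

With an array holding {1, 2, 3}, load_word(arr, 4) returns 2, and store_word(7, arr, 8) overwrites the third element with 7, just as lw t1, 4(a0) and sw t0, 8(a0) would.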

Sizes

Because the machine is 64-bit, there are wider memory instructions too. lw/sw move 32-bit words (a C int); ld/sd move 64-bit doublewords (a C long or a pointer). For Lab 3 the arrays are int arrays, so lw is the instruction you need.

| Instruction | Bytes | C type | Operation |
|-------------|-------|--------|-----------|
| lw rd, off(rs) | 4 | int | load 32-bit word |
| sw rs2, off(rs1) | 4 | int | store 32-bit word |
| ld rd, off(rs) | 8 | long, pointer | load 64-bit doubleword |
| sd rs2, off(rs1) | 8 | long, pointer | store 64-bit doubleword |


4. Passing Arrays and Scalars as Arguments

The Argument and Return Convention

RISC-V uses registers to pass arguments and return results:

  • Arguments go into a0, a1, a2, ..., a7, in order.
  • The return value comes back in a0.

For Lab 3 we restrict ourselves to the a registers (arguments and return) and the t registers (temporaries) — no stack management is required because these functions do not call other functions. (Saving registers across calls is the topic of the next session.)

A scalar argument is just placed in a register. An array argument is passed as its base address — a single pointer — not by copying the whole array.

// C: arr decays to a pointer to its first element
int sum3(int *arr) {
    return arr[0] + arr[1] + arr[2];
}

.global sum3
# int sum3(int *arr)
# a0 = arr (base address of the array)

sum3:
    lw   t0, 0(a0)       # t0 = arr[0]
    lw   t1, 4(a0)       # t1 = arr[1]
    lw   t2, 8(a0)       # t2 = arr[2]
    add  t0, t0, t1      # t0 = arr[0] + arr[1]
    add  a0, t0, t2      # a0 = t0 + arr[2]  (return value in a0)
    ret

Two things to notice:

  1. We loaded each element with a fixed offset because the indices (0, 1, 2) are known at write time.
  2. The final result is placed in a0 before ret, because a0 is the return register.

Two Ways to Add Three Numbers

In class we contrasted two versions of the same "add three numbers" task to highlight the difference between scalar and memory arguments.

Version A — three scalar arguments. Each value arrives in its own register, so no memory access is needed:

.global add3
# int add3(int a, int b, int c)
# a0 = a, a1 = b, a2 = c

add3:
    add  a0, a0, a1      # a0 = a + b
    add  a0, a0, a2      # a0 = (a + b) + c
    ret

Version B — one array argument. Only the base address arrives; we must lw each element out of memory (the sum3 example above). Same result, but now the memory category does the heavy lifting. The point: how data is passed determines which instructions you reach for.

Indexed Array Access (Variable Index)

When the index is not a constant — for example inside a loop — you compute the byte offset at run time. Assume a0 holds the base address and t0 holds the index i:

# t2 = arr[i]   where a0 = base, t0 = i
li   t1, 4           # element size = 4 bytes
mul  t1, t0, t1      # t1 = i * 4
add  t1, a0, t1      # t1 = base + i*4 = &arr[i]
lw   t2, (t1)        # t2 = arr[i]

A faster, idiomatic variant replaces the multiply with a shift left, since multiplying by 4 is the same as shifting left by 2 (4 == 2^2):

# t2 = arr[i]   using a shift instead of a multiply
slli t1, t0, 2       # t1 = i * 4   (shift left logical by 2)
add  t1, a0, t1      # t1 = &arr[i]
lw   t2, (t1)        # t2 = arr[i]

| slli shift | Multiplies by | Element size |
|------------|---------------|--------------|
| slli x, y, 1 | 2 | 2-byte (short) |
| slli x, y, 2 | 4 | 4-byte (int) |
| slli x, y, 3 | 8 | 8-byte (long, pointer) |
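
The shift-based sequence can be mirrored line by line in C. This sketch assumes a 4-byte int, as on the course machine; offset_by_shift and load_indexed are hypothetical helper names introduced for illustration.

```c
#include <stddef.h>

/* Byte offset of element i when each element is (1 << log2_size) bytes.
 * Mirrors: slli t1, t0, log2_size */
size_t offset_by_shift(size_t i, unsigned log2_size) {
    return i << log2_size;
}

/* Mirrors the slli / add / lw sequence for an array of 4-byte ints. */
int load_indexed(const int *base, size_t i) {
    size_t off = offset_by_shift(i, 2);            /* slli t1, t0, 2  */
    const unsigned char *addr =
        (const unsigned char *)base + off;         /* add  t1, a0, t1 */
    return *(const int *)addr;                     /* lw   t2, (t1)   */
}
```

For example, offset_by_shift(3, 2) gives 12, the same byte offset as 3 * 4, and offset_by_shift(5, 3) gives 40 for an array of 8-byte elements.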

5. Control Statements: Branches and Jumps

Two Kinds of Control Transfer

C control structures are built from just two assembly primitives:

  • Unconditional jump (j label): always transfers control to label.
  • Conditional branch (b__ rs1, rs2, label): transfers control to label only if a comparison of rs1 and rs2 holds; otherwise execution falls through to the next instruction.

A label names a location in the code so that a jump or branch has somewhere to go.

main:
    add  t0, zero, zero  # t0 = 0
    j    next            # unconditional jump -- skip the two adds
    add  ...             # (skipped)
    add  ...             # (skipped)
next:
    addi t0, zero, 1     # execution resumes here

The red arrow in the instructor's notes traced the jump from j next down to the next: label, skipping the instructions in between. That is the whole idea of a jump: it rewrites the program counter so the next instruction fetched is the one at the label, not the one physically below.

The Branch Instructions

A conditional branch compares two registers and branches if the condition is true. The mnemonic encodes the comparison:

| Instruction | Branch taken when | Comparison |
|-------------|-------------------|------------|
| beq rs1, rs2, label | rs1 == rs2 | equal |
| bne rs1, rs2, label | rs1 != rs2 | not equal |
| blt rs1, rs2, label | rs1 < rs2 | less than (signed) |
| bge rs1, rs2, label | rs1 >= rs2 | greater or equal (signed) |
| ble rs1, rs2, label | rs1 <= rs2 | less than or equal (signed) |
| bgt rs1, rs2, label | rs1 > rs2 | greater than (signed) |

In class we read ble aloud as "branch on less than or equal": ble a0, a1, else means "if a0 <= a1, go to else."

Signed comparisons and negative numbers

blt, bge, ble, and bgt interpret their operands as signed two's-complement integers, so they handle negative values correctly. (ble and bgt are assembler pseudo-instructions: the assembler emits bge and blt with the operands swapped.) There are unsigned variants (bltu, bgeu) for when bit patterns should be compared as unsigned magnitudes. For Lab 3's signed int data, the signed branches are what you want.
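
The difference between the signed and unsigned branches shows up as soon as one operand is negative. The sketch below models the two comparisons in C on 64-bit register values; blt_taken and bltu_taken are hypothetical names, not real library functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* blt: compare the two register values as signed integers. */
bool blt_taken(int64_t rs1, int64_t rs2) {
    return rs1 < rs2;
}

/* bltu: compare the same bit patterns as unsigned integers. */
bool bltu_taken(int64_t rs1, int64_t rs2) {
    return (uint64_t)rs1 < (uint64_t)rs2;
}
```

With rs1 = -1 and rs2 = 1, blt_taken returns true but bltu_taken returns false, because the bit pattern of -1 is all ones, which is the largest unsigned value.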


6. Translating If/Then/Else

The Pattern

The instructor worked this example on the board. In C:

int val, r;

if (val > 0) {
    r = 1;
} else {
    r = 0;
}

The assembly assumes a0 holds val and t1 holds r:

# a0 - int val
# t1 - int r

    ble  a0, zero, else      # if val <= 0, go to the else branch
    addi t1, zero, 1         # then-branch:  r = 1
    j    done                # skip over the else-branch
else:
    addi t1, zero, 0         # else-branch:  r = 0
done:

The Key Trick: Branch on the Opposite Condition

The C code says "if val > 0, do the then-block." The assembly branches on the opposite condition: ble a0, zero, else jumps away to the else-block when val <= 0. When the condition is true we simply fall through into the then-block.

This inversion is the heart of compiling if. We arrange the code so that:

  1. The branch skips the then-block when the condition is false.
  2. The then-block ends with an unconditional j done so it does not fall into the else-block.
  3. The else-block sits between the else: and done: labels.
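
| C condition (do then-block when true) | Branch to else when false |
|---------------------------------------|---------------------------|
| val > 0 | ble a0, zero, else |
| val >= 0 | blt a0, zero, else |
| a == b | bne a0, a1, else |
| a != b | beq a0, a1, else |
| a < b | bge a0, a1, else |
| a <= b | bgt a0, a1, else |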
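
C's goto lets us write the same branch-and-jump shape without leaving C, which can make the inversion easier to see. This is a sketch for illustration only; positive_flag is a hypothetical function name, and the comments show the assembly line each statement corresponds to.

```c
/* if (val > 0) r = 1; else r = 0;  written in branch-and-jump form. */
int positive_flag(int val) {
    int r;
    if (val <= 0) goto else_block;  /* ble  a0, zero, else  (inverted test) */
    r = 1;                          /* addi t1, zero, 1     then-block      */
    goto done;                      /* j    done            skip the else   */
else_block:
    r = 0;                          /* addi t1, zero, 0     else-block      */
done:
    return r;
}
```

Calling positive_flag(5) falls through the branch and returns 1, while positive_flag(0) and positive_flag(-3) take the branch and return 0.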

Control Flow Diagram

flowchart TD
    A["Evaluate condition (val > 0?)"] -->|"false: ble taken"| E["else block: r = 0"]
    A -->|"true: fall through"| T["then block: r = 1"]
    T --> J["j done"]
    J --> D["done:"]
    E --> D

    style T fill:#9f9,stroke:#333
    style E fill:#ff9,stroke:#333

A Second If Example

The notes also sketched a simple equality test:

if (x == 0) {
    y = 1;
}

With a0 holding x and t0 holding y:

# a0 - int x, t0 - int y
    bne  a0, zero, done      # if x != 0, skip the body
    li   t0, 1               # y = 1
done:

There is no else here, so there is no j — when the body finishes it simply falls through to done:. The branch bne carries the inverted condition: we skip the body whenever x != 0.


7. Translating Loops

The Loop Pattern

A loop is just an if whose body ends by jumping back up to re-test the condition. The instructor built the canonical pattern from a guard branch plus a back-edge jump.

The C code, a function that sums 0 + 1 + ... + (n-1):

int loopsum(int n) {
    int i;
    int sum = 0;
    for (i = 0; i < n; i++) {
        sum = sum + i;
    }
    return sum;
}

The assembly, with a0 holding n (the argument), t0 holding i, and t1 holding sum:

.global loopsum
# int loopsum(int n)
# a0 - int n
# t0 - int i
# t1 - int sum

loopsum:
    li   t0, 0           # i = 0
    li   t1, 0           # sum = 0
loop:
    bge  t0, a0, done    # if i >= n, exit the loop
    add  t1, t1, t0      # sum = sum + i
    addi t0, t0, 1       # i = i + 1
    j    loop            # jump back to re-test the condition
done:
    mv   a0, t1          # move sum into a0 (return value)
    ret

Anatomy of the Loop

    initialize counters         <- li t0, 0  /  li t1, 0
loop:
    GUARD: bge t0, a0, done     <- exit when i >= n  (inverted condition)
    body:  sum = sum + i        <- the work
    update: i = i + 1           <- advance the counter
    back-edge: j loop           <- repeat
done:
    return sum

Mapping the C for clauses to assembly:

| for clause | C | Assembly |
|------------|---|----------|
| Initialization | i = 0 | li t0, 0 (before loop:) |
| Condition | i < n | bge t0, a0, done (inverted: exit when i >= n) |
| Body | sum = sum + i | add t1, t1, t0 |
| Update | i++ | addi t0, t0, 1 |
| Repeat | (implicit) | j loop |

Just as with if, the guard branches on the opposite of the loop condition: the C loop continues while i < n, so the assembly exits when i >= n (bge).
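
The same guard-plus-back-edge structure can be written in C with goto, one statement per assembly instruction. This is a sketch under the register assignment used above (a0 = n, t0 = i, t1 = sum); loopsum_goto is a hypothetical name chosen so it does not clash with the lecture's loopsum.

```c
/* loopsum rewritten so that each statement matches one assembly line. */
int loopsum_goto(int n) {
    int i = 0;                   /* li   t0, 0         */
    int sum = 0;                 /* li   t1, 0         */
loop:
    if (i >= n) goto done;       /* bge  t0, a0, done  (guard, inverted) */
    sum = sum + i;               /* add  t1, t1, t0    (body)            */
    i = i + 1;                   /* addi t0, t0, 1     (update)          */
    goto loop;                   /* j    loop          (back-edge)       */
done:
    return sum;                  /* mv   a0, t1 ; ret  */
}
```

For n = 4 the body runs for i = 0, 1, 2, 3 and the function returns 6. For n = 0 the guard exits immediately and the function returns 0.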

Loop Control Flow Diagram

flowchart TD
    Init["i = 0, sum = 0"] --> Guard{"i < n?"}
    Guard -->|"yes"| Body["sum = sum + i"]
    Body --> Update["i = i + 1"]
    Update --> Guard
    Guard -->|"no"| Done["return sum"]

    style Body fill:#9f9,stroke:#333
    style Done fill:#9ff,stroke:#333

mv and li Are Pseudo-Instructions

The epilogue used mv a0, t1 to copy sum into the return register, and li to load constants. These are convenience pseudo-instructions the assembler expands into real instructions:

  • li t0, 0 becomes addi t0, zero, 0
  • mv a0, t1 becomes addi a0, t1, 0 (or add a0, t1, zero)
  • j loop becomes jal zero, loop

They make code far more readable while still mapping to the base instruction set.

Verifying in the Debugger

In class we ran the loop under gdb, single-stepping to watch t0 (i) and t1 (sum) change each iteration. For loopsum(4) the trace is:

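| Iteration | t0 (i) before guard | Guard i >= 4? | t1 (sum) after body |
|-----------|---------------------|---------------|---------------------|
| 1 | 0 | no | 0 + 0 = 0 |
| 2 | 1 | no | 0 + 1 = 1 |
| 3 | 2 | no | 1 + 2 = 3 |
| 4 | 3 | no | 3 + 3 = 6 |
| exit | 4 | yes → done | 6 (returned in a0) |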

A handy gdb workflow:

# Build with debug info, then step through the function
riscv64-linux-gnu-gcc -g -static -o loopsum main.c loopsum_s.s
gdb ./loopsum
(gdb) break loopsum
(gdb) run 4
(gdb) stepi              # step one instruction at a time
(gdb) info registers t0 t1 a0   # inspect i, sum, return value

8. Putting It Together: Find the Maximum

The Lab 3 / Project 2 findmax problem combines everything in this lecture: an array argument, indexed memory access in a loop, and a conditional inside the body. Here is the array-traversal version (a single function, no calls).

int findmax(int *arr, int n) {
    int max = arr[0];
    int i;
    for (i = 1; i < n; i++) {
        if (arr[i] > max) {
            max = arr[i];
        }
    }
    return max;
}

.global findmax
# int findmax(int *arr, int n)
# a0 - int *arr (base address)
# a1 - int n
# t0 - int i
# t1 - int max
# t2 - &arr[i] / arr[i]

findmax:
    lw   t1, (a0)        # max = arr[0]
    li   t0, 1           # i = 1
loop:
    bge  t0, a1, done    # if i >= n, exit
    slli t2, t0, 2       # t2 = i * 4
    add  t2, a0, t2      # t2 = &arr[i]
    lw   t2, (t2)        # t2 = arr[i]
    ble  t2, t1, skip    # if arr[i] <= max, skip update
    mv   t1, t2          # max = arr[i]
skip:
    addi t0, t0, 1       # i++
    j    loop
done:
    mv   a0, t1          # return max
    ret

Notice the two inverted conditions working together: the loop guard exits when i >= n (bge), and the inner if skips the update when arr[i] <= max (ble). The Project 2 findmaxfc variant performs the same comparison by calling a max2_s helper instead of branching inline — which requires the stack discipline covered next session.
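
To see how the two inverted conditions nest, it can help to flatten the C version into the same label-and-goto shape as the assembly. This is an illustrative sketch, not the Lab 3 solution; findmax_goto and the local elem are hypothetical names, and like the original it assumes n is at least 1.

```c
/* findmax flattened into branch-and-jump form; assumes n >= 1. */
int findmax_goto(const int *arr, int n) {
    int max = arr[0];               /* lw   t1, (a0)      */
    int i = 1;                      /* li   t0, 1         */
    int elem;
loop:
    if (i >= n) goto done;          /* bge  t0, a1, done  (loop guard)   */
    elem = arr[i];                  /* slli / add / lw    (indexed load) */
    if (elem <= max) goto skip;     /* ble  t2, t1, skip  (inner if)     */
    max = elem;                     /* mv   t1, t2        */
skip:
    i = i + 1;                      /* addi t0, t0, 1     */
    goto loop;                      /* j    loop          */
done:
    return max;                     /* mv   a0, t1 ; ret  */
}
```

For the array {3, -1, 9, 9, 2} with n = 5 the function returns 9. The second 9 does not trigger an update because the inner branch skips when arr[i] <= max.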


Key Concepts

| Concept | Definition | Example |
|---------|------------|---------|
| Data processing instruction | Computes a value from registers | add a0, a1, a2 |
| Control instruction | Changes which instruction runs next | j loop; ble a0, a1, else |
| Memory instruction | Moves data between registers and memory | lw t0, (a0); sw t0, 4(a0) |
| Base address | Address of an array's first element | arr in int arr[3] |
| offset(base) syntax | Effective address = base + offset | lw t1, 4(a0) reads base + 4 |
| Load word (lw) | Read a 32-bit value from memory into a register | lw t0, (a0) is t0 = *a0 |
| Argument registers | a0–a7, hold function inputs | a0 = arr, a1 = n |
| Return register | a0, holds the function result | mv a0, t1 before ret |
| Label | A name for a code location | loop:, done: |
| Unconditional jump | Always transfers control | j loop |
| Conditional branch | Transfers control if a comparison holds | bge t0, a0, done |
| Condition inversion | Branch on the opposite condition to skip a block | if (x > 0) → ble x, zero, else |
| slli for indexing | Shift left to multiply index by element size | slli t1, t0, 2 is t0 * 4 |
| Pseudo-instruction | Assembler convenience expanded to real ops | mv, li, j |

Practice Problems

Problem 1: Read the Fourth Element

The base address of an int array is in a0. Write the single instruction that loads arr[3] into t0 using a constant offset, and explain the offset you chose.

Solution
lw   t0, 12(a0)      # t0 = arr[3]
Each `int` is 4 bytes, so element 3 is at byte offset `3 * 4 = 12` from the base. The `offset(base)` syntax computes the effective address `a0 + 12` and loads the 32-bit word there.

Problem 2: Indexed Access with a Variable

The base address is in a0 and the index i is in a1. Write assembly that loads arr[i] into t0 using a shift (not a multiply).

Solution
slli t1, a1, 2       # t1 = i * 4   (shift left by 2)
add  t1, a0, t1      # t1 = &arr[i] = base + i*4
lw   t0, (t1)        # t0 = arr[i]
Shifting left by 2 multiplies by `2^2 = 4`, the size of an `int`. Adding that byte offset to the base produces the element address, and `lw` dereferences it.

Problem 3: Translate an If/Else

Translate this C code to RISC-V assembly. Assume a0 holds x and t0 holds r.

if (x < 10) {
    r = 100;
} else {
    r = 200;
}

Solution
# a0 - int x, t0 - int r
    li   t1, 10
    bge  a0, t1, else        # if x >= 10, go to else (inverted condition)
    li   t0, 100             # then: r = 100
    j    done
else:
    li   t0, 200             # else: r = 200
done:
The C `if` runs the then-block when `x < 10`, so the assembly branches to `else` on the opposite condition, `x >= 10` (`bge`). Because `bge` compares two registers, we first load the constant `10` into `t1`. The then-block ends with `j done` so it does not fall through into the else-block.

Problem 4: Trace a Loop

Hand-trace the following for the call loopsum(3). Show the value of i and sum at the end of each iteration and the final returned value.

loopsum:
    li   t0, 0           # i
    li   t1, 0           # sum
loop:
    bge  t0, a0, done    # a0 = n
    add  t1, t1, t0
    addi t0, t0, 1
    j    loop
done:
    mv   a0, t1
    ret

Solution
With `a0 = n = 3`:

| Iteration | `i` at guard | `i >= 3`? | `sum` after body | `i` after update |
|-----------|--------------|-----------|------------------|------------------|
| 1 | 0 | no | 0 + 0 = 0 | 1 |
| 2 | 1 | no | 0 + 1 = 1 | 2 |
| 3 | 2 | no | 1 + 2 = 3 | 3 |
| exit | 3 | yes → done | 3 | n/a |

The returned value in `a0` is **3** (which is `0 + 1 + 2`). The body never runs for `i = 3` because the guard `bge t0, a0, done` exits first.

Problem 5: Sum an Array

Write a complete leaf function sumarr(int *arr, int n) that returns the sum of all n elements. Use a0 for the base, a1 for n, and the t registers for locals.

Solution
.global sumarr
# int sumarr(int *arr, int n)
# a0 - int *arr, a1 - int n
# t0 - i, t1 - sum, t2 - &arr[i] / arr[i]

sumarr:
    li   t0, 0           # i = 0
    li   t1, 0           # sum = 0
loop:
    bge  t0, a1, done    # if i >= n, exit
    slli t2, t0, 2       # t2 = i * 4
    add  t2, a0, t2      # t2 = &arr[i]
    lw   t2, (t2)        # t2 = arr[i]
    add  t1, t1, t2      # sum += arr[i]
    addi t0, t0, 1       # i++
    j    loop
done:
    mv   a0, t1          # return sum
    ret
This is the array-traversal template: a loop guard on `i >= n`, indexed load with `slli`/`add`/`lw`, an accumulate step in the body, a counter update, and a back-edge `j loop`. The accumulated `sum` is moved to `a0` before `ret`.

Problem 6: Why Invert the Condition?

A student writes the following for if (a0 > 0) { t0 = 1; }. Explain what is wrong and fix it.

    bgt  a0, zero, body
body:
    li   t0, 1
done:

Solution
The branch jumps to `body` only when `a0 > 0`, but `body` is the very next instruction, so the branch does nothing useful. Worse, when `a0 <= 0` execution **falls through** into `body` anyway and still sets `t0 = 1`. The condition was not inverted. To compile `if`, branch on the **opposite** condition to *skip* the body:
    ble  a0, zero, done      # if a0 <= 0, skip the body
    li   t0, 1               # body: t0 = 1
done:
Now the body runs only when `a0 > 0`, exactly matching the C `if`.

Further Reading

  • RISC-V ISA Specification — the official standard
  • RISC-V Assembly Programmer's Manual — practical reference for instructions and pseudo-instructions
  • The RISC-V Reader — Patterson and Waterman, a concise textbook
  • RISC-V references and cheat sheet: /guides/riscv/
  • Course key concepts index: /guides/key-concepts/
  • Project 2 (RISC-V Assembly Language): /assignments/project02/
  • Source lecture notes (PDF): "/notes/CS315-01 2025-09-04 RISC-V Assembly 2.pdf"

Summary

  1. Every instruction fits one of three categories — data processing (compute), control (change the next instruction), and memory (move data between registers and memory). Computation follows a load → compute → store rhythm.

  2. Arrays are contiguous memory, and an array argument is passed as its base address. Element i of an int array lives at base + i * 4, because int is 4 bytes and memory is byte-addressable.

  3. lw loads a word from memory using the offset(base) syntax: lw t0, 4(a0) reads the 32-bit value at a0 + 4. Loads name the destination first; stores (sw) name the source first.

  4. Arguments arrive in a0–a7 and results return in a0. For Lab 3, code uses only a and t registers because the functions do not call other functions, so no stack management is needed.

  5. Control flow is built from unconditional jumps (j) and conditional branches (beq, bne, blt, bge, ble, bgt), with labels naming jump targets.

  6. Compiling if/else relies on condition inversion: branch on the opposite of the C condition to skip the then-block, end the then-block with j done, and place the else-block before done:.

  7. Loops are an if with a back-edge: initialize, test the inverted condition with a guard branch, run the body, update the counter, and j back to the guard label.

  8. Verify by tracing — by hand with a register table or in gdb with stepi — to confirm the loop counter, accumulator, and return value evolve as expected.