
RISC-V Assembly Part 6: Byte Order, Two's Complement, and Strings

Overview

This lecture connects high-level C programs to their underlying memory representation. We treat memory as a byte-addressable array, examine how a multi-byte integer is laid out in memory (endianness), and explain why almost every modern machine uses two's complement to represent signed integers. We then look at how to convert between positive and negative values, how to widen a value with sign extension and narrow it with truncation, and finish with C strings as null-terminated arrays of character bytes and the RISC-V load/store instructions used to walk them one byte at a time. These ideas are the foundation for Project 3, where you manipulate strings and structs directly in RISC-V assembly.

Learning Objectives

  • Describe the memory model used by a processor: registers, the program counter, and RAM partitioned into stack, heap, data, and code
  • Explain that memory is a byte-addressable array and that a byte is 8 bits
  • Distinguish big-endian from little-endian byte ordering and identify which order RISC-V uses
  • Compare sign-magnitude, one's complement, and two's complement, and explain why two's complement won
  • Convert a positive value to its negative two's complement representation (and back) using invert(v) + 1
  • Widen a value to more bits with sign extension and narrow it with truncation
  • Represent C strings as null-terminated byte arrays and access individual characters with lb/sb
  • Use the correct load width (lb, lw, ld) for the data type you are accessing

Prerequisites

  • RISC-V registers, instructions, and the fetch/decode/execute cycle (Assembly Parts 1–5)
  • The RISC-V calling convention: argument registers a0–a7, return value in a0, the stack pointer sp
  • Memory instructions: load and store with an optional offset, e.g. lw a1, 0(a2)
  • Binary and hexadecimal number systems and base conversion in C
  • C pointers: & (address-of), * (dereference), pointer arithmetic, and type casts

1. The Memory Model: Processor and RAM

Before we can talk about how data is laid out, we need a clear picture of the two halves of a computer and how they communicate.

flowchart LR
    subgraph CPU["Processor (CPU)"]
        REGS["Registers<br/>(x0–x31)"]
        PC["PC<br/>(program counter)"]
        EXEC["Execute<br/>instructions"]
    end

    subgraph RAM["Memory (RAM)"]
        STACK["STACK"]
        HEAP["HEAP"]
        DATA["DATA"]
        CODE["CODE"]
    end

    REGS -- "store" --> RAM
    RAM -- "load" --> REGS
    PC -- "fetch instruction" --> CODE

The processor can only compute on values that are held in its registers. Memory (RAM) is where both the program's code (machine instructions) and its data live. Two activities cross the boundary between CPU and memory:

  • Load / store: data moves between registers and memory. A load copies bytes from memory into a register; a store copies bytes from a register out to memory.
  • Instruction fetch: the program counter (PC) holds the address of the next instruction. The processor reads the instruction at PC out of the code region, decodes it, executes it, and then advances PC (usually PC + 4 for the next instruction, or a branch/jump target).

RAM is conventionally drawn as a single tall column partitioned into regions. From high addresses down to low addresses:

| Region | What it holds | Grows |
|--------|---------------|-------|
| Stack | Function call frames, local variables, saved registers | Downward (toward lower addresses) |
| Heap | Dynamically allocated memory (malloc) | Upward (toward higher addresses) |
| Data | Global and static variables | Fixed |
| Code | The machine instructions of your program | Fixed |

The stack grows down and the heap grows up so that they can share the large gap of unused addresses between them. The key takeaway for this lecture is the layer below all of this: regardless of region, memory is just a long array of bytes.


2. Memory Is a Byte-Addressable Array

The smallest unit of memory that has its own address is the byte (8 bits). We say memory is byte addressable: every byte has a unique numeric address, starting at 0 and counting up. You can picture memory as one giant uint8_t array:

addr:  ...  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
byte:  |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
                                    ^
                                  1 byte = 8 bits

A single byte can only represent 256 distinct values (0x00–0xFF). Most data we care about is larger than a byte, so larger values must occupy several consecutive bytes:

| C type | Bits | Bytes | RISC-V load |
|--------|------|-------|-------------|
| char / uint8_t | 8 | 1 | lb / lbu |
| short | 16 | 2 | lh / lhu |
| int / uint32_t | 32 | 4 | lw / lwu |
| long / pointer / uint64_t | 64 | 8 | ld |

This immediately raises a question: if an int is 4 bytes and memory addresses individual bytes, in what order do those 4 bytes get stored? That is the problem of byte ordering, or endianness.


3. Endianness: Big-Endian vs Little-Endian

Consider this declaration:

int x = 0xFFAA1122;

The 32-bit value 0xFFAA1122 is made of four bytes. From most significant to least significant they are:

0xFF  0xAA  0x11  0x22
 |     |     |     |
MSB                LSB

The value occupies 4 consecutive byte addresses, but there are two reasonable conventions for which byte goes at the lowest address. Assume &x points at address 8 and the four bytes occupy addresses 8, 9, 10, 11.

Big-endian stores the most significant byte first (at the lowest address):

addr   big-endian
 11  |  22 |   <- LSB
 10  |  11 |
  9  |  AA |
  8  |  FF |   <- MSB, this is &x

Little-endian stores the least significant byte first (at the lowest address):

addr   little-endian
 11  |  FF |   <- MSB
 10  |  AA |
  9  |  11 |
  8  |  22 |   <- LSB, this is &x

Putting them side by side, reading from low address (&x) upward:

| Address | Big-endian | Little-endian |
|---------|------------|---------------|
| &x + 0 (8) | 0xFF (MSB) | 0x22 (LSB) |
| &x + 1 (9) | 0xAA | 0x11 |
| &x + 2 (10) | 0x11 | 0xAA |
| &x + 3 (11) | 0x22 (LSB) | 0xFF (MSB) |

flowchart LR
    subgraph V["int x = 0xFFAA1122"]
        direction TB
        msb["MSB 0xFF | 0xAA | 0x11 | 0x22 LSB"]
    end
    V --> BE["Big-endian:<br/>FF at &x,<br/>then AA, 11, 22"]
    V --> LE["Little-endian:<br/>22 at &x,<br/>then 11, AA, FF"]

Two important facts:

  • RISC-V (and x86) are little-endian. The byte at the lowest address is the least significant byte.
  • There is no universal standard. Different architectures chose differently. The names come from Gulliver's Travels, in which the Lilliputians went to war over which end of a soft-boiled egg to crack: the big end or the little end. The choice is arbitrary, but everyone on a given machine must agree.

Why Endianness Matters

Inside a single machine, endianness is invisible: you store an int and load it back and get the same value. It becomes visible in two situations:

  1. Byte-level access. If you cast an int * to a char * and read byte 0, which byte you get depends on endianness (see "Inspecting Bytes in C" below).
  2. Networking. When two machines exchange raw bytes, a little-endian sender and a big-endian receiver will disagree on what a multi-byte field means. Network protocols therefore define a canonical network byte order (big-endian), and code converts to/from host order. TCP/IP handles reliable, in-order delivery of the bytes; agreeing on their interpretation is a separate, application-level concern.

Inspecting Bytes in C

We can prove the machine is little-endian by reading the first byte of an int:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int x = 0xFFAA1122;
    uint8_t *p = (uint8_t *)&x;   // treat the int as an array of bytes

    printf("p[0] = 0x%02X\n", p[0]);   // 0x22 on little-endian
    printf("p[1] = 0x%02X\n", p[1]);   // 0x11
    printf("p[2] = 0x%02X\n", p[2]);   // 0xAA
    printf("p[3] = 0x%02X\n", p[3]);   // 0xFF

    if (p[0] == 0x22)
        printf("little-endian\n");
    else
        printf("big-endian\n");
    return 0;
}

Note how the cast works: &x is an int * (the address of a 4-byte value). Casting it to uint8_t * reinterprets that same address as the start of a byte array. The pointer value (the address) does not change; only the number of bytes read on each dereference does. That is the central idea of pointer casting: a pointer always holds an address (64 bits wide on a 64-bit machine such as RV64), and the pointed-to type decides the access size.

You can confirm the same thing in GDB by examining memory:

(gdb) p &x
$1 = (int *) 0x7fffffffe45c
(gdb) x/4xb &x
0x7fffffffe45c: 0x22  0x11  0xaa  0xff

x/4xb means "examine 4 values, in hex, byte-sized." The bytes come out in little-endian order.


4. Binary Representation of Signed Integers

Unsigned binary is straightforward: 0b1010 is 8 + 2 = 10. But how do we represent negative numbers when all we have are 0s and 1s? In C, integer types are signed by default (int means signed int), and the representation that almost every machine uses is two's complement.

To compare the candidates, here is the full table of 4-bit patterns under three schemes. (With 4 bits there are 16 patterns.)

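| Unsigned (decimal) | Binary | Sign-magnitude | Two's complement |
|--------------------|--------|----------------|------------------|
| 0 | 0000 | 0 | 0 |
| 1 | 0001 | 1 | 1 |
| 2 | 0010 | 2 | 2 |
| 3 | 0011 | 3 | 3 |
| 4 | 0100 | 4 | 4 |
| 5 | 0101 | 5 | 5 |
| 6 | 0110 | 6 | 6 |
| 7 | 0111 | 7 | 7 |
| 8 | 1000 | -0 | -8 |
| 9 | 1001 | -1 | -7 |
| 10 | 1010 | -2 | -6 |
| 11 | 1011 | -3 | -5 |
| 12 | 1100 | -4 | -4 |
| 13 | 1101 | -5 | -3 |
| 14 | 1110 | -6 | -2 |
| 15 | 1111 | -7 | -1 |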

Sign-Magnitude

The simplest idea: use the most significant bit as a sign flag (0 = positive, 1 = negative) and the remaining bits as the magnitude. So 0101 is +5 and 1101 is -5.

Sign-magnitude has two fatal problems:

  1. Two zeros. 0000 is +0 and 1000 is -0. This wastes a pattern, and equality checks (x == 0) become awkward.
  2. Arithmetic does not "just work." Adding a positive and a negative number with plain binary addition gives the wrong answer. Watch:
     0 1 0 1   (+5)
  +  1 0 1 1   (-3 in sign-magnitude)
  ---------
   1 0 0 0 0   = 0 (after dropping carry)   WRONG: 5 + (-3) should be 2

The hardware would need special-case logic (or lookup tables) to do signed addition. That does not scale.

Two's Complement

Two's complement keeps the MSB as a sign indicator (1 means negative) but assigns the negative values cleverly so that ordinary binary addition produces correct results. With the same example:

        0 1 0 1   (+5)
     +  1 1 0 1   (-3 in two's complement)
     ---------
      1 0 0 1 0   drop the carry out of the top:
        0 0 1 0   = 2          CORRECT: 5 + (-3) = 2

The same grade-school add-with-carry circuit handles both positive and negative numbers. Two's complement wins because:

  • Only one representation of zero (0000). This is why the negative range extends one further than the positive range: 4 bits give -8 .. +7, not -7 .. +7.
  • The MSB is still the sign bit (0 = non-negative, 1 = negative), so checking the sign is one bit test.
  • Addition and subtraction use the same hardware as unsigned, with no special cases and no lookup tables, and the low-order bits of a product are also identical for signed and unsigned operands. This scales to any bit width.

This is why "the binary representation of an integer", when the integer is signed, means two's complement on essentially every modern computer.


5. Converting Between Positive and Negative

The mechanical recipe to negate a two's complement value is:

negate(v) = invert(v) + 1

That is: flip every bit (the bitwise NOT / one's complement), then add 1.

Positive to Negative: 3 to -3 (4 bits)

  3  =  0011
         |
  invert:  1100      (flip every bit)
  + 1:     1100 + 1 = 1101
         |
 -3  =  1101

So -3 in 4-bit two's complement is 1101. Check against the table above — row 13 (1101) is indeed -3.

Negative to Positive: -3 back to 3

The beauty of two's complement is that the same operation (invert + 1) converts in the other direction too:

 -3  =  1101
         |
  invert:  0010
  + 1:     0010 + 1 = 0011
         |
  3  =  0011

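invert(v) + 1 is its own inverse, so you only ever need to learn one procedure. (The one exception is the most negative value, 1000 = -8 in 4 bits, which has no positive counterpart and negates to itself.)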

In C

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int8_t v = 3;
    int8_t neg = ~v + 1;          // invert bits, add one
    printf("%d\n", neg);          // -3
    printf("0x%02X\n", (uint8_t)neg);   // 0xFD  (1111 1101 in 8 bits)

    // The compiler does the same thing for unary minus:
    printf("%d\n", -v);           // -3
    return 0;
}

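The unary - operator in C computes the same result as this invert + 1 operation. (On RISC-V the compiler typically emits neg, a pseudo-instruction for sub rd, zero, rs, which produces the identical bit pattern.)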


6. Changing Bit Width: Sign Extension and Truncation

A value is stored in some fixed number of bits, but we often need to move it into a wider or narrower container — for example, loading a 4-bit or 8-bit value into a 64-bit register. The rule is: going from n bits to m bits where m > n, we must preserve both the value and its sign.

Widening: Sign Extension

To widen a two's complement value, copy the original sign bit (the MSB) into all the new high bits. This is sign extension.

Negative example — 1101 (-3) from 4 bits to 8 bits:

The sign bit is 1, so fill the new upper bits with 1:

   4 bits:        1 1 0 1            (-3)
                  ^ sign bit = 1
   sign-extend ->
   8 bits:    1 1 1 1 1 1 0 1        (-3)

You can verify this is still -3: invert 11111101 to 00000010, add 1, get 00000011 = 3, so the original was -3. The value is preserved.

Positive example — 0011 (3) from 4 bits to 8 bits:

The sign bit is 0, so fill the new upper bits with 0:

   4 bits:        0 0 1 1            (3)
                  ^ sign bit = 0
   sign-extend ->
   8 bits:    0 0 0 0 0 0 1 1        (3)

Visually, sign extension "drags" the top bit leftward across all the new positions:

   (1) 1 0 1          <- top bit replicated...
    | | | |
    v v v v
  1 1 1 1 1 1 0 1     (-3, 8 bits)

Equivalently, the same value -3 shown at 4, 8, and 64 bits:

| Width | Bits (-3) | Hex |
|-------|-----------|-----|
| 4 | 1101 | 0xD |
| 8 | 1111 1101 | 0xFD |
| 64 | 1111…1111 1101 | 0xFFFFFFFFFFFFFFFD |

This is exactly why Project 3's unstruct prints -99 as 0xFFFFFFFFFFFFFF9D: the negative value has been sign-extended to 64 bits, filling the top with Fs.

Important: Sign extension is only correct for signed values. For an unsigned value you zero-extend (fill the new high bits with 0). RISC-V provides both: lb sign-extends a loaded byte, while lbu zero-extends it.

Sign Extension by Shifting

A common trick to sign-extend a sub-word value that is sitting in the low bits of a register is "shift left all the way, then arithmetic-shift right all the way." The arithmetic right shift (sra / srai) replicates the sign bit as it shifts:

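int32_t v = 0b1110;                       // we *mean* -2 as a 4-bit value (0b literals: GCC/Clang extension, standard in C23)
v = (int32_t)((uint32_t)v << 28) >> 28;   // shift left as unsigned (avoids signed-overflow UB), then SRA back
// v is now -2 (0xFFFFFFFE), assuming >> on a negative int is an arithmetic shift (true for GCC and Clang)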

In assembly, the right shift must be the arithmetic form (srai), not the logical form (srli), or you would zero-fill and get the wrong answer for negatives.

Narrowing: Truncation

Going the other way — from more bits to fewer — you simply keep the low bits and discard the high bits. This is truncation.

int32_t big = -3;            // 0xFFFFFFFD
int8_t  small = (int8_t)big; // keep low 8 bits: 0xFD = -3   (value survives)

int32_t huge = 300;          // 0x0000012C
int8_t  tiny = (int8_t)huge; // keep low 8 bits: 0x2C = 44   (value LOST!)

Truncation is safe only when the value actually fits in the narrower type. -3 fits in 8 bits, so it survives. 300 does not fit in a signed 8-bit range (-128 .. 127), so the result wraps to 44. Always make sure a value fits before narrowing.


7. Strings: Arrays of Character Bytes

A C string is an array of characters (bytes) terminated by a special null byte '\0' (the value 0). There is no separate length field — the terminating zero is how code knows where the string ends.

Consider:

char *s = "foo";

s is a pointer to the first character. The string "foo" occupies four bytes in memory (three letters plus the terminator):

            address       byte value (hex / char)
            grows up
          +-----------+
   s[3]   |  '\0'  0  |   <- terminator, value 0
          +-----------+
   s[2]   |  'o'  6F  |
          +-----------+
   s[1]   |  'o'  6F  |
          +-----------+
   s[0]   |  'f'  66  |   <- s points here
          +-----------+

Each character is one byte holding its ASCII code ('f' = 0x66 = 102, 'o' = 0x6F = 111). The final s[3] holds 0, which is not the digit '0' (that would be 0x30) — it is the integer zero that marks end-of-string. Walking a string means starting at s and reading bytes until you hit the zero.

Choosing the Right Load Width

When you access memory in assembly, you must use the load instruction that matches the size of the thing you are reading. The handwritten notes list the three common widths:

| Instruction | Width | Reads | Typical C type |
|-------------|-------|-------|----------------|
| lw | 32 bits | a word (4 bytes) | int, uint32_t |
| ld | 64 bits | a doubleword (8 bytes) | long, pointer, uint64_t |
| lb | 8 bits | one byte (sign-extended) | char, a single string character |

For strings, characters are bytes, so we use lb to read a character and sb to write one.

# a0 points at a C string; load the first character
lb t0, 0(a0)        # t0 = s[0], sign-extended to 64 bits

lb Touches Only the Low 8 Bits

When you load or store a byte, only the low 8 bits of the register are involved. The handwritten note "lb t0, (a0) ... lower 8 bits set to byte value" captures this:

  • lb t0, (a0) reads one byte from memory and places it in the low 8 bits of t0, then sign-extends it into bits 8–63.
  • sb t0, (a0) writes only the low 8 bits of t0 out to one byte of memory; the upper 56 bits of t0 are ignored.
   register t0 (64 bits)
   +----------------------------------+--------+
   |  sign-extended upper 56 bits     | byte 0 |   <- lb writes here / sb reads here
   +----------------------------------+--------+
                                       \______/
                                       lower 8 bits = byte value

Use lbu instead of lb when the byte should be treated as unsigned (0–255) so that it is zero-extended rather than sign-extended.

Iterating a String in C

The canonical "string length" loop walks the pointer until it finds the null terminator:

int my_strlen(char *s) {
    int len = 0;
    while (*s != '\0') {   // stop at the null byte
        len++;
        s++;               // advance one byte (char is 1 byte)
    }
    return len;
}

Iterating a String in RISC-V Assembly

The same logic in RISC-V. We load one byte at a time with lb, stop when it is zero, and otherwise bump the count and advance the pointer by 1 (because each character is one byte):

# int my_strlen(char *s)
#   a0 = s (pointer to string)
# returns length in a0
.global my_strlen
my_strlen:
    li   t0, 0            # t0 = len = 0
strlen_loop:
    lb   t1, 0(a0)        # t1 = *s (current character byte)
    beq  t1, zero, strlen_done   # if byte == '\0', stop
    addi t0, t0, 1        # len++
    addi a0, a0, 1        # s++  (advance one byte)
    j    strlen_loop
strlen_done:
    mv   a0, t0           # return value goes in a0
    ret

Two details worth highlighting, both raised in lecture:

  • Clear loop labels. Naming the loop (strlen_loop) and its exit (strlen_done) makes the control flow readable.
  • Where the return value goes. By convention the result is returned in a0, so we copy the computed length there before ret.

String Copy in Assembly

Copying a string is the same idea with both a load and a store each iteration. Using index-based access (i in t2):

void my_strcpy(char *dst, char *src) {
    int i = 0;
    do {
        dst[i] = src[i];     // copy a byte (including the final '\0')
    } while (src[i++] != '\0');
}

The same loop in RISC-V assembly:

# void my_strcpy(char *dst, char *src)
#   a0 = dst, a1 = src
.global my_strcpy
my_strcpy:
    li   t2, 0            # i = 0
strcpy_loop:
    add  t3, a1, t2       # t3 = &src[i]
    lb   t1, 0(t3)        # t1 = src[i]
    add  t4, a0, t2       # t4 = &dst[i]
    sb   t1, 0(t4)        # dst[i] = src[i]
    beq  t1, zero, strcpy_done   # stop AFTER copying the '\0'
    addi t2, t2, 1        # i++
    j    strcpy_loop
strcpy_done:
    ret

Because each element is one byte, the index offset is the byte offset — there is no * 4 scaling as there would be for an int array. Copying the null terminator before exiting is essential, otherwise the destination would not be a valid C string.

Calling C Library Functions from Assembly

You do not always have to reimplement string routines. Project 3 notes that you can call the C library directly from assembly: declare the symbol .global, set the argument registers, and call it. Remember to save any caller-saved registers you still need across the call, including ra, which call overwrites with the return address:

.global strlen
# ...
mv   a0, s0          # a0 = pointer to string
call strlen          # a0 = strlen(s0)

Key Concepts

| Concept | Definition | Example |
|---------|------------|---------|
| Byte addressable | Every byte in memory has its own unique address | &x, &x + 1, &x + 2, … |
| Byte | The smallest addressable unit; 8 bits | 0x00–0xFF |
| Endianness | The order in which the bytes of a multi-byte value are stored | RISC-V is little-endian |
| Big-endian | Most significant byte at the lowest address | 0xFFAA1122 → FF AA 11 22 |
| Little-endian | Least significant byte at the lowest address | 0xFFAA1122 → 22 11 AA FF |
| Sign-magnitude | MSB is sign, rest is magnitude; has two zeros and broken arithmetic | 1101 = -5 |
| Two's complement | Signed scheme where ordinary binary addition is correct | 1101 (4-bit) = -3 |
| Negate | invert(v) + 1; its own inverse | 0011 → 1101 (3 → -3) |
| Sign extension | Widening by replicating the sign bit into new high bits | 1101 (-3, 4b) → 1111 1101 (-3, 8b) |
| Zero extension | Widening an unsigned value by filling new high bits with 0 | lbu of 0xFD → 0x00000000000000FD |
| Truncation | Narrowing by keeping the low bits, discarding the high bits | (int8_t)0xFFFFFFFD = -3 |
| C string | Null-terminated array of character bytes | "foo" = 'f' 'o' 'o' '\0' |
| lb / sb | Load/store a single byte (low 8 bits of a register) | lb t0, 0(a0) reads s[0] |

Practice Problems

Problem 1: Byte Layout

The 32-bit value int y = 0x12345678; is stored at address 0x2000. Write out the byte at each of 0x2000, 0x2001, 0x2002, 0x2003 under both big-endian and little-endian, and state which one RISC-V uses.

Solution: The four bytes from MSB to LSB are `0x12 0x34 0x56 0x78`.

| Address | Big-endian | Little-endian |
|---------|-----------|---------------|
| `0x2000` | `0x12` (MSB) | `0x78` (LSB) |
| `0x2001` | `0x34` | `0x56` |
| `0x2002` | `0x56` | `0x34` |
| `0x2003` | `0x78` (LSB) | `0x12` (MSB) |

RISC-V is **little-endian**, so reading a byte at `0x2000` (the lowest address) yields `0x78`, the least significant byte.

Problem 2: Negate a Value

Compute the 8-bit two's complement representation of -5. Then verify your answer by negating it back to +5.

Solution: Start with `+5 = 0000 0101`.
invert:  1111 1010
+ 1:     1111 1010 + 1 = 1111 1011
So `-5 = 1111 1011 = 0xFB`. Verify by negating again:
invert:  0000 0100
+ 1:     0000 0100 + 1 = 0000 0101 = 5
We get `+5` back, confirming `invert(v) + 1` is its own inverse.

Problem 3: Sign Extension

The 4-bit two's complement value 1010 is to be widened to 8 bits. What is the 8-bit pattern, and what decimal value does it represent? What if 1010 were an unsigned 4-bit value loaded with lbu-style zero extension?

Solution: As a **signed** 4-bit value, the sign bit of `1010` is `1`, so we sign-extend with `1`s:
1010  ->  1111 1010
Decimal value: invert `1111 1010` → `0000 0101`, add 1 → `0000 0110 = 6`, so the value is `-6`. (Check the 4-bit table: `1010` = `-6`.) Sign extension preserves the value. As an **unsigned** 4-bit value, `1010` = 10, and zero extension gives:
1010  ->  0000 1010  = 10
Same low bits, but the upper bits are zeros, and the value is `+10`. This is why `lb` (sign-extend) and `lbu` (zero-extend) give different results for bytes with the top bit set.

Problem 4: What Does This C Print?

int x = 1;
unsigned char *p = (unsigned char *)&x;
printf("%d\n", p[0]);

On a RISC-V (little-endian) machine, what is printed, and why?

Solution: `int x = 1` is `0x00000001`. The four bytes are `01 00 00 00` from low address to high on a **little-endian** machine, because the least significant byte (`0x01`) is stored first. `p` points at the lowest address (`&x`), so `p[0]` reads `0x01`. The program prints:
1
On a big-endian machine `p[0]` would be `0x00` and it would print `0`. This three-line program is a classic endianness detector.

Problem 5: String in Memory

Draw the byte-by-byte memory layout of char *s = "Hi!";. How many bytes does it occupy? What does s[3] hold?

Solution: The string `"Hi!"` is 3 visible characters plus the null terminator = **4 bytes**.
   s[3]  |  '\0'  0x00  |   <- terminator
   s[2]  |  '!'   0x21  |
   s[1]  |  'i'   0x69  |
   s[0]  |  'H'   0x48  |   <- s points here
`s[3]` holds the integer `0` (the null terminator `'\0'`), which marks the end of the string. It is *not* the character `'0'` (which would be `0x30`).

Problem 6: Strlen Trace

Trace my_strlen (from Section 7) on the string "Hi!". How many times does the loop body execute, and what is returned in a0?

Solution: `a0` starts pointing at `'H'`. `t0` (len) starts at 0.

| Iteration | byte loaded (`lb t1`) | `beq` taken? | len after | a0 advanced to |
|-----------|----------------------|--------------|-----------|----------------|
| 1 | `'H'` (0x48) | no | 1 | `'i'` |
| 2 | `'i'` (0x69) | no | 2 | `'!'` |
| 3 | `'!'` (0x21) | no | 3 | `'\0'` |
| 4 | `'\0'` (0x00) | **yes → done** | 3 | (not advanced) |

The loop body that increments runs **3 times** (once per visible character). On the 4th load the byte is `0`, so `beq t1, zero, strlen_done` branches out before incrementing. `mv a0, t0` puts `3` in `a0`, and the function returns **3**, the correct length.


Summary

  1. The processor computes on registers; memory holds code and data. Loads and stores move bytes between registers and RAM, which is partitioned into stack, heap, data, and code.

  2. Memory is a byte-addressable array. A byte is 8 bits and is the smallest addressable unit; larger values occupy several consecutive bytes.

  3. Endianness is the order of those bytes. Big-endian stores the most significant byte at the lowest address; little-endian stores the least significant byte first. RISC-V is little-endian, and there is no universal standard, which is why network protocols define a canonical byte order.

  4. Two's complement is the standard for signed integers because it has a single representation of zero, keeps the MSB as a sign bit, and makes ordinary binary addition correct for both positive and negative numbers — beating sign-magnitude and one's complement.

  5. Negate with invert(v) + 1. Flip all the bits and add one. The same operation converts negative to positive, so it is its own inverse.

  6. Widen with sign extension, narrow with truncation. To go to more bits, replicate the sign bit into the new high bits (zero-extend for unsigned values). To go to fewer bits, keep the low bits and discard the rest — safe only when the value fits.

  7. C strings are null-terminated byte arrays. Each character is one byte holding an ASCII code, and a trailing '\0' (value 0) marks the end; code walks the string until it finds that zero.

  8. Use the load width that matches the data. lb/sb for single bytes (only the low 8 bits of a register), lw for 32-bit words, ld for 64-bit doublewords; advance a string pointer by 1 byte per character.