← Back to Course
# Lab: Bit Manipulation ## CS 315 Computer Architecture --- ## Learning Objectives - Distinguish **bitwise** (`~ & | ^ << >>`) from **logical** (`&& || !`) operators - Recall truth tables for AND, OR, XOR, NOT from memory - Convert between binary and decimal using place values - Apply SLL, SRL, and SRA — predict the fill bits - Connect arithmetic shifts to signed division by powers of two - Map each C operator to its RISC-V instruction - Extract bit fields with shift + mask (foundation for the emulator) --- ## Where This Lab Fits
flowchart TD C["C / Data Representation"] --> ASM["RISC-V Assembly"] ASM --> MC["RISC-V Machine Code (rv_emu)"] DD["Digital Design"] --> PD["Processor Design"] MC --> PD style MC fill:#fff3cd,stroke:#00543c,stroke-width:2px
Decoding 32-bit instructions = pure bit manipulation: shift a field down, mask off unwanted bits.
--- ## The `rstr` Bug — Setup Reverse string `src` into `dst`. `src = "FooBar"`, `len = 6`. ```text src: f o o B a r \0 idx: 0 1 2 3 4 5 6 <- \0 at index len ``` Walk `s` downward from `len-1 = 5` toward 0, copying into `dst`. --- ## The `rstr` Bug — Wrong Termination ```c // BUGGY: walks s downward, never hits src's own '\0' while (src[s] != '\0') { dst[d++] = src[s--]; } ``` `src`'s `'\0'` is at index `len` — we started **below** it. The loop reads before `src` (undefined behavior), stopping only when it hits a neighboring string's `'\0'` by accident. --- ## The `rstr` Bug — Memory Layout ```text high addr | \0 | | r | <- "bar" (src points here, s walks downward) | a | | b | | \0 | <- accidental stop: neighboring string's null | o | | o | <- "foo" | f | low addr ```
Never let a loop read memory it doesn't own. Terminate on your own indices.
--- ## The `rstr` Bug — Correct Fix ```c void rstr(char *dst, char *src, int len) { int s = len - 1, d = 0; while (s >= 0) { // stop when we run off the front dst[d++] = src[s--]; } dst[d] = '\0'; // always terminate destination } ``` Stop based on **your own index** `s`, not incidental neighboring data. --- ## Bitwise vs. Logical Operators | Category | C Operators | Purpose | |----------|-------------|---------| | **Bitwise** | `~` `&` `\|` `^` `<<` `>>` | Manipulate individual bits | | **Logical** | `&&` `\|\|` `!` | Produce 0/1 for control flow | ```c uint8_t flags = 0b1100; uint8_t mask = 0b0100; uint8_t bit = flags & mask; // 0b0100 — bitwise, a bit pattern int ready = (count > 0) && (error == 0); // 1 or 0 — logical ```
&
vs
&&
confusion is a classic bug!
--- ## Bits, Bytes, and Words ```text bit: 7 6 5 4 3 2 1 0 value: 128 64 32 16 8 4 2 1 ^ ^ MSB LSB ``` ```text 0b11011001 = 128+64+16+8+1 = 217 ``` A **word** = 32 bits = 4 bytes: ```text 31 24 23 16 15 8 7 0 | byte 3 | byte 2 | byte 1 | byte 0 | ``` --- ## Bit States — Two Names for Everything | `0` | `1` | |-----|-----| | false | true | | unset / clear | set | | low | high | | off | on | --- ## AND (`&`) Truth Table `1` only when **both** inputs are `1`. Used for **masking**. | a | b | a & b | |---|---|-------| | 0 | 0 | 0 | | 0 | 1 | 0 | | 1 | 0 | 0 | | 1 | 1 | 1 | ```text a = 1 1 0 0 1 0 1 0 (0xCA) b = 1 0 0 1 1 0 0 1 (0x99) a&b = 1 0 0 0 1 0 0 0 (0x88) ``` --- ## OR (`|`) and XOR (`^`) Truth Tables **OR** — `1` if either input is `1`. Used to **set bits**. | a | b | a \| b | |---|---|--------| | 0 | 0 | 0 | | 0 | 1 | 1 | | 1 | 1 | 1 | **XOR** — `1` when inputs **differ**. Used to **toggle bits**. | a | b | a ^ b | |---|---|-------| | 0 | 1 | 1 | | 1 | 0 | 1 | | 1 | 1 | 0 | XOR identities: `a ^ a = 0`, `a ^ 0 = a`, `a ^ 0xFF` flips all 8 bits. --- ## NOT (`~`) — Bitwise Complement Inverts **every** bit (unary operator). ```text a = 1 1 0 0 1 0 1 0 (0xCA) ~a = 0 0 1 1 0 1 0 1 (0x35) ```
~
is bitwise NOT (flips all bits).
!
is logical NOT (nonzero → 0, zero → 1).
--- ## Worked Example — All Four Operators `a = 0b11001010` (0xCA), `b = 0b10011001` (0x99) ```text a = 1 1 0 0 1 0 1 0 b = 1 0 0 1 1 0 0 1 a&b = 1 0 0 0 1 0 0 0 = 0x88 a|b = 1 1 0 1 1 0 1 1 = 0xDB a^b = 0 1 0 1 0 0 1 1 = 0x53 ~a = 0 0 1 1 0 1 0 1 = 0x35 ``` --- ## RISC-V Logical Instructions | C operator | R-type | I-type | |------------|--------|--------| | `a & b` | `and rd, rs1, rs2` | `andi rd, rs1, imm` | | `a \| b` | `or rd, rs1, rs2` | `ori rd, rs1, imm` | | `a ^ b` | `xor rd, rs1, rs2` | `xori rd, rs1, imm` | | `~a` | — | `xori rd, rs1, -1` | ```asm and a0, a0, a1 # a0 = a0 & a1 xori a0, a0, -1 # a0 = ~a0 (XOR with all-ones) ``` No dedicated NOT instruction — `not rd, rs` is a pseudo that expands to `xori rd, rs, -1`. --- ## Shift Operations — Overview ```text a << 2 ^ ^ value shift amount ``` Three shifts — differ only in **what fills the vacated bits**: | Operation | Direction | Fill bits | |-----------|-----------|-----------| | SLL `<<` | left | `0` on right | | SRL `>>` (unsigned) | right | `0` on left | | SRA `>>` (signed) | right | **sign bit** on left | --- ## SLL — Shift Left Logical Fill vacated low bits with `0`. Effect: multiply by `2^n`. ```text a = 0b 1 1 0 0 1 0 1 0 (0xCA = 202) a << 2 = 0b 0 0 1 0 1 0 0 0 (0x28 = 40) ^ ^ new zeros ``` Bits shifted off the top are **discarded** — can overflow. --- ## SRL — Shift Right Logical Fill vacated high bits with `0`. Effect: unsigned divide by `2^n`. ```text a = 0b 1 1 0 0 1 0 1 0 (0xCA = 202) a >> 2 = 0b 0 0 1 1 0 0 1 0 (0x32 = 50) ^ ^ new zeros ``` Used with **unsigned** types (`uint8_t`, `uint32_t`). --- ## SRA — Shift Right Arithmetic Fill vacated high bits with the **sign bit**. Effect: signed divide by `2^n`. ```text a = 0b11001010 as int8_t = -54 (MSB = 1, negative) a >> 2 = 0b 1 1 1 1 0 0 1 0 = -14 ^ ^ copies of sign bit (1) ```
-54 / 4 = -13.5. SRA rounds toward -infinity → -14.
C integer division
/
rounds toward zero → -13.
--- ## Verifying SRA with Two's Complement Check that `0b11110010 = -14`: ```text result = 1 1 1 1 0 0 1 0 invert = 0 0 0 0 1 1 0 1 add 1 = 0 0 0 0 1 1 1 0 = 8+4+2 = 14 => result = -14 ✓ ``` Check that `0b11001010 = -54`: ```text a = 1 1 0 0 1 0 1 0 invert = 0 0 1 1 0 1 0 1 add 1 = 0 0 1 1 0 1 1 0 = 32+16+4+2 = 54 => a = -54 ✓ ``` --- ## Choosing the Right Shift in C ```c uint8_t u = 0b11001010; // unsigned int8_t s = 0b11001010; // signed (-54) u >> 2; // SRL: 0b00110010 = 50 s >> 2; // SRA: 0b11110010 = -14 ``` - **Unsigned type** → `>>` is logical (SRL) - **Signed type** → `>>` is arithmetic (SRA) - `<<` always fills with `0` (logical)
Left-shifting a negative signed value is undefined behavior in C. Use unsigned types when manipulating bits.
--- ## RISC-V Shift Instructions ```asm slli a0, a0, 2 # a0 = a0 << 2 (immediate shift) srli a0, a0, 2 # a0 = a0 >> 2 (logical) srai a0, a0, 2 # a0 = a0 >> 2 (arithmetic, sign-preserving) sll a0, a1, a2 # a0 = a1 << a2 (register shift amount) ``` Implemented in hardware as a **barrel shifter** — shifts any amount in a single clock cycle, no looping. --- ## Masking — Extracting Bit Fields To pull a `w`-bit field starting at bit `start`: 1. **Shift** the field down to bit 0: `value >> start` 2. **Mask** with `(1 << w) - 1` | `w` | mask | hex | |-----|------|-----| | 4 | `0b00001111` | `0x0F` | | 5 | `0b00011111` | `0x1F` | | 7 | `0b01111111` | `0x7F` | | 8 | `0b11111111` | `0xFF` | --- ## Masking — Worked Example Extract bits 2–5 of `x = 0b11011010`: ```text x = 1 1 0 1 1 0 1 0 x >> 2 = 0 0 1 1 0 1 1 0 (bits 2-5 now at 0-3) & 0b1111 = 0 0 0 0 0 1 1 0 = 0x6 ``` ```c uint8_t x = 0b11011010; uint8_t v = (x >> 2) & 0x0F; // v = 6 ``` --- ## RISC-V Instruction Field Extraction A 32-bit R-type instruction word: ```text | funct7 [31:25] | rs2 [24:20] | rs1 [19:15] | funct3 [14:12] | rd [11:7] | opcode [6:0] | ``` Extracting each field: ```c uint32_t opcode = iw & 0x7F; // bits 6:0 uint32_t rd = (iw >> 7) & 0x1F; // bits 11:7 uint32_t funct3 = (iw >> 12) & 0x07; // bits 14:12 uint32_t rs1 = (iw >> 15) & 0x1F; // bits 19:15 uint32_t rs2 = (iw >> 20) & 0x1F; // bits 24:20 ``` --- ## Building Values: Shift + OR The inverse of extraction — assemble a value from pieces: ```c uint32_t imm = (hi << 5) | lo; // place hi above lo, then merge ``` Used when reconstructing branch/jump immediates whose bits are scattered across an instruction word. --- ## Sign Extension with Shifts To sign-extend a narrow two's complement value to 32 bits: ```c // Treat as a 4-bit signed value (-2 = 0b1110) int32_t v = 0b1110; v = (v << 28) >> 28; // shift sign bit to bit 31, then SRA back // result: 0xFFFFFFFE = -2 in 32 bits ``` Works because `>>` on `int32_t` is arithmetic (SRA) — it replicates the sign bit downward. --- ## Four Bit Idioms to Memorize ```c // Set bit n to 1 x = x | (1 << n); // Clear bit n to 0 x = x & ~(1 << n); // Toggle bit n x = x ^ (1 << n); // Test whether bit n is 1 int bit = (x >> n) & 1; ``` --- ## The `rv_emu` Emulator — State ```c #define NREGS 32 #define STACK_SIZE 8192 struct rv_state { uint64_t regs[NREGS]; // x0..x31 uint64_t pc; // program counter uint8_t stack[STACK_SIZE]; }; ``` All of this lives in a C struct. The emulator simulates the processor entirely in software. --- ## The Fetch-Decode-Execute Cycle
flowchart LR F["Fetch\niw = *pc"] --> D["Decode\nopcode = iw & 0x7F"] D --> E["Dispatch\nemu_r_type / emu_i_type ..."] E --> X["Execute\nupdate regs / memory"] X --> A["Advance\npc += 4"] A --> F
Every decode step is **shift + mask**. This lab is the direct prerequisite for Project 4. --- ## Operator Quick Reference | C | RISC-V (reg) | RISC-V (imm) | Effect | |---|--------------|--------------|--------| | `a & b` | `and` | `andi` | mask | | `a \| b` | `or` | `ori` | set bits | | `a ^ b` | `xor` | `xori` | toggle | | `~a` | — | `xori rd,rs,-1` | flip all | | `a << n` | `sll` | `slli` | x 2^n | | `u >> n` | `srl` | `srli` | unsigned / 2^n | | `s >> n` | `sra` | `srai` | signed / 2^n | --- ## Summary 1. **`rstr` bug** — backward loop never hits its own `'\0'`; fix with `s >= 0` 2. **Bitwise vs. logical** — `&` works per-bit; `&&` yields 0 or 1 for control flow 3. **Truth tables** — AND: both 1; OR: either 1; XOR: differ; NOT: flip all 4. **Three shifts** — SLL fills 0 right; SRL fills 0 left; SRA fills sign bit left 5. **Masking pattern** — `(iw >> start) & ((1 << w) - 1)` extracts any field 6. **Four idioms** — set `|`, clear `& ~`, toggle `^`, test `>> & 1` 7. **Emulator** — fetch-decode-execute is shift + mask from end to end