CS 315-02 Lecture/Lab — Meeting Summary (Fall 2025)

  • Date: Oct 01, 2025
  • Time: 06:30 PM Pacific Time (US and Canada)
  • Meeting ID: 868 6589 0521

Quick Recap

  • Discussion of Lab 6 and Project 4, including implementation details and sign-extension strategies.
  • Explanations of RISC-V shift operations and differences between 32-bit and 64-bit instruction formats.
  • Overview of cache memory concepts, types of cache implementations, and their role in mitigating the CPU–memory performance gap.

Next Steps

  • Students: Update bits.c by changing 64 to 63 in the sign-extend function, aligning with the lecture explanation.
  • Students: Use JAL instead of call in the sort code and the FIBREC implementation for Project 4 to avoid implementing AUIPC.
  • Students: Implement dynamic analysis in Project 4 by updating the analysis struct counts accurately (e.g., instruction-type counters).
  • Students: Read the cache memory guide before the next lecture.
  • Greg: Release Lab 7 (practice). Due Monday.

Summary

Lab Updates and Project Discussions

  • Addressed a discrepancy between the lecture’s sign-extension explanation and the provided code for Lab 6. Students may keep the original code or modify it to match the lecture; the lecture-aligned change is preferred for logical consistency (a sketch of the general technique follows this list).
  • Announced Lab 7 (due Monday) and a midterm practice resource.
  • Planned coverage includes the dynamic analysis component of Project 4 and an introduction to cache memory.
  • A deeper dive into implementing the cache simulator is scheduled for the next lecture.
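
As context for the 64-to-63 change, here is a minimal C sketch of sign extension. It is a hypothetical illustration (the sign_extend function and its shape are assumptions, not the actual bits.c code), but it shows why 63, the index of the top bit of a 64-bit word, is the boundary such code must respect.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical sketch, not the actual Lab 6 bits.c: extend the low n
     * bits of x (0 < n < 64) to a signed 64-bit value. Bit 63 is the
     * highest bit of a 64-bit word; shifting a 64-bit operand by 64 is
     * undefined behavior in C, which is the kind of off-by-one a
     * 64-to-63 fix typically addresses. */
    static int64_t sign_extend(uint64_t x, unsigned n)
    {
        uint64_t sign = (x >> (n - 1)) & 1;  /* sign bit of the n-bit field */
        if (sign)
            x |= ~((1ULL << n) - 1);         /* fill bits n..63 with ones */
        return (int64_t)x;
    }

    int main(void)
    {
        printf("%lld\n", (long long)sign_extend(0xFFF, 12)); /* prints -1 */
        return 0;
    }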

RISC-V SRLI Instruction Differences

  • Clarified differences between 32-bit and 64-bit SRLI (Shift Right Logical Immediate); a field-extraction sketch follows this list:
      • 64-bit SRLI encodes a 6-bit shift amount rather than the 5-bit field used in 32-bit mode, since a 64-bit register can be shifted by up to 63 positions.
      • The shift amount is treated as unsigned (not sign-extended).
  • Demonstrated how to verify these details in the RISC-V specification.
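
To make the encoding difference concrete, here is a short C sketch of extracting the shift amount from an encoded SRLI instruction. The helper srli_shamt is hypothetical, but the bit positions follow the RISC-V I-type layout described in the specification.

    #include <stdint.h>

    /* Sketch: pull the shift amount out of an encoded SRLI instruction.
     * In an I-type shift, shamt sits in the low bits of the immediate
     * field, starting at instruction bit 20. */
    static uint32_t srli_shamt(uint32_t inst, int rv64)
    {
        /* RV32I: 5-bit shamt (0..31); RV64I: 6-bit shamt (0..63).
         * The field is unsigned, so it is masked, never sign-extended. */
        uint32_t mask = rv64 ? 0x3F : 0x1F;
        return (inst >> 20) & mask;
    }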

RISC-V Shift Amount Interpretation

  • Reinforced that SRLI’s shift amount is an unsigned field rather than a signed value.
  • Demonstrated rapid verification workflows (e.g., scanning the RISC-V manual) to confirm instruction semantics.

JAL vs. call Instruction Optimization

  • Explained that call is a pseudoinstruction that typically expands into two instructions (AUIPC followed by JALR), while JAL is a single real instruction that suffices whenever the target is within its ±1 MiB range.
  • Recommended (a JAL handler sketch follows this list):
      • Replace call with JAL in the provided sort code.
      • Use JAL in the FIBREC implementation so the emulator does not have to implement AUIPC.
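
Since the payoff of this change is that the emulator only needs a JAL handler, here is a C sketch of what such a handler might look like. The struct cpu and exec_jal names are assumptions, not the actual Project 4 code, though the J-type immediate layout follows the RISC-V specification.

    #include <stdint.h>

    /* Hypothetical emulator state; Project 4's actual types may differ. */
    struct cpu {
        uint64_t pc;
        uint64_t regs[32];
    };

    /* Execute one JAL: link the return address into rd, then jump
     * PC-relative. Supporting JAL directly removes any need for AUIPC. */
    static void exec_jal(struct cpu *c, uint32_t inst)
    {
        uint32_t rd = (inst >> 7) & 0x1F;

        /* Reassemble the J-type immediate, imm[20|10:1|11|19:12], from
         * its scattered instruction bits. */
        int64_t imm = (int64_t)((inst >> 31) & 0x1)   << 20  /* imm[20]    */
                    | (int64_t)((inst >> 12) & 0xFF)  << 12  /* imm[19:12] */
                    | (int64_t)((inst >> 20) & 0x1)   << 11  /* imm[11]    */
                    | (int64_t)((inst >> 21) & 0x3FF) << 1;  /* imm[10:1]  */
        if (imm & (1 << 20))
            imm -= (1 << 21);         /* sign-extend the 21-bit immediate */

        if (rd != 0)
            c->regs[rd] = c->pc + 4;  /* return address (x0 stays zero) */
        c->pc += imm;                 /* PC-relative jump */
    }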

Emulator Dynamic Analysis Techniques

  • Demonstrated dynamic analysis to collect execution metrics (a counter sketch follows this list):
      • Count instruction classes such as I-type, R-type, load, store, and branch.
      • Example: running FIBREC with input 10 produced 1,675 total instructions.
  • Compared memory access profiles for FIBREC vs. sorting, emphasizing how different programs exhibit different memory behavior.
  • Guidance:
      • Prioritize correctness of program execution and analysis counts.
      • Maintain clean, non-redundant instrumentation code.
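
A minimal sketch of such per-class counting is shown below. The struct analysis fields and the count_instruction helper are assumptions rather than the project's actual analysis struct, but the major opcodes come from the RISC-V specification.

    #include <stdint.h>

    /* Hypothetical per-class counters, bumped once per executed
     * instruction, keyed on the instruction's major opcode. */
    struct analysis {
        uint64_t i_type, r_type, loads, stores, branches, total;
    };

    static void count_instruction(struct analysis *a, uint32_t inst)
    {
        a->total++;
        switch (inst & 0x7F) {           /* major opcode: low 7 bits */
        case 0x13: a->i_type++;   break; /* OP-IMM (addi, srli, ...) */
        case 0x33: a->r_type++;   break; /* OP     (add, sub, ...)   */
        case 0x03: a->loads++;    break; /* LOAD   (lw, ld, ...)     */
        case 0x23: a->stores++;   break; /* STORE  (sw, sd, ...)     */
        case 0x63: a->branches++; break; /* BRANCH (beq, bne, ...)   */
        default:                  break; /* other classes not shown  */
        }
    }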

Moore’s Law and Memory Dynamics

  • Reviewed the historical trend: transistor counts double roughly every 1.5 years (Moore’s Law), historically boosting performance.
  • Noted the widening gap between CPU speeds and main memory latency, motivating the need for cache hierarchies.

Understanding Cache Memory Evolution

  • Contrasted SRAM vs. DRAM:
      • SRAM: faster, more expensive, less dense.
      • DRAM: slower, cheaper, more dense.
  • Explained why caches are effective: they keep frequently accessed data in faster memory, reducing average access time (a worked example follows this list).
  • Clarified why SRAM is not used exclusively: cost and area constraints.
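
The "reducing average access time" point can be made concrete with the standard average memory access time (AMAT) formula, AMAT = hit time + miss rate × miss penalty. The numbers in this sketch are made up for illustration, not from the lecture.

    #include <stdio.h>

    /* Illustrative AMAT calculation with invented latencies. */
    int main(void)
    {
        double hit_time     = 1.0;    /* ns: SRAM cache hit            */
        double miss_penalty = 100.0;  /* ns: DRAM access on a miss     */
        double miss_rate    = 0.05;   /* 5% of accesses miss the cache */

        /* AMAT = hit time + miss rate * miss penalty */
        double amat = hit_time + miss_rate * miss_penalty;
        printf("AMAT = %.1f ns\n", amat);  /* 6.0 ns vs. 100 ns uncached */
        return 0;
    }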

Cache Performance and Locality Principles

  • Key reasons caches work well (a loop example follows this list):
      • Temporal locality: recently accessed data/instructions tend to be reused.
      • Spatial locality: nearby data is likely to be accessed soon (e.g., sequential instructions, array traversal, loops).
  • Three core cache questions:
      • Where in the cache does a given address map (placement)?
      • Is the data present (hit/miss detection)?
      • Which data should be evicted when the cache is full (replacement policy)?
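
Both kinds of locality show up in even the simplest loop; the function below is a toy example (sum_array is not course code).

    #include <stddef.h>

    /* Spatial locality: a[i] walks consecutive addresses, so each cache
     * line fetched on a miss also supplies the next several elements.
     * Temporal locality: sum, i, n, and the loop's own instructions are
     * touched again on every iteration. */
    static long sum_array(const int *a, size_t n)
    {
        long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }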

Cache Memory Implementation Overview

  • Introduced common cache organizations (an address-breakdown sketch for the direct-mapped case follows this list):
      • Direct-mapped
      • Fully associative
      • Set-associative (often a practical balance of flexibility and efficiency)
  • Next session will expand on these topics.
  • Students are encouraged to review the cache memory guide; Greg and Shreyas are available for support.
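
For the direct-mapped case, the "where does an address go?" question reduces to slicing the address into offset, index, and tag bits. The geometry below (64-byte blocks, 64 sets) is an assumption for illustration, not a project parameter.

    #include <stdint.h>
    #include <stdio.h>

    #define BLOCK_BITS 6   /* 64-byte blocks */
    #define INDEX_BITS 6   /* 64 sets        */

    int main(void)
    {
        uint64_t addr   = 0x7ffc1234;
        uint64_t offset = addr & ((1u << BLOCK_BITS) - 1);
        uint64_t index  = (addr >> BLOCK_BITS) & ((1u << INDEX_BITS) - 1);
        uint64_t tag    = addr >> (BLOCK_BITS + INDEX_BITS);

        /* A hit occurs when the line at `index` is valid and already
         * stores this tag; otherwise the access is a miss. */
        printf("tag=%#llx index=%llu offset=%llu\n",
               (unsigned long long)tag, (unsigned long long)index,
               (unsigned long long)offset);
        return 0;
    }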