Rust
Status: 🌱
Motivation
Study Rust to deepen systems thinking, memory reasoning, CLI design, and networking fundamentals in contexts where these concerns are central.
Starter Points
- Practice ownership patterns on real I/O-heavy examples.
- Model domain invariants with type-driven design.
- Benchmark critical paths and document tradeoffs.
- Convert reading fluency into writing fluency through short, complete tools.
Recent Learnings
Ownership
Every value has a single owner. When the owner goes out of scope, Rust runs drop automatically and releases resources deterministically.
Why this matters:
- Predictable cleanup without garbage collection.
- Fewer hidden lifetime/resource bugs in long-running services and CLI tools.
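A minimal sketch of deterministic cleanup. The `Guard` type and the `drop_order` function are hypothetical names made up for illustration; only `Drop`, `Rc`, and `RefCell` come from the standard library.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical resource that records the moment it is dropped.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the events recorded while two guards enter and leave scope.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Guard { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Guard { name: "inner", log: Rc::clone(&log) };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` goes out of scope here, so its `drop` runs now
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` goes out of scope here
    // Bind the clone first so the temporary borrow ends before `log` is dropped.
    let events = log.borrow().clone();
    events
}
```

Each guard is released exactly at the closing brace of its scope, with no explicit cleanup call.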
Borrowing
References (&T or &mut T) let code use values without taking ownership.
Why this matters:
- Share read access safely without cloning by default.
- Mutability is explicit and constrained, which improves API clarity.
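A small sketch of shared versus exclusive borrows. The function names `total`, `push_total`, and `borrowing_demo` are hypothetical.

```rust
// Shared borrow: reads the slice without taking ownership.
fn total(values: &[u32]) -> u32 {
    values.iter().sum()
}

// Exclusive borrow: the signature states that the vector will be modified.
fn push_total(values: &mut Vec<u32>) {
    let sum = total(values); // temporary shared reborrow for the read
    values.push(sum);
}

fn borrowing_demo() -> Vec<u32> {
    let mut values = vec![1, 2, 3];
    let before = total(&values); // shared borrow ends when the call returns
    assert_eq!(before, 6);
    push_total(&mut values); // exclusive borrow, no clone needed
    values // the caller still owns the vector
}
```

The caller can tell from the signatures alone which call may change `values`.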
Lifetimes
Lifetimes encode how long references are valid. The compiler rejects any reference that could outlive its source value.
Why this matters:
- Prevents dangling references at compile time.
- Makes data-flow constraints explicit in function signatures.
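A short sketch of lifetimes in signatures. The helper names `longer` and `first_word` are hypothetical.

```rust
// The lifetime 'a ties the returned reference to both inputs:
// the result must not outlive either argument.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// With a single reference parameter the lifetime is elided:
// the returned slice borrows from `line`.
fn first_word(line: &str) -> &str {
    line.split_whitespace().next().unwrap_or("")
}

// Rejected at compile time, because `s` is dropped while `r` still refers to it:
//
//     let r;
//     {
//         let s = String::from("temp value");
//         r = first_word(&s);
//     }
//     println!("{r}");
```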
Traits
Traits define behavior contracts and enable composition over inheritance.
Why this matters:
- Build reusable abstractions with explicit capabilities.
- Keep designs modular by composing behavior through trait bounds and implementations.
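A minimal sketch of a behavior contract with a trait bound. The `Encode` trait and `encode_all` function are hypothetical and exist only for this illustration.

```rust
// Hypothetical capability: a value that can append its wire bytes to a buffer.
trait Encode {
    fn encode(&self, out: &mut Vec<u8>);
}

impl Encode for u8 {
    fn encode(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

impl Encode for u16 {
    fn encode(&self, out: &mut Vec<u8>) {
        // Big-endian, to match network byte order.
        out.extend_from_slice(&self.to_be_bytes());
    }
}

// The bound `T: Encode` names the one capability this function needs.
fn encode_all<T: Encode>(items: &[T]) -> Vec<u8> {
    let mut out = Vec::new();
    for item in items {
        item.encode(&mut out);
    }
    out
}
```

Any type can opt in by implementing `Encode`; no shared base type is involved.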
Bits, Bytes and Meaning
A byte is eight bits. Eight binary positions yield 2^8 = 256 distinct bit patterns, from 00000000 to 11111111.
That pattern is representation. Meaning is interpretation. The same stored byte can be read in several ways:
- 0xFF is 255 as u8 and -1 as i8
- 0x41 is 65 as u8 and 'A' as text when interpreted through ASCII/UTF-8 rules
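A minimal sketch of one byte read three ways. The function names `interpretations` and `is_valid_utf8` are hypothetical; the casts and `std::str::from_utf8` are standard.

```rust
// Reads one byte as unsigned, as signed, and (when it is ASCII) as a character.
fn interpretations(byte: u8) -> (u8, i8, Option<char>) {
    let text = if byte.is_ascii() { Some(byte as char) } else { None };
    (byte, byte as i8, text) // `as i8` keeps the bit pattern and changes the reading
}

// Checks whether a byte sequence is well-formed UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}
```

The byte 0x41 yields `(65, 65, Some('A'))`, while 0xFF yields `(255, -1, None)` and is rejected as UTF-8 text.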
ASCII is a 7-bit character set (0..127). "Extended ASCII" is an umbrella label for multiple incompatible 8-bit code pages (0..255), not one universal standard.
UTF-8 is a variable-length encoding:
- ASCII characters use one byte (0xxxxxxx)
- Other code points use two to four bytes
- Backward compatible with ASCII at the byte level
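The variable-length rule can be sketched by classifying the first byte of a sequence. The function name `sequence_len` is hypothetical, and the check is deliberately simplified: it looks only at the prefix bits and does not reject lead bytes that are invalid for other reasons (for example 0xC0 or 0xF5).

```rust
// Sequence length implied by the first byte of a UTF-8 sequence.
fn sequence_len(lead: u8) -> Option<usize> {
    match lead.leading_ones() {
        0 => Some(1), // 0xxxxxxx: ASCII
        2 => Some(2), // 110xxxxx
        3 => Some(3), // 1110xxxx
        4 => Some(4), // 11110xxx
        _ => None,    // 10xxxxxx is a continuation byte; 11111xxx never starts a sequence
    }
}
```

For "é" the bytes are C3 A9: the lead byte C3 announces two bytes, and A9 is a continuation byte.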
In Rust, String::len() (like str::len(), used on the literals below) returns the length in bytes, not the number of Unicode scalar values or grapheme clusters:
fn main() {
    let ascii = "A"; // U+0041
    let latin = "é"; // U+00E9
    let cjk = "界"; // U+754C

    assert_eq!(ascii.len(), 1);
    assert_eq!(latin.len(), 2);
    assert_eq!(cjk.len(), 3);
}
This is consistent with Rust's model: strings are UTF-8 byte buffers with validity guarantees.
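A short sketch of what the validity guarantee means in practice. The helper names `counts` and `byte_prefix` are hypothetical; `chars` and `str::get` are standard.

```rust
// Byte length versus number of Unicode scalar values.
fn counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

// Returns the first `n` bytes as a &str, or None if the cut would fall
// inside a multi-byte sequence or past the end of the string.
fn byte_prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}
```

For "é界" the counts are 5 bytes and 2 scalar values. A prefix of 2 bytes is "é", while a prefix of 1 byte is refused because it would split the encoding of "é".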
Signed vs Unsigned Integers
Within each width pair, the unsigned and signed types have the same storage size and differ only in how the bits are interpreted:
- u8 and i8: 8 bits each
- u32 and i32: 32 bits each
Range examples:
- u8: 0..=255
- i8: -128..=127
- u32: 0..=4_294_967_295
- i32: -2_147_483_648..=2_147_483_647
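The ranges above can be checked against the standard constants, and out-of-range values can be rejected with `TryFrom` instead of being silently reinterpreted. The wrapper names `to_u8` and `to_i8` are hypothetical.

```rust
use std::convert::TryFrom;

// Checked narrowing: None when the value does not fit the target range.
fn to_u8(value: i32) -> Option<u8> {
    u8::try_from(value).ok()
}

fn to_i8(value: u8) -> Option<i8> {
    i8::try_from(value).ok()
}

// The range bounds are available as associated constants.
fn range_bounds() -> (u8, i8, i8, u32, i32, i32) {
    (u8::MAX, i8::MIN, i8::MAX, u32::MAX, i32::MIN, i32::MAX)
}
```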
Two's complement defines signed interpretation in modern hardware:
- Highest bit is the sign contribution (-2^(n-1))
- Remaining bits contribute positive powers of two
- Negation is bitwise invert plus one
For i8, -1 is 11111111:
- Invert 00000001 -> 11111110
- Add 1 -> 11111111
So the same physical byte 0xFF maps to:
- 255 as u8
- -1 as i8
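The two's complement rules can be sketched for the 8-bit case. The function names `negate` and `weighted_value` are hypothetical.

```rust
// Negation as "invert all bits, then add one".
// wrapping_add keeps i8::MIN from overflowing: -(-128) does not fit in i8.
fn negate(x: i8) -> i8 {
    (!x).wrapping_add(1)
}

// Value of an 8-bit pattern under two's complement:
// the top bit weighs -2^7, the lower seven bits weigh their usual powers of two.
fn weighted_value(bits: u8) -> i32 {
    let sign = i32::from(bits >> 7) * -128;
    let rest = i32::from(bits & 0x7F);
    sign + rest
}
```

For the pattern 0xFF the weights give -128 + 127 = -1, which matches the cast `0xFFu8 as i8`.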
Endianness
"Most significant" and "least significant" refer to positional weight, not memory location.
Decimal analogy for 4827:
- 4 means 4 * 10^3 (most significant digit)
- 7 means 7 * 10^0 (least significant digit)
Binary is identical in principle: leftward bits carry higher powers of two.
For 0x12345678 in memory, listed from the lowest address to the highest:
- Big-endian: 12 34 56 78
- Little-endian: 78 56 34 12
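Both layouts can be reproduced with the standard integer conversion methods. The function names `byte_layouts` and `native_is_little_endian` are hypothetical.

```rust
// Big-endian and little-endian byte layouts of the same value.
fn byte_layouts(value: u32) -> ([u8; 4], [u8; 4]) {
    (value.to_be_bytes(), value.to_le_bytes())
}

// Compile-time check of the target's native byte order.
fn native_is_little_endian() -> bool {
    cfg!(target_endian = "little")
}
```

`to_ne_bytes` returns whichever of the two layouts the target uses natively.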
x86 is little-endian by architecture design and compatibility lineage. Network byte order is big-endian by protocol convention (historically standardized for interoperability), so protocol documents read multi-byte fields in a single canonical order.
Little-endian simplifies some arithmetic and microarchitectural paths because the low-order byte sits at the lowest address:
- Multi-byte arithmetic starts at the low-order byte, so carries propagate in ascending address order
- Partial-width operations align naturally: a narrower read at the same address yields the low-order part of the value
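The partial-width point can be sketched with byte arrays standing in for memory. The function names are hypothetical, and the arrays only model the layout; no raw memory is read.

```rust
// Low-order half of a u32, obtained by truncation.
fn low_half(value: u32) -> u16 {
    value as u16
}

// The first two bytes of the little-endian layout, read as a little-endian u16.
fn first_two_le(value: u32) -> u16 {
    let bytes = value.to_le_bytes();
    u16::from_le_bytes([bytes[0], bytes[1]])
}

// The first two bytes of the big-endian layout, read as a big-endian u16.
fn first_two_be(value: u32) -> u16 {
    let bytes = value.to_be_bytes();
    u16::from_be_bytes([bytes[0], bytes[1]])
}
```

In the little-endian layout the first two bytes are the truncated value. In the big-endian layout the same position holds the high-order half instead.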
Network Protocols and Byte Order
A DNS message starts with a fixed 12-byte header made of six 16-bit fields:
- ID (16)
- Flags (16)
- QDCOUNT (16)
- ANCOUNT (16)
- NSCOUNT (16)
- ARCOUNT (16)
All are transmitted in network byte order (big-endian). That rule ensures two hosts with opposite native endianness parse identical byte streams.
In Rust, conversion should be explicit at boundaries:
#[derive(Debug)]
struct DnsHeader {
    id: u16,
    flags: u16,
    qdcount: u16,
    ancount: u16,
    nscount: u16,
    arcount: u16,
}

fn parse_dns_header(buf: [u8; 12]) -> DnsHeader {
    DnsHeader {
        id: u16::from_be_bytes([buf[0], buf[1]]),
        flags: u16::from_be_bytes([buf[2], buf[3]]),
        qdcount: u16::from_be_bytes([buf[4], buf[5]]),
        ancount: u16::from_be_bytes([buf[6], buf[7]]),
        nscount: u16::from_be_bytes([buf[8], buf[9]]),
        arcount: u16::from_be_bytes([buf[10], buf[11]]),
    }
}

fn serialize_id(id: u16) -> [u8; 2] {
    id.to_be_bytes()
}
The core rule: internal representation may be native-endian, but wire/storage formats must be explicit.
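One way to apply that rule in both directions is a round trip through explicit big-endian conversions. This is a sketch under assumptions: the `Header` type and the `encode_header` and `decode_header` functions are hypothetical names that mirror the six fields above, and the decoder accepts a slice of unknown length rather than a fixed array.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Header {
    id: u16,
    flags: u16,
    qdcount: u16,
    ancount: u16,
    nscount: u16,
    arcount: u16,
}

// Writes the six fields in order, each as two big-endian bytes.
fn encode_header(h: &Header) -> [u8; 12] {
    let fields = [h.id, h.flags, h.qdcount, h.ancount, h.nscount, h.arcount];
    let mut out = [0u8; 12];
    for (chunk, field) in out.chunks_exact_mut(2).zip(fields.iter()) {
        chunk.copy_from_slice(&field.to_be_bytes());
    }
    out
}

// Reads the first 12 bytes of `buf`; returns None if the input is too short.
fn decode_header(buf: &[u8]) -> Option<Header> {
    let head = buf.get(..12)?;
    let mut fields = [0u16; 6];
    for (field, chunk) in fields.iter_mut().zip(head.chunks_exact(2)) {
        *field = u16::from_be_bytes([chunk[0], chunk[1]]);
    }
    let [id, flags, qdcount, ancount, nscount, arcount] = fields;
    Some(Header { id, flags, qdcount, ancount, nscount, arcount })
}
```

Returning `Option` keeps the length check at the boundary, so the rest of the program only ever sees a fully decoded header.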
Personal Insight
The main shift was from language-level intuition to memory-level reasoning.
Types stopped being syntax and became interpretation contracts for raw bytes. That reframing made several Rust behaviors feel coherent instead of surprising:
- byte-oriented string length
- strict numeric conversions
- explicit boundary handling for I/O and protocols
Learning the hard part was gratifying because it removed "magic." Once bytes, layout, and interpretation were explicit, Rust felt less like a new language to memorize and more like a precise system to reason about.