PFC (Prefix-Free Codecs) is a header-only C++20 library that demonstrates data compression and zero-copy access are compatible goals when built on prefix-free codes and generic programming. The library achieves 3-10× compression on typical integer distributions while maintaining full STL integration and type safety.
The Core Innovation: Zero-Copy Invariant
Traditional compression requires marshaling:
// Traditional approach
std::vector<uint32_t> data = {1, 2, 3, 5, 8, 13};
// Step 1: Compress (marshal to wire format)
auto compressed = compress(data);
// Step 2: Store/transmit compressed data
store_to_disk(compressed);
// Step 3: Decompress (unmarshal back)
auto restored = decompress(compressed);
// Problem: data lives in two worlds
// - Uncompressed in memory (32 bits × n)
// - Compressed on wire/disk (variable bits)
PFC’s approach: The zero-copy invariant
\[ \text{In-memory representation} = \text{Wire representation} \]// PFC approach
PackedContainer<uint32_t, EliasGamma> data;
data.push_back(1);
data.push_back(2);
data.push_back(3);
// Data is ALREADY compressed in memory
// Access through proxy decodes on-the-fly
uint32_t value = data[0]; // Decodes from compressed form
// Write to disk? Zero copy!
write(fd, data.bytes().data(), data.bytes().size());
The data structure IS the compressed format. No marshaling step exists.
Prefix-Free Codes: The Foundation
PFC is built on prefix-free codes—self-delimiting encodings where you can determine where one value ends and the next begins without external delimiters.
The Prefix-Free Property
A code is prefix-free if no codeword is a prefix of another:
✅ Prefix-free:
1 → 0
2 → 10
3 → 110
4 → 1110
❌ Not prefix-free:
1 → 0
2 → 01 ← "0" is a prefix of "01"
3 → 011
Why this matters: Prefix-free codes compose naturally:
// Encode multiple values without delimiters
encode(1); // 0
encode(2); // 10
encode(3); // 110
// Result: 010110 (self-delimiting!)
// Decode unambiguously
decode(); // Reads "0" → 1
decode(); // Reads "10" → 2
decode(); // Reads "110" → 3
This enables streaming and composition.
Universal Codes: Optimal Without Prior Knowledge
PFC implements several universal codes—asymptotically optimal for any distribution without knowing the distribution in advance.
Elias Gamma
Encodes positive integer \(n\) in \(2\lfloor\log_2 n\rfloor + 1\) bits:
n Binary Elias Gamma
1 1 1
2 10 010
3 11 011
4 100 00100
5 101 00101
8 1000 0001000
Encoding algorithm:
- Write \(\lfloor\log_2 n\rfloor\) zeros
- Write binary representation of \(n\)
Mathematical property:
\[ L(n) = 2\lfloor\log_2 n\rfloor + 1 \approx 2\log_2 n \]This is asymptotically optimal for geometric distributions.
Fibonacci Encoding
Uses Fibonacci sequence for encoding:
n Fibonacci Encoding
1 1 11
2 2 011
3 3 0011
4 5 1011
5 8 00011
Property: Every positive integer has a unique Zeckendorf representation (sum of non-consecutive Fibonacci numbers).
Encoding: Find representation, mark with terminal “11”.
Rice/Golomb Codes
Parametric codes optimal for geometric distribution with parameter \(p\):
// Rice code with parameter k=3
RiceCodec<3> codec;
// Optimal for geometric distribution with p = 1/2^3
Encoding:
- Quotient \(q = \lfloor n/2^k \rfloor\) in unary
- Remainder \(r = n \bmod 2^k\) in binary
Excellent for values with exponential decay.
Core Architecture
Bit-Level I/O
// Bit-precise writing
BitWriter writer;
writer.write_bit(1);
writer.write_bits(0b1010, 4);
writer.flush();
// Bit-precise reading
BitReader reader(bytes);
bool bit = reader.read_bit();
uint8_t nibble = reader.read_bits(4);
Key insight: All codecs work at the bit level, enabling optimal packing without byte alignment waste.
Codec Concept
template<typename C>
concept Codec = requires(C codec, uint64_t value, BitWriter& writer, BitReader& reader) {
// Encoding
{ codec.encode(value, writer) } -> std::same_as<void>;
// Decoding
{ codec.decode(reader) } -> std::same_as<uint64_t>;
// Properties
{ C::is_fixed_size() } -> std::same_as<bool>;
{ C::max_encoded_bits() } -> std::same_as<size_t>;
};
C++20 concepts ensure type safety at compile time.
Packed<T, Codec> Template
template<typename T, Codec C>
class Packed {
std::vector<uint8_t> bytes_;
public:
void set(T value) {
BitWriter writer(bytes_);
C::encode(value, writer);
}
T get() const {
BitReader reader(bytes_);
return C::decode(reader);
}
};
Key property: The Packed object stores data in compressed form. Access decodes on-the-fly.
Algebraic Type System
PFC provides a complete algebraic type system with optimal encoding:
Product Types
// Pair with minimal overhead
using Point = PackedPair<
Packed<int, Signed<EliasGamma>>,
Packed<int, Signed<EliasGamma>>
>;
Point p{10, 20}; // Compressed in memory
auto [x, y] = p; // Decode on access
Encoding: Concatenate encodings (no delimiters needed—prefix-free!)
Sum Types (Variants)
// Variant with minimal discriminator
using Value = PackedVariant<
Packed<int, EliasGamma>,
Packed<double, FloatingPoint<32>>,
Packed<std::string, PrefixedString>
>;
Value v = 42; // Encodes: discriminator + value
Discriminator optimization: For \(n\) types, uses \(\lceil\log_2 n\rceil\) bits.
Optional Types
// Optional with 1-bit overhead
PackedOptional<Packed<int, EliasGamma>> maybe_value = 42;
PackedOptional<Packed<int, EliasGamma>> no_value = std::nullopt;
Encoding: 1 bit for presence + value if present.
\[ L(\text{optional}) = \begin{cases} 1 & \text{if none} \\ 1 + L(\text{value}) & \text{if some} \end{cases} \]Recursive Types
// Linked list in compressed form
using PackedList = PackedOptional<
PackedPair<
Packed<int, EliasGamma>,
PackedList // Recursive!
>
>;
PackedList list = {{1, {{2, {{3, std::nullopt}}}}}};
PFC handles recursive structures through lazy evaluation and content-addressable storage.
STL Integration: Zero-Copy Iteration
Random Access Container
template<typename T, Codec C>
class PackedContainer {
std::vector<uint8_t> bytes_;
std::vector<size_t> offsets_; // O(1) random access
public:
class Iterator {
// Proxy value that decodes on access
T operator*() const {
BitReader reader(bytes_, offset_);
return C::decode(reader);
}
};
// STL-compliant interface
Iterator begin();
Iterator end();
T operator[](size_t i);
size_t size() const;
};
Key innovation: Offset vector enables O(1) random access without decompressing everything.
STL Algorithm Compatibility
PackedContainer<uint32_t, EliasGamma> data;
// ... fill data ...
// Works with STL algorithms!
auto max = *std::max_element(data.begin(), data.end());
auto sum = std::accumulate(data.begin(), data.end(), 0);
auto it = std::find(data.begin(), data.end(), 42);
std::sort(data.begin(), data.end());
Cost model:
- Random access: O(1) via offsets + decode time
- Sequential access: Amortized O(1) by caching decoder state
- Sorting: O(n log n) comparisons, each requiring decode
Numeric Codecs
Configurable Floating Point
// 16-bit float (1 sign + 5 exponent + 10 mantissa)
using Float16 = FloatingPoint<16, 5>;
// BFloat16 (1 sign + 8 exponent + 7 mantissa)
using BFloat16 = FloatingPoint<16, 8>;
// Custom precision
using Float24 = FloatingPoint<24, 7>; // 17-bit mantissa
Encoding: IEEE-style format with configurable precision.
Precision/size trade-off:
| Format | Bits | Relative Error |
|---|---|---|
| Float16 | 16 | \(2^{-10} \approx 0.1\%\) |
| BFloat16 | 16 | \(2^{-7} \approx 0.8\%\) |
| Float32 | 32 | \(2^{-23} \approx 10^{-7}\) |
| Float64 | 64 | \(2^{-52} \approx 10^{-16}\) |
Rational Numbers
Packed<Rational, RationalCodec<EliasGamma>> ratio{22, 7}; // π approximation
// Encodes numerator and denominator separately
// Optional: continued fraction approximation for compression
Exact representation: No floating-point rounding errors.
Complex Numbers
// Rectangular form
PackedComplex<FloatingPoint<32>> z{3.0, 4.0}; // 3 + 4i
// Polar form (often more compact for unit circle)
PackedComplexPolar<FloatingPoint<16>, EliasGamma> w{1.0, M_PI/4};
Signed Integer Encoding: Zigzag
// Signed integers via zigzag encoding
Packed<int, Signed<EliasGamma>> value = -5;
Zigzag mapping:
\[ \text{zigzag}(n) = \begin{cases} 2n & \text{if } n \geq 0 \\ -2n - 1 & \text{if } n < 0 \end{cases} \]Maps: \(0 \to 0, -1 \to 1, 1 \to 2, -2 \to 3, 2 \to 4, \ldots\)
Benefit: Small magnitude values (positive or negative) have short encodings.
Type Erasure for Runtime Polymorphism
// Store different packed types in same container
TypeErasedPackedContainer container;
container.push_back(PackedInt(42));
container.push_back(PackedRational(22, 7));
container.push_back(PackedString("hello"));
container.push_back(PackedOptional<PackedInt>(std::nullopt));
// Iterate heterogeneously
for (const auto& item : container) {
// Dynamic dispatch based on stored type
std::visit([](const auto& value) {
std::cout << value << "\n";
}, item);
}
Implementation: External polymorphism pattern—virtual methods operate on opaque byte buffers.
Key property: Preserves zero-copy semantics internally despite runtime polymorphism.
Algorithms on Packed Data
Zero-Copy Transform
template<Codec C, typename Func>
auto packed_transform(
PackedContainer<T, C>& input,
Func f
) -> PackedContainer<decltype(f(T{})), C> {
PackedContainer<decltype(f(T{})), C> result;
for (const auto& value : input) {
result.push_back(f(value)); // Decode, transform, re-encode
}
return result;
}
Parallel Processing
#include <execution>
PackedContainer<double, FloatingPoint<16>> data(1000000);
// ... fill with data ...
// Parallel transform with C++17 execution policies
auto squared = packed_transform(
std::execution::par_unseq,
data.begin(), data.end(),
[](double x) { return x * x; }
);
Performance: Decompression can be parallelized trivially—each thread operates on independent segments.
Performance Analysis
Compression Ratios
For typical distributions:
| Distribution | Codec | Bits/Value | Compression |
|---|---|---|---|
| Geometric | Elias Gamma | ~3.3 | 9.7× |
| Uniform [1,100] | Elias Gamma | ~10 | 3.2× |
| Power law | Fibonacci | ~4.5 | 7.1× |
| Sorted sequence | Delta + Gamma | ~2.8 | 11.4× |
Baseline: 32-bit integers = 32 bits/value
Access Overhead
| Operation | Time (ns) | Notes |
|---|---|---|
| Decode Elias Gamma | 6.9 | Single value |
| Decode Fibonacci | 8.3 | Single value |
| Random access | 11.3 | Offset lookup + decode |
| Sequential access | 7.2 | Amortized with decoder state caching |
Comparison: Raw memory access ≈ 2-3 ns
Trade-off: 3-4× slower access for 3-10× compression.
Memory Layout
// Traditional: 32 bits × 1000 = 4000 bytes
std::vector<uint32_t> traditional(1000);
// PFC: ~400 bytes (compressed) + 8000 bytes (offsets) = 8400 bytes
PackedContainer<uint32_t, EliasGamma> packed(1000);
Problem: Offset vector overhead can dominate for small containers.
Solution: Sparse offset vectors (only every N-th element) for large containers:
// Store offset every 64 elements
PackedContainer<uint32_t, EliasGamma, 64> packed(1000000);
// Offsets: 1000000/64 × 8 = 125KB
// Data: ~400KB
// Total: 525KB vs 4MB raw
Design Principles: Stepanov’s Generic Programming
PFC follows Alex Stepanov’s principles:
1. Regular Types
All types are:
- Copyable:
auto x = y; - Assignable:
x = y; - Comparable:
x == y,x < y - Default constructible:
T x;
PackedInt a = 42;
PackedInt b = a; // Copy
a = 10; // Assign
bool same = (a == b); // Compare
PackedInt c; // Default construct
2. Value Semantics
Objects behave like values, not references:
PackedInt a = 5;
PackedInt b = a;
a = 10;
// b is still 5 (not affected by change to a)
3. Separation of Concerns
Orthogonal components:
- Codecs: How to encode/decode
- Containers: Where to store bytes
- Algorithms: What operations to perform
// Mix and match
PackedContainer<int, EliasGamma> gamma_container;
PackedContainer<int, Fibonacci> fib_container;
// Same algorithms work on both
std::sort(gamma_container.begin(), gamma_container.end());
std::sort(fib_container.begin(), fib_container.end());
4. Zero-Cost Abstractions
Compile-time decisions:
if constexpr (Codec::is_fixed_size()) {
// Optimized path for fixed-size codecs
} else {
// General path with offset tracking
}
No runtime virtual dispatch unless explicitly requested (type erasure).
5. Mathematical Foundation
Based on algebraic structures:
- Monoids: Concatenation of encodings
- Functors:
mapover packed containers - Algebras: Sum and product types
Custom Codec Example
struct DeltaCodec {
uint64_t last_value = 0;
template<BitSink S>
void encode(uint64_t value, S& sink) {
uint64_t delta = value - last_value;
EliasGamma::encode(delta, sink);
last_value = value;
}
template<BitSource S>
uint64_t decode(S& source) {
uint64_t delta = EliasGamma::decode(source);
last_value += delta;
return last_value;
}
static constexpr bool is_fixed_size() { return false; }
};
Use case: Sorted sequences (timestamps, IDs) where deltas are small.
Practical Applications
1. Memory-Constrained Embedded Systems
// Store sensor data in compressed form
PackedContainer<float, FloatingPoint<16>> sensor_readings;
// 50% memory savings with acceptable precision
2. Network Transmission
PackedContainer<uint32_t, EliasGamma> data;
// ... fill ...
// Send compressed bytes directly
send(socket, data.bytes().data(), data.bytes().size());
// Receiver decodes on access
uint32_t value = data[i];
Benefit: No serialization/deserialization overhead.
3. Database Storage
// Column store with per-column compression
struct ColumnStore {
PackedContainer<int, DeltaCodec> ids; // Sorted
PackedContainer<double, FloatingPoint<16>> prices;
PackedContainer<std::string, PrefixedString> names;
};
4. Checkpoint/Resume
// Serialize program state
PackedTuple<
Packed<int, EliasGamma>,
PackedList<Packed<double, Float32>>,
PackedOptional<PackedString>
> checkpoint = make_checkpoint();
// Write directly to disk
write(fd, checkpoint.bytes(), checkpoint.size());
Limitations
1. Random access overhead: Offset vectors can be large
Solution: Sparse offsets for large containers
2. Update performance: Inserting requires re-encoding
Solution: Use for append-heavy or read-heavy workloads
3. Fixed codecs: Can’t adapt to data distribution dynamically
Solution: Choose codec based on expected distribution
Future Work
- Huffman coding: Adaptive to actual data distribution
- Arithmetic coding: Asymptotically optimal
- SIMD acceleration: Parallel decoding of multiple values
- Compression-aware algorithms: Exploit encoding structure
- Lazy offset construction: Build offsets on-demand
Comparison with Other Systems
| System | Approach | Zero-Copy | STL-compatible | Type-safe |
|---|---|---|---|---|
| PFC | Intrinsic encoding | ✅ | ✅ | ✅ |
| zlib | External compression | ❌ | ❌ | ❌ |
| Protocol Buffers | Schema-based | ❌ | ❌ | ✅ |
| FlatBuffers | Zero-copy access | ✅ | ❌ | ✅ |
| Cap’n Proto | Zero-copy RPC | ✅ | ❌ | ✅ |
PFC’s niche: Zero-copy + STL compatibility + type safety + flexible compression.
Quick Start
# Header-only library
git clone https://github.com/queelius/pfc.git
cd pfc
mkdir build && cd build
cmake .. -DCMAKE_CXX_STANDARD=20
cmake --build .
./pfc_tests
Basic usage:
#include <pfc/pfc.hpp>
using namespace pfc;
// Compress integers
std::vector<uint32_t> data = {1, 2, 3, 5, 8, 13, 21};
auto compressed = compress<EliasGamma>(data);
// Decompress
auto restored = decompress<EliasGamma, uint32_t>(compressed);
// Or use packed container directly
PackedContainer<uint32_t, EliasGamma> container;
for (uint32_t value : data) {
container.push_back(value);
}
// Access like normal container
uint32_t first = container[0];
std::sort(container.begin(), container.end());
Design Philosophy
🎯 Compression as Type Property: Not an external operation, but intrinsic to the type
0️⃣ Zero-Copy Invariant: In-memory = wire format
🧩 Composability: Prefix-free codes enable natural composition
📐 Mathematical Rigor: Based on universal coding theory
🔧 Generic Programming: Follows Stepanov’s principles
⚡ Zero-Cost Abstractions: Compile-time optimizations
Resources
- Repository: github.com/queelius/pfc
- Whitepaper: papers/whitepaper.pdf (16 pages, academic format)
- Examples: See
examples/directory - Tests: Comprehensive test suite in
tests/
License
MIT
PFC: Where compression becomes intrinsic. Zero-copy data structures through prefix-free codes and generic programming.
Discussion