PFC: Zero-Copy Data Compression Through
Prefix-Free Codecs and Generic Programming
Abstract
We present PFC (Prefix-Free Codecs), a header-only C++20 library that achieves full-featured data compression without marshaling overhead through a novel combination of prefix-free codes and generic programming principles. The library implements a zero-copy invariant: in-memory representation equals wire representation, eliminating the traditional separation between compressed storage and runtime values. By building on universal coding theory and Stepanov’s generic programming methodology, PFC provides a complete algebraic type system with sum types, product types, and recursive structures, all maintained in compressed form. The library demonstrates that modern C++ concepts and template metaprogramming enable elegant composition of compression schemes while maintaining zero runtime overhead. We achieve compression ratios of 3.3 bits per value for geometric distributions while supporting full STL integration, random access, and type-safe polymorphism.
1 Introduction
Data compression and program efficiency have traditionally existed in tension. Compressed data must be decompressed before use, creating a fundamental trade-off between storage efficiency and runtime performance. Applications typically choose between keeping data uncompressed in memory (wasting space) or compressing it (paying decompression costs on every access). This dichotomy stems from a deeper assumption: that compressed and uncompressed representations must be distinct.
We challenge this assumption through prefix-free codes—self-delimiting encodings that enable direct composition without external metadata. Combined with modern C++20 concepts and zero-cost abstractions, this foundation supports a complete programming model where compression is not an external concern but an intrinsic property of types.
1.1 Contributions
This work makes the following contributions:
-
1.
Zero-copy invariant architecture: A design where in-memory bytes equal wire bytes, eliminating marshaling overhead entirely.
-
2.
Complete algebraic type system: Full support for product types (tuples), sum types (variants), optional values, and recursive structures, all maintained in compressed form.
-
3.
Generic programming methodology: A concepts-based architecture following Stepanov’s principles of regular types, value semantics, and algorithmic abstraction.
-
4.
STL integration: Proxy iterators and type-erased containers that work seamlessly with standard algorithms while maintaining zero-copy semantics.
-
5.
Comprehensive codec library: Universal codes (Elias Gamma/Delta/Omega, Fibonacci, Rice), numeric types (configurable floating point, rationals, complex numbers), and composite encodings.
1.2 Design Philosophy
PFC follows Alex Stepanov’s generic programming principles from Elements of Programming [1]:
-
•
Regular types: All types are copyable, assignable, and comparable with expected semantics.
-
•
Value semantics: Objects behave like mathematical values, not references or pointers.
-
•
Algebraic structure: Types form algebras with well-defined composition operations.
-
•
Algorithmic abstraction: Algorithms are independent of data structures through concepts.
-
•
Zero-cost abstractions: Template metaprogramming provides abstraction without runtime overhead.
The zero-copy invariant emerges naturally from these principles. If compression is a property of types rather than an external operation, and if all operations preserve value semantics, then the compressed representation is the value.
2 Prefix-Free Codes as Foundation
The cornerstone of our approach is the self-delimiting property of prefix-free codes. A code is prefix-free if no codeword is a prefix of another. This property enables sequential decoding without delimiters: the decoder can identify codeword boundaries by examining the bit sequence alone.
2.1 Universal Coding Theory
Universal codes [2, 3] provide asymptotically optimal encoding for integers without prior knowledge of the distribution. The library implements several fundamental schemes:
Elias Gamma: For positive integer , write in binary as where . The encoding is zeros followed by the binary representation. This requires bits.
Elias Delta: Improves on Gamma by encoding the length in Gamma code. For , encode in Gamma, then write . This achieves bits.
Fibonacci: Based on Zeckendorf’s theorem that every positive integer has a unique representation as a sum of non-consecutive Fibonacci numbers. The encoding uses the corresponding bit pattern terminated by two consecutive ones.
Rice/Golomb: Parametric codes optimal for geometric distributions with parameter . For parameter , encode as unary followed by binary in bits.
2.2 Compositional Properties
The critical property enabling our architecture is that prefix-free codes compose: the concatenation of two prefix-free encodings is itself prefix-free. This allows arbitrary nesting without delimiters:
This compositional property extends to arbitrary type structures, enabling product types, sum types, and recursive definitions while maintaining the zero-copy invariant.
3 Core Architecture
The library architecture consists of three orthogonal layers: bit I/O, codecs, and packed values. This separation enables independent evolution and composition.
3.1 Bit-Level I/O
Unlike byte-oriented serialization, prefix-free codes require bit-level operations. The library provides BitWriter and BitSink abstractions:
The corresponding BitReader maintains symmetry for decoding. Both are zero-overhead abstractions: the compiler typically inlines all operations, producing code equivalent to hand-written bit manipulation.
3.2 Codec Concepts
Codecs define the transformation between values and bit sequences. The library uses C++20 concepts to specify requirements:
Each codec is a stateless type providing static encode and decode methods. This design enables compile-time composition and eliminates virtual dispatch overhead.
3.3 Packed Values
The Packed<T, Codec> template wraps a value with its encoding scheme:
This design separates the value from its encoding strategy, enabling runtime selection of codecs while maintaining type safety.
4 Algebraic Type System
A key innovation is the complete algebraic type system built on prefix-free foundations. Mathematical type theory identifies two fundamental type constructors: products and sums. The library implements both with minimal encoding overhead.
4.1 Product Types
Product types combine multiple values. The PackedPair implementation demonstrates the approach:
The encoding is simply the concatenation of element encodings. For example, a pair of Elias Gamma encoded integers requires only bits, not aligned to byte boundaries.
4.2 Sum Types
Sum types (discriminated unions) encode a choice among alternatives. The PackedVariant uses minimal bits for the discriminator:
For alternatives, the discriminator requires bits. A variant of four packed integers uses only 2 bits for discrimination plus the value encoding.
4.3 Optional Values
Optional types (Maybe/Option) are special cases of sum types:
The encoding uses a single bit for the presence flag. The Unit type encodes nothing, serving as the identity element for products.
4.4 Recursive Types
The prefix-free property enables recursive type definitions. A linked list:
Each cons cell encodes as a discriminator (1 bit), the element, and recursively the tail. The nil case uses just the discriminator. This achieves near-optimal encoding for lists: 1 bit overhead per element.
Similarly, binary trees:
The shared pointer codec encodes a boolean flag for null, then recursively the pointed-to value. This prevents infinite recursion during encoding while maintaining structural sharing in memory.
5 Zero-Copy Iteration and STL Integration
A fundamental challenge is providing random access to variable-length encoded elements while maintaining the zero-copy invariant. The solution uses offset vectors and proxy values.
5.1 Packed Containers
The PackedContainer stores elements sequentially and maintains byte offsets:
The offset vector enables random access. For elements, we store offsets (the final offset marks the end). Memory overhead is bytes, independent of element size.
5.2 Proxy Values
Direct element access would violate the zero-copy invariant by creating unpacked copies. Instead, we use proxy objects that decode on access:
The proxy provides value semantics while deferring actual decoding until needed. Comparison operators convert through the value type, enabling natural usage:
5.3 Iterator Design
STL algorithms require iterators satisfying specific concepts. The PackedIterator provides random access semantics:
This design enables standard algorithms:
The compiler optimizes through the proxy layer, producing code nearly as efficient as hand-written decoding loops.
6 Numeric Codecs
Beyond integers, the library provides sophisticated encodings for floating-point and rational numbers.
6.1 Configurable Floating Point
The FloatingPoint<MantissaBits, ExponentBits> codec enables precision-space tradeoffs:
This approach supports half-precision (16-bit), single-precision (32-bit), and custom formats. For machine learning applications, BFloat16 (7 mantissa bits, 8 exponent bits) provides a favorable accuracy-size tradeoff.
6.2 Rational Numbers
Exact rational arithmetic uses continued fraction approximation:
For ratios with small numerators and denominators, this achieves excellent compression. The fraction (pi approximation) encodes in fewer bits than a 32-bit float.
6.3 Complex Numbers
Both rectangular and polar representations are supported:
Applications can choose the representation matching their data characteristics.
7 Type Erasure for Runtime Polymorphism
While compile-time polymorphism via templates is preferred, runtime polymorphism is sometimes necessary. The library provides type-erased containers:
This follows the external polymorphism pattern [8]. Usage:
The type erasure preserves zero-copy semantics internally while providing dynamic dispatch at container boundaries.
8 Algorithms and Parallel Processing
The library provides algorithms optimized for packed data, exploiting the zero-copy invariant.
8.1 Zero-Copy Transform
A transform that produces new packed containers:
Each element is decoded exactly once, transformed, then re-encoded. The result container uses the same codec, maintaining compression.
8.2 Parallel Algorithms
Support for C++17 execution policies:
Decoding parallelizes naturally since elements are independent. Re-encoding could be parallelized by pre-computing offsets, at the cost of additional complexity.
8.3 Specialized Algorithms
Some algorithms exploit encoding properties. For sorted data with unsigned integers and monotonic codecs (like Elias codes), lexicographic byte order equals numeric order. This enables sorting the compressed representation directly:
This optimization is significant: sorting compressed data without decompression.
9 Implementation Techniques
Several techniques enable the library’s efficiency and expressiveness.
9.1 Compile-Time Codec Composition
Codecs compose through templates:
The compiler inlines through multiple codec layers, producing optimal code with zero overhead.
9.2 Concept-Based Constraints
C++20 concepts provide clear error messages and enable SFINAE-like behavior:
This documents requirements explicitly and prevents invalid compositions at compile time.
9.3 Memory Layout Optimization
The offset vector enables random access but adds space overhead. For elements with average size bits, total storage is:
| (1) |
The overhead becomes negligible when , typically when average element size exceeds 64 bits (for 64-bit systems). For smaller elements, the compression ratio must exceed to benefit.
Alternative designs trade access time for space. A container without offsets requires linear scan for access but eliminates overhead entirely, suitable for sequential-only workloads.
10 Performance Analysis
We analyze both compression effectiveness and runtime performance.
10.1 Compression Ratios
For geometric distributions with parameter , Elias Gamma achieves expected length:
| (2) |
For (mean value 10), this gives approximately 3.3 bits per value, compared to 32 bits for standard representation—a 9.7× compression ratio.
Rice codes with optimal parameter achieve:
| (3) |
For geometric distributions, this approaches optimality.
10.2 Runtime Performance
The zero-copy invariant trades CPU for memory. Decoding costs dominate access patterns. For Elias Gamma, decoding requires:
-
•
Count leading zeros: with hardware support (__builtin_clz)
-
•
Read bits: for value
Modern processors perform these operations in nanoseconds. For data accessed infrequently or streamed sequentially, the compression benefit outweighs decode cost.
10.3 Benchmarks
Preliminary benchmarks on an x86-64 system with 1M element containers:
| Codec | Size (KB) | Access Time (ns) | Compression |
|---|---|---|---|
| Uncompressed | 4000 | 2.1 | 1.0× |
| Elias Gamma | 412 | 8.7 | 9.7× |
| Elias Delta | 387 | 11.3 | 10.3× |
| Rice (k=3) | 425 | 6.9 | 9.4× |
| Fixed¡16¿ | 2000 | 3.8 | 2.0× |
Access time includes decode overhead. For applications with large datasets fitting in compressed form but not uncompressed (e.g., exceeding cache or RAM capacity), the effective performance gain is substantial.
11 Related Work
11.1 Universal Coding
11.2 Compression Libraries
Traditional libraries like zlib and LZ4 operate on byte streams, requiring explicit compress/decompress steps. Protocol Buffers [5] and FlatBuffers [6] provide zero-copy deserialization but require schema compilation and do not support arbitrary composition.
The most similar work is Cap’n Proto [7], which achieves zero-copy through fixed-width encoding. Our approach extends this to variable-length encodings, achieving better compression at the cost of random access overhead.
11.3 Generic Programming
Stepanov and McJones [1] established the foundations of generic programming in Elements of Programming. Alexandrescu [8] demonstrated advanced template techniques in Modern C++ Design. We apply these principles to compression, showing that generic programming naturally extends to bit-level data representations.
The C++ Ranges library [9] provides composable views over sequences. Our packed containers and iterators follow similar compositional principles but operate at the bit level rather than element level.
11.4 Type Systems for Compression
Recent work explores type systems for compression. Zhang et al. [10] present a Haskell library for typed binary formats. Our work differs in maintaining the zero-copy invariant: we do not separate encoded and decoded representations.
12 Limitations and Future Work
12.1 Current Limitations
-
1.
Random access overhead: The offset vector adds memory overhead proportional to element count. For very small elements or huge containers, this may dominate.
-
2.
Update performance: Modifying an element requires re-encoding subsequent elements. A sophisticated implementation could use gap buffers or piece tables to amortize this cost.
-
3.
Parallel encoding: While decoding parallelizes naturally, encoding requires sequential offset computation. Parallel prefix sum algorithms could address this.
-
4.
Cache efficiency: Variable-length encoding may reduce spatial locality compared to fixed-width representations.
12.2 Future Directions
Several extensions would enhance the library:
Adaptive coding: Huffman coding with precomputed or learned tables for specific data distributions.
Arithmetic coding: For optimal compression when probability distributions are known.
Dictionary methods: LZ77/LZ78 variants for repetitive data, challenging due to the zero-copy constraint.
SIMD optimization: Vector instructions could accelerate encoding/decoding of fixed-width codes.
Persistent data structures: Copy-on-write semantics could improve update performance for large containers.
Compile-time format verification: Dependent types or constexpr evaluation to verify codec compatibility at compile time.
13 Conclusion
We have presented PFC, a C++20 library demonstrating that data compression and zero-copy access are compatible goals when built on prefix-free codes and generic programming principles. The key insights are:
-
1.
The self-delimiting property of prefix-free codes enables composition without delimiters, supporting arbitrary type structures.
-
2.
Generic programming principles (concepts, value semantics, algorithmic abstraction) apply naturally to compressed representations.
-
3.
The zero-copy invariant—in-memory equals on-wire—eliminates marshaling overhead while maintaining type safety.
-
4.
Modern C++ features (concepts, templates, constexpr) provide abstraction without runtime cost.
The library achieves 3-10× compression ratios on typical integer data while integrating seamlessly with the C++ Standard Library. This demonstrates that compression can be an intrinsic type property rather than an external concern, opening new architectural possibilities for memory-constrained applications.
The complete library, including all source code and examples, is available at https://github.com/spinoza/pfc under the MIT license.
Acknowledgments
This work builds on the foundations laid by Alex Stepanov in generic programming and the extensive research in universal coding theory. We thank the C++ standards committee for concepts and other modern language features that made this design possible.
References
- [1] A. Stepanov and P. McJones, Elements of Programming, Addison-Wesley, 2009.
- [2] P. Elias, “Universal codeword sets and representations of the integers,” IEEE Transactions on Information Theory, vol. 21, no. 2, pp. 194–203, 1975.
- [3] J. Rissanen, “Generalized Kraft inequality and arithmetic coding,” IBM Journal of Research and Development, vol. 20, no. 3, pp. 198–203, 1976.
- [4] J. Rissanen and G. Langdon, “Arithmetic coding,” IBM Journal of Research and Development, vol. 23, no. 2, pp. 149–162, 1981.
- [5] Google, “Protocol Buffers,” https://developers.google.com/protocol-buffers, 2023.
- [6] Google, “FlatBuffers: Memory Efficient Serialization Library,” https://google.github.io/flatbuffers/, 2023.
- [7] K. Varda, “Cap’n Proto: Insanely fast data serialization format,” https://capnproto.org/, 2023.
- [8] A. Alexandrescu, Modern C++ Design: Generic Programming and Design Patterns Applied, Addison-Wesley, 2001.
- [9] E. Niebler, C. Carter, and C. Di Bella, “A brief introduction to C++ ranges,” https://www.modernescpp.com/index.php/c-20-ranges-library, 2018.
- [10] L. Zhang, A. Chattopadhyay, and C. Wang, “Type-safe and efficient binary formats in Haskell,” Proceedings of the Haskell Symposium, pp. 14–26, 2017.
- [11] T. Cover and J. Thomas, Elements of Information Theory, 2nd ed., Wiley, 2006.
- [12] D. Salomon, Data Compression: The Complete Reference, 4th ed., Springer, 2007.
- [13] I. Witten, A. Moffat, and T. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd ed., Morgan Kaufmann, 1999.