What if your containers wasted zero bits?
The Problem
C++ containers align everything to byte boundaries. A char takes 8 bits even if you only need 3. An enum with four variants burns a full byte when 2 bits would suffice. A vector<bool> is the standard library’s one concession to bit packing, but it only works for booleans and is widely considered a design mistake.
The waste adds up. If you’re storing millions of small values—symbols from a finite alphabet, flags, small integers—the overhead from byte alignment can exceed the actual data.
I wanted a general solution: containers that pack arbitrary value types at the bit level, with zero wasted bits.
The Idea
Consider a value type \(X\) with elements that can be encoded as bit strings. A packed container \(P(X)\) stores these encodings contiguously in a bit array, with no alignment padding between elements.
The key abstraction is the codec—an object that encodes values to bits and decodes them back:
template <OutputIterator O>
O codec::encode(value_type x, O out); // write bits, return next position
template <InputIterator I>
optional<value_type> codec::decode(I in); // read bits, return value
Different codecs give different tradeoffs. A fixed-width codec assigns \(k\) bits to every value, giving O(1) random access via simple arithmetic. A variable-width codec (like a prefix-free code) can be more compact but requires an index for random access.
Fixed-Width Packed Arrays
The simplest case. Every element gets exactly \(k\) bits. The underlying storage is a bit matrix where each row is one element:
| Index | Element | Encoding |
|---|---|---|
| 0 | \(x_1\) | \(\{0,1\}^k\) |
| 1 | \(x_2\) | \(\{0,1\}^k\) |
| 2 | \(x_3\) | \(\{0,1\}^k\) |
Random access is O(1): element \(i\) starts at bit \(i \cdot k\). No pointer chasing, no indirection. Just arithmetic.
For example, suppose you have a set of three symbols \(\{a, b, c\}\). A fixed-width code needs \(\lceil \log_2 3 \rceil = 2\) bits per symbol. An array of a million symbols takes 250 KB instead of 1 MB. That’s a 4x reduction from one simple idea.
Variable-Width Packed Arrays
When values have wildly different frequencies, fixed-width codes waste bits on common values. A prefix-free code assigns shorter codes to frequent values:
| Symbol | Code |
|---|---|
a | 0 |
b | 10 |
c | 110 |
Now the packed array stores variable-length encodings back-to-back. The catch: you can no longer jump to element \(i\) with simple arithmetic. You need an index—a separate array of bit addresses:
| Index | Bit Address | Encoding |
|---|---|---|
| 0 | 0 | \(\{0,1\}^{n_1}\) |
| 1 | \(n_1\) | \(\{0,1\}^{n_2}\) |
| 2 | \(n_1 + n_2\) | \(\{0,1\}^{n_3}\) |
The index costs \(O(n \log m)\) bits where \(m\) is the total bit length, but the savings on the data itself can more than compensate.
If the array is immutable, the entire bit sequence can be one contiguous allocation with padding only on the last byte. This is cache-friendly and compact.
Implementation
The library is header-only C++ with a few core components:
bit_arrayandbit_matrixfor raw bit storage and 2D bit accesspacked_arrayfor fixed-width packed containers with O(1) random accessvariable_length_packed_arrayfor variable-width with indexed accesspacked_listandpacked_mapfor linked and associative variants- Codec, RegularType, and Iterator concepts defining the requirements
The design follows Stepanov’s approach: algorithms are parameterized over concepts, not concrete types. Any type with a conforming codec can be packed.
What This Is Really About
This is an exercise in taking information theory seriously at the systems level. Shannon tells us the minimum bits needed to represent a source. Standard containers ignore this completely. Packed containers close the gap.
It’s also an exercise in C++ generic programming. The codec abstraction lets you swap encoding strategies without changing container code. Fixed-width, Huffman, arithmetic—the container doesn’t care.
The code is on GitHub. It’s experimental—I built it to explore the ideas, not to ship a production library. But the ideas are solid, and the implementation is a decent proof of concept.
Discussion