Skip to main content

Packed Containers: Zero-Waste Bit-Level Storage in C++

What if your containers wasted zero bits?

The Problem

C++ containers align everything to byte boundaries. A char takes 8 bits even if you only need 3. An enum with four variants burns a full byte when 2 bits would suffice. A vector<bool> is the standard library’s one concession to bit packing, but it only works for booleans and is widely considered a design mistake.

The waste adds up. If you’re storing millions of small values—symbols from a finite alphabet, flags, small integers—the overhead from byte alignment can exceed the actual data.

I wanted a general solution: containers that pack arbitrary value types at the bit level, with zero wasted bits.

The Idea

Consider a value type \(X\) with elements that can be encoded as bit strings. A packed container \(P(X)\) stores these encodings contiguously in a bit array, with no alignment padding between elements.

The key abstraction is the codec—an object that encodes values to bits and decodes them back:

template <OutputIterator O>
O codec::encode(value_type x, O out);  // write bits, return next position

template <InputIterator I>
optional<value_type> codec::decode(I in);  // read bits, return value

Different codecs give different tradeoffs. A fixed-width codec assigns \(k\) bits to every value, giving O(1) random access via simple arithmetic. A variable-width codec (like a prefix-free code) can be more compact but requires an index for random access.

Fixed-Width Packed Arrays

The simplest case. Every element gets exactly \(k\) bits. The underlying storage is a bit matrix where each row is one element:

IndexElementEncoding
0\(x_1\)\(\{0,1\}^k\)
1\(x_2\)\(\{0,1\}^k\)
2\(x_3\)\(\{0,1\}^k\)

Random access is O(1): element \(i\) starts at bit \(i \cdot k\). No pointer chasing, no indirection. Just arithmetic.

For example, suppose you have a set of three symbols \(\{a, b, c\}\). A fixed-width code needs \(\lceil \log_2 3 \rceil = 2\) bits per symbol. An array of a million symbols takes 250 KB instead of 1 MB. That’s a 4x reduction from one simple idea.

Variable-Width Packed Arrays

When values have wildly different frequencies, fixed-width codes waste bits on common values. A prefix-free code assigns shorter codes to frequent values:

SymbolCode
a0
b10
c110

Now the packed array stores variable-length encodings back-to-back. The catch: you can no longer jump to element \(i\) with simple arithmetic. You need an index—a separate array of bit addresses:

IndexBit AddressEncoding
00\(\{0,1\}^{n_1}\)
1\(n_1\)\(\{0,1\}^{n_2}\)
2\(n_1 + n_2\)\(\{0,1\}^{n_3}\)

The index costs \(O(n \log m)\) bits where \(m\) is the total bit length, but the savings on the data itself can more than compensate.

If the array is immutable, the entire bit sequence can be one contiguous allocation with padding only on the last byte. This is cache-friendly and compact.

Implementation

The library is header-only C++ with a few core components:

  • bit_array and bit_matrix for raw bit storage and 2D bit access
  • packed_array for fixed-width packed containers with O(1) random access
  • variable_length_packed_array for variable-width with indexed access
  • packed_list and packed_map for linked and associative variants
  • Codec, RegularType, and Iterator concepts defining the requirements

The design follows Stepanov’s approach: algorithms are parameterized over concepts, not concrete types. Any type with a conforming codec can be packed.

What This Is Really About

This is an exercise in taking information theory seriously at the systems level. Shannon tells us the minimum bits needed to represent a source. Standard containers ignore this completely. Packed containers close the gap.

It’s also an exercise in C++ generic programming. The codec abstraction lets you swap encoding strategies without changing container code. Fixed-width, Huffman, arithmetic—the container doesn’t care.

The code is on GitHub. It’s experimental—I built it to explore the ideas, not to ship a production library. But the ideas are solid, and the implementation is a decent proof of concept.

Discussion