Packed Containers: Zero-Waste Bit-Level Storage in C++

October 15, 2018 4 min read Updated: February 24, 2026

What if your containers wasted zero bits?

The Problem

C++ containers align everything to byte boundaries. A char takes 8 bits even if you only need 3. An enum with four variants burns a full byte when 2 bits would suffice. A vector<bool> is the standard library’s one concession to bit packing, but it only works for booleans and is widely considered a design mistake.

The waste adds up. If you are storing millions of small values (symbols from a finite alphabet, flags, small integers) the overhead from byte alignment can exceed the actual data.

I wanted a general solution: containers that pack arbitrary value types at the bit level, with zero wasted bits.

The Idea

Consider a value type \(X\) with elements that can be encoded as bit strings. A packed container \(P(X)\) stores these encodings contiguously in a bit array, with no alignment padding between elements.

The key abstraction is the codec, an object that encodes values to bits and decodes them back:

template <OutputIterator O>
O codec::encode(value_type x, O out);  // write bits, return next position

template <InputIterator I>
optional<value_type> codec::decode(I in);  // read bits, return value

Different codecs give different tradeoffs. A fixed-width codec assigns \(k\) bits to every value, giving O(1) random access via simple arithmetic. A variable-width codec (like a prefix-free code) can be more compact but requires an index for random access.

Fixed-Width Packed Arrays

The simplest case. Every element gets exactly \(k\) bits. The underlying storage is a bit matrix where each row is one element:

Index	Element	Encoding
0	\(x_1\)	\(\{0,1\}^k\)
1	\(x_2\)	\(\{0,1\}^k\)
2	\(x_3\)	\(\{0,1\}^k\)

Random access is O(1): element \(i\) starts at bit \(i \cdot k\). No pointer chasing, no indirection. Just arithmetic.

For example, suppose you have a set of three symbols \(\{a, b, c\}\). A fixed-width code needs \(\lceil \log_2 3 \rceil = 2\) bits per symbol. An array of a million symbols takes 250 KB instead of 1 MB. That is a 4x reduction from one simple idea.

Variable-Width Packed Arrays

When values have wildly different frequencies, fixed-width codes waste bits on common values. A prefix-free code assigns shorter codes to frequent values:

Symbol	Code
`a`	`0`
`b`	`10`
`c`	`110`

Now the packed array stores variable-length encodings back-to-back. The catch: you can no longer jump to element \(i\) with simple arithmetic. You need an index, a separate array of bit addresses:

Index	Bit Address	Encoding
0	0	\(\{0,1\}^{n_1}\)
1	\(n_1\)	\(\{0,1\}^{n_2}\)
2	\(n_1 + n_2\)	\(\{0,1\}^{n_3}\)

The index costs \(O(n \log m)\) bits where \(m\) is the total bit length, but the savings on the data itself can more than compensate.

If the array is immutable, the entire bit sequence can be one contiguous allocation with padding only on the last byte. This is cache-friendly and compact.

Implementation

The library is header-only C++ with a few core components:

bit_array and bit_matrix for raw bit storage and 2D bit access
packed_array for fixed-width packed containers with O(1) random access
variable_length_packed_array for variable-width with indexed access
packed_list and packed_map for linked and associative variants
Codec, RegularType, and Iterator concepts defining the requirements

The design follows Stepanov’s approach: algorithms are parameterized over concepts, not concrete types. Any type with a conforming codec can be packed.

What This Is Really About

This is an exercise in taking information theory seriously at the systems level. Shannon tells us the minimum bits needed to represent a source. Standard containers ignore this completely. Packed containers close the gap.

It is also an exercise in C++ generic programming. The codec abstraction lets you swap encoding strategies without changing container code. Fixed-width, Huffman, arithmetic: the container does not care.

The code is on GitHub. It is experimental. I built it to explore the ideas, not to ship a production library. But the ideas are solid, and the implementation is a decent proof of concept.

The Problem

The Idea

Fixed-Width Packed Arrays

Variable-Width Packed Arrays

Implementation

What This Is Really About

Related Posts

Duality: The Hidden Structure of Opposites

The Motivating Example …

Seeing Structure First

The Question Behind the Code

Runtime Polymorphism Without Inheritance

The Problem

Numerical Integration with Generic Concepts

Overview

Teaching Linear Algebra with C++20 Concepts

Discussion