Entropy Maps: When Hashing Meets Information Theory

What if you could implement any function using nothing but a hash function and some clever encoding? No lookup tables. No stored keys. Just hashing.

This is the core idea behind entropy maps—and the mathematics tells us exactly how efficient they can be.

The Insight

An entropy map implements a function $f: X \to Y$ by:

  1. Hashing each input $x$ to get a bit string
  2. Checking if a prefix of that hash matches a code for some output $y$
  3. Returning the decoded output

The trick: we assign each output value several codewords from a shared prefix-free code. High-probability outputs get more codewords. The result? Space usage equals the Shannon entropy of the output distribution.
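
To make the construction concrete, here is a toy sketch in Python. Everything in it—the hand-picked codewords, the brute-force seed search, the function names—is an illustrative assumption, not the construction from the full post: the exhaustive seed search merely stands in for a real encoder and is only feasible for tiny key sets.

```python
import hashlib

def hash_bits(seed: int, x: str, n_bits: int) -> str:
    """Hash (seed, x) to an n_bits-long bit string; any strong hash would do."""
    digest = hashlib.sha256(f"{seed}|{x}".encode()).digest()
    return "".join(f"{byte:08b}" for byte in digest)[:n_bits]

# A prefix-free code over the outputs. "a" is the high-probability output,
# so it owns half the code space; "b" and "c" own a quarter each.
CODES = [("0", "a"), ("10", "b"), ("11", "c")]

def prefix_decode(bits: str, codes) -> str:
    """Return the output whose codeword is a prefix of `bits`."""
    for codeword, y in codes:
        if bits.startswith(codeword):
            return y
    raise ValueError("no codeword matched")

def build_entropy_map(f, keys, codes, n_bits: int = 8) -> int:
    """Brute-force a seed so that every key's hash prefix decodes to f(key).

    The returned seed is the entire data structure: no table, no stored keys.
    """
    seed = 0
    while True:
        if all(prefix_decode(hash_bits(seed, x, n_bits), codes) == f(x) for x in keys):
            return seed
        seed += 1

def query(seed: int, x: str, codes, n_bits: int = 8) -> str:
    """Evaluate the map: hash the input, take a prefix, decode it."""
    return prefix_decode(hash_bits(seed, x, n_bits), codes)

if __name__ == "__main__":
    truth = {"u": "a", "v": "a", "w": "b", "z": "c"}
    seed = build_entropy_map(truth.get, list(truth), CODES)
    assert all(query(seed, x, CODES) == truth[x] for x in truth)
    print("seed found:", seed)
```

In this toy version the expected number of seeds to try grows like $2^{nH}$, so writing the successful seed down takes roughly $nH$ bits—one informal way to see where the entropy bound comes from.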

This isn’t just elegant—it’s optimal.

Why This Matters

Bloom filters are entropy maps in disguise. When you test membership in a Bloom filter, you’re really asking: “does this hash prefix decode to member or non-member?”

The entropy map lens reveals what’s really happening:

  • For a key outside the set, the “member” code is hit with probability ε (the false positive rate)
  • The “non-member” code is hit with probability 1 − ε
  • Total space: the entropy of this distribution
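
As a quick check on that last bullet, here is the entropy of the two-outcome distribution at a few sample false-positive rates; the function name and the rates are illustrative, not taken from the post.

```python
import math

def binary_entropy(eps: float) -> float:
    """Shannon entropy, in bits, of a two-outcome distribution {eps, 1 - eps}."""
    if eps <= 0.0 or eps >= 1.0:
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)

for eps in (0.1, 0.01, 0.001):
    print(f"false positive rate {eps}: H = {binary_entropy(eps):.4f} bits")
```

The smaller the false positive rate, the smaller the entropy—and, under this lens, the less space the “member or not” answer needs.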

This framework unifies Bloom filters, Count-Min sketches, and other probabilistic structures under one information-theoretic roof.

The Broader Pattern

Entropy maps taught me something fundamental: information theory tells us the limits of what’s possible, and hash functions let us approach those limits.

When you understand that space efficiency is really about entropy, you gain a new design principle. You’re not just building data structures—you’re doing lossy compression of functions.

This connects directly to Bernoulli types—my broader framework for approximate computing. The entropy map is the construction that makes the theory practical.


For the full mathematical treatment with confusion matrices and algorithms, see the detailed entropy maps post.
