One of the best ideas I ran into during my CS master's work is the Bloom filter, a data structure that gives you probabilistic membership testing with extraordinary space efficiency.
What a Bloom Filter Tells You
A Bloom filter answers one question: is this element in the set? It gives you two kinds of answers:
- Definitely not in the set (100% certain)
- Probably in the set (with a tunable false positive rate)
This asymmetry is the point. A Bloom filter never misses an element it contains (perfect recall); what you give up is perfect precision, trading occasional false positives for massive space savings. For many applications, that is exactly the right tradeoff.
How It Works
You allocate an array of m bits, all initialized to 0. You choose k independent hash functions. To insert an element, hash it with all k functions and set those k bit positions to 1. To query, hash the element with the same k functions and check if all k positions are 1. If any position is 0, the element is definitely not in the set. If all are 1, it is probably in the set (but could be a false positive from hash collisions).
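The insert and query steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the k "independent" hash functions are simulated here by salting SHA-256 with an index, and the class name and parameters are my own choices.

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m = m            # number of bits in the filter
        self.k = k            # number of hash functions
        self.bits = [0] * m   # bit array, all initialized to 0

    def _positions(self, item: str):
        # Simulate k independent hash functions by salting SHA-256
        # with the function index i, then reducing modulo m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item: str):
        # Insert: set all k hashed bit positions to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # Query: if any of the k positions is 0, the item is
        # definitely absent; if all are 1, it is probably present.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=1024, k=7)
bf.add("alice")
print(bf.might_contain("alice"))    # True: no false negatives, ever
print(bf.might_contain("mallory"))  # almost certainly False here
```

Note that deleting is not supported: clearing a bit could erase evidence of other elements that hash to the same position, which is why variants like counting Bloom filters exist.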
Why This Matters
In encrypted search systems (my thesis area), space efficiency is critical. When you are building secure indices, every bit counts. Bloom filters let you:
- Test membership in O(k) time with k hash functions
- Use dramatically less space than hash tables
- Tune false positive rates by adjusting filter size
The math is clean: for n items, m bits, and k hash functions, the false positive rate is approximately (1 - e^(-kn/m))^k, and it is minimized by choosing k = (m/n) ln 2. No hidden complexity. You pick the false positive rate you can tolerate and the formula tells you how much space you need.
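Plugging numbers into that formula makes the space savings concrete. The sketch below uses the standard approximations m = -n ln p / (ln 2)^2 and k = (m/n) ln 2; the function names are mine.

```python
import math

def bits_needed(n: int, p: float) -> int:
    """Bits m required to hold n items at target false positive rate p."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))

def optimal_k(m: int, n: int) -> int:
    """Optimal number of hash functions for m bits and n items."""
    return max(1, round((m / n) * math.log(2)))

def false_positive_rate(n: int, m: int, k: int) -> float:
    """Approximate false positive rate: (1 - e^(-kn/m))^k."""
    return (1 - math.exp(-k * n / m)) ** k

n, p = 1_000_000, 0.01           # a million items, 1% false positives
m = bits_needed(n, p)            # about 9.6 million bits, roughly 1.2 MB
k = optimal_k(m, n)              # 7 hash functions
print(m, k, false_positive_rate(n, m, k))
```

About 1.2 MB for a million items at a 1% false positive rate, versus storing a million keys outright; that is the space efficiency the formula promises.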
The Broader Lesson
Bloom filters taught me something important about system design: sometimes “approximately correct” is more valuable than “exactly correct but expensive.”
This principle shows up everywhere:
- Caching strategies
- Distributed systems
- Privacy-preserving protocols
Knowing when approximation is not just acceptable but superior is a useful skill. I kept coming back to this idea throughout my thesis work on encrypted search.