A random oracle is a function $\mathcal{O}: {0,1}^* \to {0,1}^\infty$ where each output bit is independently and uniformly random, but the function is consistent – same input, same output, every time (Bellare & Rogaway, 1993).
You can’t build one. But you can build three approximations, each illuminating a different tradeoff.
Approximation 1: True Randomness, Cached
class OracleDigest:
"""Infinite digest backed by true randomness and a cache."""
def __init__(self, entropy=None):
self._entropy = entropy or (lambda: hashlib.sha256(os.urandom(32)).digest())
self._cache = {}
def __getitem__(self, index):
if index not in self._cache:
self._cache[index] = self._entropy()[0]
return self._cache[index]
Each new index draws a fresh random byte from the OS entropy pool and caches it. Genuinely random, but the cache grows without bound, can’t be serialized, and dies with the process. Each instance is a different random oracle from the family of all possible oracles.
Approximation 2: Deterministic, Infinite
class LazyDigest:
"""Infinite deterministic digest via counter-mode PRF: h(seed || i)."""
def __init__(self, seed, hash_fn=hashlib.sha256):
self._seed = seed
self._hash_fn = hash_fn
def __getitem__(self, index):
h = self._hash_fn()
h.update(self._seed)
h.update(index.to_bytes(8, "big"))
return h.digest()[0]
Counter-mode PRF: $\text{byte}_i = H(\text{seed} | i)[0]$. Same seed = same infinite sequence, on any machine, forever. $O(1)$ space, random access to any position. Only as “random” as the seed and hash function.
Note the fixed-width index encoding (to_bytes(8, 'big')). If you used str(index).encode() instead, index 10 and the concatenation of index 1 + index 0 would collide. Real KDFs like HKDF use fixed-width encodings for the same reason.
Approximation 3: Full Oracle
class OracleHash:
"""Approximates {0,1}* -> {0,1}^inf via hash + lazy expansion."""
def __init__(self, hash_fn=hashlib.sha256):
self._hash_fn = hash_fn
def __call__(self, x):
return LazyDigest(self._hash_fn(x).digest(), self._hash_fn)
Composes the two: hash the input to get a seed, expand with counter-mode PRF. Type: ${0,1}^* \to {0,1}^\infty$.
oracle = OracleHash()
d = oracle(b"hello")
print(bytes(d[i] for i in range(16)).hex()) # 4c9460cf7b89806e9ebc7b94c815d8c0
The Tradeoff
OracleDigest | LazyDigest | OracleHash | |
|---|---|---|---|
| Randomness | True (OS entropy) | Pseudo (PRF) | Pseudo (PRF) |
| Space | $O(k)$ queries | $O(1)$ | $O(1)$ |
| Serializable | No | Yes (seed) | Yes (hash fn) |
| Lifetime | Process-scoped | Permanent | Permanent |
Why This Matters
Random oracles are the backbone of the random oracle model (ROM), used throughout cryptography to prove security. The Fiat-Shamir transform – which turns interactive proofs into non-interactive signatures – is the most important example: replace the verifier’s random challenge with $H(\text{transcript})$.
The gap between a random oracle and a real hash function is real. Canetti, Goldreich & Halevi (2004) showed there exist schemes provably secure in the ROM that break when instantiated with any hash function. In practice, it usually works. Understanding why starts with understanding what you’re approximating.
Discussion