Skip to content

Bernoulli Data Type

Welcome to the Bernoulli Data Type documentation. This project presents a general framework for understanding and constructing probabilistic data structures with controlled error rates.

Overview

The Bernoulli data type concept provides a formalism for reasoning about approximations of data structures that trade space and time complexity for bounded error rates. This framework unifies many existing probabilistic data structures (like Bloom filters) and enables the systematic construction of new ones.

Key Concepts

Bernoulli Types are approximations of standard data types where operations may produce errors with known probability bounds. For any type T, we denote its Bernoulli approximation as bernoulli<T>.

Error Models are characterized by: - Order: The number of independent error parameters - Rank: The fundamental inferability of latent values - Entropy: The information-theoretic uncertainty

Applications include: - Space-efficient probabilistic data structures - Oblivious computing and privacy-preserving algorithms - Approximate query processing with bounded error guarantees - Trade-offs between accuracy and resource consumption

Quick Start

Theory

Start with the Bernoulli Model Overview to understand the foundational concepts. Then explore specific types:

Implementation

Explore practical aspects:

  • Codecs - Encoding and decoding strategies
  • Regular Types - Why Bernoulli types violate regularity

Building the Project

# Configure build with tests
cmake -B build -DBERNOULLI_BUILD_TESTS=ON -DBERNOULLI_BUILD_EXAMPLES=ON

# Build
cmake --build build

# Run tests
cd build && ctest --output-on-failure

Documentation

This documentation is built with MkDocs using the Material theme.

# Install MkDocs with Material theme
pip install mkdocs-material

# Serve documentation locally
mkdocs serve

# Build static site
mkdocs build

Project Structure

bernoulli_data_type/
├── include/              # Header-only C++ library
│   ├── bernoulli_bool/
│   ├── bernoulli_set/
│   ├── bernoulli_map/
│   └── ...
├── tests/                # Test suite
├── docs/                 # Documentation
│   ├── theory/           # Theoretical foundations
│   ├── implementation/   # Implementation guides
│   └── api/              # API reference (Doxygen)
├── papers-latex/         # Academic papers (LaTeX sources)
└── research/             # Research notes and experiments

Contributing

This project is under active development. See the GitHub repository for the latest updates.

License

See the LICENSE file in the repository for licensing information.

About

The Bernoulli data type framework was developed to provide a rigorous foundation for probabilistic data structures and oblivious computing. It generalizes concepts from information theory, binary classification, and type theory into a unified framework for approximate computation.