Bernoulli Data Type
Welcome to the Bernoulli Data Type documentation. This project presents a general framework for understanding and constructing probabilistic data structures with controlled error rates.
Overview
The Bernoulli data type concept provides a formalism for reasoning about approximations of data structures that trade space and time complexity for bounded error rates. This framework unifies many existing probabilistic data structures (like Bloom filters) and enables the systematic construction of new ones.
Key Concepts
Bernoulli Types are approximations of standard data types where operations may produce errors with known probability bounds. For any type T
, we denote its Bernoulli approximation as bernoulli<T>
.
Error Models are characterized by: - Order: The number of independent error parameters - Rank: The fundamental inferability of latent values - Entropy: The information-theoretic uncertainty
Applications include: - Space-efficient probabilistic data structures - Oblivious computing and privacy-preserving algorithms - Approximate query processing with bounded error guarantees - Trade-offs between accuracy and resource consumption
Quick Start
Theory
Start with the Bernoulli Model Overview to understand the foundational concepts. Then explore specific types:
- Bernoulli Boolean - The simplest Bernoulli type, modeling noisy boolean values
- Bernoulli Set - Probabilistic set membership (includes Bloom filters)
- Bernoulli Map - Approximate functions with bounded error
- Bernoulli Tuple - Product types with approximation
- Bernoulli Unit - Degenerate case with no uncertainty
- Bernoulli Void - Empty type analysis
Implementation
Explore practical aspects:
- Codecs - Encoding and decoding strategies
- Regular Types - Why Bernoulli types violate regularity
Building the Project
# Configure build with tests
cmake -B build -DBERNOULLI_BUILD_TESTS=ON -DBERNOULLI_BUILD_EXAMPLES=ON
# Build
cmake --build build
# Run tests
cd build && ctest --output-on-failure
Documentation
This documentation is built with MkDocs using the Material theme.
# Install MkDocs with Material theme
pip install mkdocs-material
# Serve documentation locally
mkdocs serve
# Build static site
mkdocs build
Project Structure
bernoulli_data_type/
├── include/ # Header-only C++ library
│ ├── bernoulli_bool/
│ ├── bernoulli_set/
│ ├── bernoulli_map/
│ └── ...
├── tests/ # Test suite
├── docs/ # Documentation
│ ├── theory/ # Theoretical foundations
│ ├── implementation/ # Implementation guides
│ └── api/ # API reference (Doxygen)
├── papers-latex/ # Academic papers (LaTeX sources)
└── research/ # Research notes and experiments
Contributing
This project is under active development. See the GitHub repository for the latest updates.
License
See the LICENSE file in the repository for licensing information.
About
The Bernoulli data type framework was developed to provide a rigorous foundation for probabilistic data structures and oblivious computing. It generalizes concepts from information theory, binary classification, and type theory into a unified framework for approximate computation.