Architecture¶

Deep dive into ZeroIPC's internal design and implementation.

Overview¶

ZeroIPC is built on three fundamental pillars:

Binary Format Specification - Language-agnostic memory layout
Lock-Free Algorithms - High-performance concurrent operations
Minimal Metadata - Maximum flexibility through duck typing

Architecture Documents¶

Binary Format ¶

Complete specification of the on-disk/in-memory format:

Table header structure
Table entry layout
Data structure formats
Alignment requirements
Version compatibility

All language implementations must follow this specification exactly.

Design Principles ¶

Core philosophy and trade-offs:

Language Equality - No primary language
Minimal Overhead - Only essential metadata
User Responsibility - Trust users for type consistency
Zero Dependencies - Each implementation stands alone
Binary Compatibility - Strict format adherence

Lock-Free Implementation ¶

How concurrent structures work without locks:

Compare-And-Swap (CAS) operations
Memory ordering and fences
ABA problem prevention
Progress guarantees
Performance characteristics

Memory Layout ¶

How data is organized in shared memory:

Segment structure
Table organization
Data structure placement
Alignment and padding
Growth and allocation

Testing Strategy ¶

Comprehensive testing approach:

Unit tests per structure
Integration tests (cross-language)
Stress tests (high concurrency)
Property-based testing
Test categorization (FAST/MEDIUM/SLOW/STRESS)

Key Concepts¶

Shared Memory Model¶

┌─────────────────────────────────────┐
│        POSIX Shared Memory          │
│         (/dev/shm/name)             │
├─────────────────────────────────────┤
│          Table Header               │
│  (magic, version, count, offset)    │
├─────────────────────────────────────┤
│         Table Entries               │
│ (name, offset, size) × max_entries  │
├─────────────────────────────────────┤
│       Data Structures               │
│  ┌──────────────────────────────┐  │
│  │      Array Header            │  │
│  │      Array Data              │  │
│  └──────────────────────────────┘  │
│  ┌──────────────────────────────┐  │
│  │      Queue Header            │  │
│  │      Queue Buffer            │  │
│  └──────────────────────────────┘  │
│  ...                                │
└─────────────────────────────────────┘

Lock-Free Queue Algorithm¶

High-level overview of lock-free enqueue:

1. Load current tail index (relaxed)
2. Calculate next tail index
3. Check if queue full
4. CAS tail to next index
   - If successful: write data, fence
   - If failed: retry from step 1

This avoids locks while maintaining consistency across threads/processes.

Type System¶

ZeroIPC uses structural typing (duck typing):

No type metadata stored in shared memory
Users specify types at access time
Binary layout must match between languages
Size checking only validation performed

Example:

// C++ creates with int32_t
Array<int32_t> arr(mem, "data", 100);

# Python must match with np.int32
arr = Array(mem, "data", dtype=np.int32)

If types mismatch in size, behavior is undefined!

Performance Characteristics¶

Operation Complexity¶

Operation	Time Complexity	Lock-Free
Array index	O(1)	N/A
Array atomic op	O(1)	Yes
Queue enqueue	O(1) amortized	Yes
Queue dequeue	O(1) amortized	Yes
Stack push	O(1)	Yes
Stack pop	O(1)	Yes
Map insert	O(1) average	Yes
Map lookup	O(1) average	Yes

Memory Overhead¶

Structure	Metadata Size	Notes
Memory	16 bytes + table	Table size = 16 + 40*max_entries
Array	8 bytes	Just capacity
Queue	24 bytes	Head, tail, capacity
Stack	16 bytes	Top, capacity
Map	Variable	Bucket count dependent

Design Decisions¶

Why No Type Metadata?¶

Advantages: - Maximum flexibility across languages - Smaller memory footprint - Simpler implementation - Faster allocation

Trade-offs: - Users must track types externally - No runtime type checking - Potential for type mismatches

Why No Deallocation?¶

Advantages: - Simpler allocation (bump pointer) - No fragmentation - No free list management - Faster creation

Trade-offs: - Cannot reclaim individual structures - Must delete entire segment to free memory - Poor fit for dynamic workloads

Why Lock-Free?¶

Advantages: - No deadlocks - No priority inversion - Scalable across cores - Progress guarantees

Trade-offs: - More complex implementation - May retry on high contention - Larger instruction footprint - Requires atomic operations

Implementation Notes¶

C++ Implementation¶

Header-only for zero link-time overhead
Templates for compile-time type safety
Concepts for type constraints
RAII for automatic cleanup

Python Implementation¶

Pure Python for portability
NumPy for performance
mmap for direct access
ctypes for atomics

C Implementation¶

Static library for linking
Explicit management (no RAII)
Minimal dependencies (just POSIX)
Maximum portability

Next Steps¶

Binary Format - Detailed format specification
Lock-Free Implementation - Algorithm deep dive
Testing Strategy - How we ensure correctness
Design Principles - Philosophy and trade-offs