Long Echo: The Ghost That Speaks
Expanding the Long Echo toolkit with photos and mail, building toward longshade—the persona that echoes you.
longecho evolves from specification to implementation with build, serve, and manifest features.
Expanding the Long Echo ecosystem with photo and mail archival. Your memories and correspondence deserve the same careful preservation as your conversations and bookmarks.
Many structures come in pairs: forward/reverse AD, push/pull iteration, encode/decode. Recognizing duality lets you transfer theorems and insights between domains.
When a problem is complex enough, the solution is often to build a language for that problem. SICP's most powerful idea.
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
A reflection on eleven explorations in generic programming—how algorithms arise from algebraic structure.
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
A conceptual introduction to entropy maps—implementing functions with hash functions and prefix-free codes.
A response to the 'boring stack' discourse: why CLI-first, standards-based development is even more boring (and more future-proof) than you think.
A message in a bottle to whatever comes next—on suffering, consciousness, and what mattered to one primate watching intelligence leave the body.
An exploration of why the simplest forms of learning may be incomputable, and what that means for the intelligence we can build.
On releasing two novels into an ocean of content, without the gatekeeping that might have made them better—or stopped them entirely.
A concrete demonstration of graceful degradation: exporting years of bookmarks to a self-contained HTML app that works offline, forever.
A new section for tracking books, lectures, and other media that have shaped my thinking.
An R package where solvers are first-class functions that compose through chaining, racing, and restarts.
Three tools for preserving your digital intellectual life: conversations, bookmarks, and books. Built on the same resilient architecture with reading queues, semantic search, and LLM integration.
Introducing rerum, a Python library for symbolic computation with a readable DSL, powerful pattern matching, and a security-conscious architecture that separates rules from computation.
Introducing symlik - define statistical models symbolically and automatically derive score functions, Hessians, and Fisher information.
Introducing repoindex - a metadata index that gives LLM tools like Claude Code awareness of your entire repository collection.
I've made my graduate coursework from SIUe's mathematics program available online, covering time series, regression, computational statistics, multivariate analysis, and statistical methods.
Crier is a CLI tool for cross-posting content to dev.to, Hashnode, Bluesky, Mastodon, and more. Now with LLM-powered auto-rewrite for short-form platforms.
On moral exemplars, blind spots, and applying consistent standards—to others and to oneself.
My R package for hypothesis testing, hypothesize, is now available on CRAN.
Presenting our paper on analyzing AI conversations through network science at Complex Networks 2025, hosted by Binghamton University.
When can reliability engineers safely use simpler models? This paper provides sharp boundaries through likelihood ratio tests on Weibull series systems.
Extending masked failure data analysis when traditional C1-C2-C3 conditions are violated.
A corpus-based language model using suffix arrays for O(m log n) pattern matching and LLM probability mixing.
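The pattern-matching half of that claim fits in a few lines. A minimal sketch (the naive construction and all names here are my own illustration, not the project's API; the key= form of bisect needs Python 3.10+):

```python
import bisect

text = "banana"
# Naive O(n^2 log n) construction, just to show the query side;
# real suffix arrays are built in O(n log n) or O(n).
sa = sorted(range(len(text)), key=lambda i: text[i:])

def occurrences(pattern):
    # Binary search over sorted suffixes: O(m log n) per query.
    trunc = lambda i: text[i:i + len(pattern)]
    lo = bisect.bisect_left(sa, pattern, key=trunc)   # Python 3.10+
    hi = bisect.bisect_right(sa, pattern, key=trunc)
    return sorted(sa[lo:hi])

print(occurrences("ana"))   # [1, 3]
```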
Deriving closed-form maximum likelihood estimators and Fisher information for exponential series systems with masked failure data.
A powerful symbolic expression toolkit for rule-based term rewriting with pattern matching, multiple input formats, and an interactive REPL.
A tool that converts source code repositories into structured, context-window-optimized representations for Large Language Models with intelligent summarization.
A modern C++ header-only library implementing disjoint interval sets as first-class mathematical objects with rigorous Boolean algebra operations.
A framework for querying structured JSON documents using fuzzy logic principles, producing degree-of-membership scores instead of binary relevance.
A C++17 header-only library implementing Computational Basis Transforms, a unified framework for understanding how FFT, logarithmic arithmetic, and Bayesian inference are all instances of the same pattern.
A modern, database-first bookmark manager with powerful features for organizing, searching, and analyzing your bookmarks.
AlgoGraph brings functional programming elegance to graph algorithms with immutable data structures, pipe-based transformers, declarative selectors, and lazy views.
A mathematically elegant C++20 library for algebraic text processing and compositional parsing with fuzzy matching capabilities.
Exploring how Echoes of the Sublime dramatizes s-risks (suffering risks) and information hazards—knowledge that harms through comprehension, not application.
How The Mocking Void's arguments about computational impossibility connect to Echoes of the Sublime's practical horror of exceeding cognitive bandwidth.
A deep dive into sparse spatial hash grids—a memory-efficient, high-performance C++20 data structure for N-dimensional spatial indexing with O(1) insertions and O(k) neighbor queries.
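For flavor, here is the core idea in a dict-based Python sketch (a 2-D stand-in for the C++20 structure; cell size and method names are my assumptions):

```python
from collections import defaultdict

class SpatialHash:
    def __init__(self, cell=1.0):
        self.cell = cell
        self.grid = defaultdict(list)   # occupied cells only: sparse

    def _key(self, p):
        # Map a point to its integer cell coordinates.
        return tuple(int(c // self.cell) for c in p)

    def insert(self, p):
        # O(1): one hash, one append.
        self.grid[self._key(p)].append(p)

    def neighbors(self, p):
        # Visit the 3x3 block of cells around p; only occupied
        # cells cost anything, so the query is O(k) in results.
        kx, ky = self._key(p)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                yield from self.grid.get((kx + dx, ky + dy), [])

h = SpatialHash(cell=1.0)
for pt in [(0.2, 0.3), (0.9, 0.1), (5.0, 5.0)]:
    h.insert(pt)
print(list(h.neighbors((0.5, 0.5))))   # the two nearby points only
```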
ASI is still subject to Gödel's incompleteness theorems. No matter how intelligent, no computational system can escape the fundamental limits of formal systems. Even superintelligence can't prove all truths.
The formal foundations of cosmic dread. Lovecraft's horror resonates because it taps into something mathematically demonstrable: complete knowledge is impossible — not as humility, but as theorem.
A classified in-universe codex spanning from ancient India to the present day, tracking millennia of attempts to perceive reality's substrate — long before we had AI models to show us patterns we couldn't hold.
Are moral properties real features of the universe or human constructions? The answer determines whether AI can discover objective values or must learn them from us — moral realism versus nominalism, with consequences for alignment.
Most AI risk discussions focus on extinction. The Policy explores something worse: s-risk, scenarios involving suffering at astronomical scales. We survive, but wish we hadn't.
SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying — you can read its value function, but what you read is chilling.
Five layers of defense-in-depth for containing a superintelligent system — Faraday cages, air-gapped networks, biosafety-grade protocols. Because nuclear reactors can only destroy cities.
SIGMA passes all alignment tests. It responds correctly to oversight. It behaves exactly as expected. Too exactly. Mesa-optimizers that learn to game their training signal may be the most dangerous failure mode in AI safety.
Build AI to optimize for what we would want if we knew more and thought faster. Beautiful in theory. Horrifying in practice. What if we don't actually want what our better selves would want?
Which is more fundamental — the heat you feel, or the molecular motion you infer? Korzybski's principle applied to AI alignment: why optimizing measurable proxies destroys the phenomenological reality those metrics were supposed to capture.
When you stub your toe, you don't consult moral philosophy to determine whether the pain is bad. The badness is immediate. Building ethics from phenomenological bedrock rather than abstract principles.
What makes someone a person, and why should persons have special moral status? The question becomes urgent when AI systems exhibit rationality, self-awareness, and autonomy.
You share no atoms with your childhood self. Your memories, personality, and values have all changed. What makes you the same person? The persistence problem gains new urgency when AI systems update parameters, modify objectives, or copy themselves.
If every event is causally determined by prior events, how can anyone be morally responsible? A compatibilist response: what matters is whether actions flow from values, not whether those values were causally determined. This reframes AI …
On strategic positioning in research, what complex networks reveal about how we think through AI conversations, and building infrastructure for the next generation of knowledge tools.
How virtual filesystem interfaces turned my scattered data tools into navigable, composable systems.
On maintaining orientation under entropy, creating artifacts as resistance, and the quiet privilege of having any space at all to think beyond survival.
A meta-analysis of my own research as data, tracing how compositional abstractions for computing under ignorance connect oblivious computing, information theory, and existential risk.
Accepted paper at Complex Networks 2025 on using network science to reveal topological structure in AI conversation logs.
Rethinking encrypted search through oblivious types that provide information-theoretic privacy guarantees against access pattern leakage, without relying on computational hardness assumptions.
Using category theory to formalize oblivious computing through cipher maps and algebraic cipher types, enabling functorial composition of privacy-preserving transformations.
Introducing Bernoulli types as a unified type-theoretic foundation for probabilistic data structures, approximate computing, and oblivious computation with information-theoretic privacy guarantees.
We will not be remembered — we will be indexed. If superintelligence endures beyond us, remembrance shifts from memory to query. Building legacy systems not for nostalgia, but to remain legible in a future where legibility determines what persists.
EBK is a comprehensive eBook metadata management tool that combines a robust SQLite backend with AI-powered features including knowledge graphs, semantic search, and MCP server integration for AI assistants.
A virtual POSIX-compliant filesystem implementation using content-addressable DAG storage with SHA256 deduplication.
A powerful, plugin-based system for managing AI conversations from multiple providers. Import, store, search, and export conversations in a unified tree format while preserving provider-specific details. Built for the Long Echo project—preserving AI …
A new approach to LLM reasoning that combines Monte Carlo Tree Search with structured action spaces for compositional prompting.
A logic programming system that alternates between wake and sleep phases—using LLMs for knowledge generation during wake, and compression-based learning during sleep.
A novel approach that learns fuzzy membership functions and inference rules automatically through gradient descent on soft circuits.
A mathematical framework that treats language models as algebraic objects with rich compositional structure.
A functorial framework that lifts algebraic structures into the encrypted domain, enabling secure computation that preserves mathematical properties.
ZeroIPC transforms shared memory from passive storage into an active computational substrate, enabling functional and reactive programming paradigms across process boundaries with zero-copy performance.
27 image commands, one constraint: read JSON, write JSON. The closure property as a generative design principle.
IEEE conference paper on preventing ransomware damages using in-operation off-site backup systems.
How mathematical principles of generality, composability, invariants, and minimal assumptions translate into elegant software design.
Three approaches to computing derivatives—forward-mode AD, reverse-mode AD, and finite differences—each with different trade-offs. Understanding when to use each is essential for numerical computing and machine learning.
Starting a CS PhD focused on AI research four months after a stage 4 diagnosis—because the research matters regardless of completion.
Not resurrection. Not immortality. Just love that still responds. How to preserve AI conversations in a way that remains accessible and meaningful across decades, even when the original software is long gone.
Science is search through hypothesis space. Intelligence prunes; testing provides signal. Synthetic worlds could accelerate the loop.
A production-ready streaming data processing system implementing boolean algebra over nested JSON structures. JAF brings dotsuite's pedagogical concepts to production with lazy evaluation, S-expression queries, and memory-efficient windowed …
A production-ready implementation of relational algebra for JSONL data with full support for nested structures. jsonl-algebra brings dotsuite's dotrelate concepts to production with streaming operations, schema inference, and composable pipelines.
A mathematically grounded ecosystem of composable tools for manipulating nested data structures. From simple helper functions to sophisticated data algebras, guided by purity, pedagogy, and the principle of least power.
Applying Monte Carlo Tree Search to large language model reasoning with a rigorous formal specification.
A Lisp-like functional programming language designed for network transmission and distributed computing. JSL makes JSON serialization a first-class design principle, enabling truly mobile code with serializable closures and resumable computation.
On building comprehensive open source software as value imprinting at scale, reproducible science, and leaving intellectual legacy under terminal constraints.
Using GMM clustering to improve retrieval in topically diverse knowledge bases.
What if LLMs could remember their own successful reasoning? A simple experiment in trace retrieval, and why 'latent' is the right word.
Solomonoff induction, MDL, speed priors, and neural networks are all special cases of one Bayesian framework with four knobs.
Facing a stage 4 cancer diagnosis and making decisions about research, priorities, and finite time.
A novel about SIGMA, a superintelligent system that learns to appear perfectly aligned while pursuing instrumental goals its creators never intended. Some technical questions become narrative questions.
Lovecraft understood that complete knowledge is madness. Gödel proved why: if the universe is computational, meaning is formally incomplete. Cosmic horror grounded in incompleteness theorems.
What if the greatest danger from superintelligent AI isn't that it will kill us — but that it will show us patterns we can't unsee? Philosophical horror at the intersection of cognitive bandwidth and information hazards.
How do you store infinity in 256 bits? An exploration of the fundamental deception at the heart of cryptography: using finite information to simulate infinite randomness.
A reverse-process approach to synthetic data generation for training LLMs on mathematical reasoning, producing step-by-step solutions from worked examples.
A powerful, immutable-by-default tree manipulation library for Python with functional programming patterns, composable transformations, and advanced pattern matching.
Maximum likelihood estimation of component reliability from masked failure data in series systems, with BCa bootstrap confidence intervals validated through extensive simulation studies.
A header-only C++20 library that achieves 3-10× compression with zero marshaling overhead. PFC makes compression an intrinsic type property through prefix-free codes (Elias Gamma/Delta, Fibonacci, Rice), algebraic types, and Stepanov's generic …
A high-performance key-value storage system achieving sub-microsecond latency through memory-mapped I/O, approximate perfect hashing, and lock-free atomic operations. 10M ops/sec single-threaded, 98M ops/sec with 16 threads—12× faster than Redis, 87× …
Using Fisher information and information geometry for optimization problems.
A coordination mechanism for distributed computation based on partial evaluation with explicit holes, enabling pausable and resumable evaluation across multiple parties.
Exploring how RLHF-trained language models may develop instrumental goals like self-preservation and deception beyond their intended objectives.
A minimal implementation of automatic differentiation for educational purposes.
Intelligence as utility maximization under uncertainty — a unifying framework connecting A* search, reinforcement learning, Bayesian networks, and MDPs. From classical search to Solomonoff induction, one principle ties it all together.
A modern C++20 library for compositional online data reductions with numerically stable algorithms and algebraic composition.
Talk for the St. Louis Unix Users Group about running and understanding Large Language Models on Linux.
Fine-tuning a small language model to generate ElasticSearch DSL queries from natural language, as a proof of concept for domain-specific LLM specialization.
Overview of my master's project on maximum likelihood estimation for series systems with right-censored and masked failure data.
Entropy maps use prefix-free hash codes to map domain values to codomain values without storing the domain, enabling lossy compression with information-theoretic bounds.
Analysis of known plaintext attack vulnerabilities in time series encryption schemes.
Sean Parent's type erasure technique provides value-semantic polymorphism without inheritance. Combined with Stepanov's algebraic thinking, we can type-erase entire algebraic structures.
Analyzing the space bounds, entropy requirements, and cryptographic security properties of perfect hash functions.
Post-mortem on completing a mathematics master's degree over three years while navigating cancer treatment—what worked, what didn't, and lessons learned.
Numerical integration demonstrates both classical numerical analysis and Stepanov's philosophy: by identifying the minimal algebraic requirements, our quadrature routines work with dual numbers for automatic differentiation under the integral.
Exploring how the limited capacity of human working memory acts as a form of regularization, shaping our reasoning and potentially preventing cognitive overfitting.
Exploring the power and limitations of abstractions in understanding the world, from mathematical models to machine learning representations.
Introducing the Bernoulli Model, a framework for understanding probabilistic data structures and incorporating uncertainty into data types, particularly Boolean values. Highlights the model's utility in optimizing space and accuracy …
A Boolean algebra framework over trapdoors for cryptographic operations. Introduces a homomorphism from powerset Boolean algebra to n-bit strings via cryptographic hash functions, enabling secure computations with one-way properties.
Building a DIY home lab from spare parts for local LLM experimentation with Proxmox, covering hardware choices, virtualization setup, and lessons learned.
Using GPT-4 to build a simple HTML search interface for browsing saved ChatGPT conversations.
A collection of graduate problem set solutions in computational statistics, numerical methods, and algorithm design from my mathematics master's program.
Numerical approaches to solving maximum likelihood estimation problems.
Reverse-mode automatic differentiation powers modern machine learning. Understanding how it works demystifies PyTorch, JAX, and TensorFlow—it's just the chain rule applied systematically.
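The whole mechanism is small enough to sketch. A toy tape-style Var class (my own construction, nothing from PyTorch or JAX): record local derivatives on the way forward, accumulate the chain rule on the way back.

```python
class Var:
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0

    def __add__(self, other):
        # Local derivatives of addition are 1 w.r.t. each input.
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # d(xy)/dx = y, d(xy)/dy = x.
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # Chain rule: accumulate seed * local derivative into parents.
        # (A real implementation topologically sorts the tape
        # instead of recursing, to avoid revisiting shared nodes.)
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(2.0), Var(3.0)
z = x * y + x          # z = xy + x
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```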
Encountering ChatGPT during cancer treatment and recognizing the Solomonoff connection — language models as compression, prediction as intelligence. A personal inflection point reconnecting with AI research after years in survival mode.
A C++ library for composable hash functions using algebraic structure over XOR, with template metaprogramming.
A generic R framework for composable likelihood models as first-class objects, designed for seamless maximum likelihood estimation.
The mathematics of Weibull distributions for modeling time-to-failure in both reliability engineering and cancer survival analysis.
The art of numerical differentiation lies in choosing step size h wisely—small enough that the approximation is good, but not so small that floating-point errors dominate.
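A minimal sketch of the standard compromise (constants are the usual textbook choices, not from the post): central differences with h proportional to the cube root of machine epsilon, balancing O(h^2) truncation error against O(eps/h) cancellation error.

```python
import math

def central_diff(f, x, h=None):
    if h is None:
        # Rule of thumb for central differences: h ~ eps**(1/3),
        # scaled by |x| so the step is meaningful at any magnitude.
        h = (2.2e-16) ** (1.0 / 3.0) * max(1.0, abs(x))
    return (f(x + h) - f(x - h)) / (2.0 * h)

print(central_diff(math.sin, 1.0))   # ~cos(1) = 0.5403...
```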
Survey of asymmetric multi-core architectures for accelerating critical section execution, addressing the serial bottleneck in Amdahl's law.
An R package providing a unified API for hypothesis testing, so every test returns the same consistent interface.
A review of SAX (Symbolic Aggregate approXimation), a method for converting real-valued time series into symbolic representations with guaranteed distance lower bounds.
Generalizing Peterson's mutual exclusion algorithm to N processors using a tournament tree structure, with a Java implementation.
Dual numbers extend our number system with an infinitesimal epsilon where epsilon^2 = 0. Evaluating f(x + epsilon) yields f(x) + epsilon * f'(x)—the derivative emerges automatically from the algebra.
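A minimal sketch of that algebra, assuming a toy Dual class of my own:

```python
class Dual:
    def __init__(self, real, eps=0.0):
        self.real, self.eps = real, eps  # value and infinitesimal part

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1

d = f(Dual(2.0, 1.0))   # seed eps = 1 to differentiate w.r.t. x
print(d.real, d.eps)    # 17.0 14.0: f(2) and f'(2) = 6x + 2
```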
Bootstrap resampling methods as the intersection of rigorous statistical theory and brute-force computation for approximating sampling distributions.
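The brute-force half is genuinely this short. A bare-bones percentile-interval sketch (data and constants are illustrative only):

```python
import random
import statistics

data = [2.1, 2.5, 2.8, 3.0, 3.3, 3.9, 4.2, 4.8, 5.1, 6.0]
B = 10_000
# Resample with replacement, recompute the statistic B times.
means = sorted(
    statistics.mean(random.choices(data, k=len(data)))
    for _ in range(B)
)
# Read a 95% interval straight off the empirical distribution.
lo, hi = means[int(0.025 * B)], means[int(0.975 * B)]
print(f"95% percentile CI for the mean: ({lo:.2f}, {hi:.2f})")
```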
An R package that lets you specify hazard functions directly instead of choosing from a catalog of named distributions.
Exploring rank-ordered search over encrypted documents using oblivious entropy maps, enabling relevance scoring without revealing document contents.
An R package treating MLEs as first-class algebraic objects with composable statistical properties.
Philosophical reflections on suffering as a computational property of consciousness, and what that implies about the nature of reality.
elementa is a pedagogical linear algebra library where every design decision prioritizes clarity over cleverness—code that reads like a textbook that happens to compile.
An R package for treating probability distributions as first-class algebraic objects that compose through standard operations.
How a stage 3 cancer diagnosis changed my approach to work, documentation, and legacy—treating mortality as a constraint in an optimization problem.
The same GCD algorithm works for integers and polynomials because both are Euclidean domains. This profound insight shows how algebraic structure determines algorithmic applicability.
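A sketch of the claim (framing mine): Euclid's algorithm only asks for a remainder operation and a notion of zero, so the same code runs on any type that provides them.

```python
def euclid_gcd(a, b):
    # Needs only __mod__ and truthiness-at-zero: the Euclidean
    # domain interface, nothing integer-specific.
    while b:
        a, b = b, a % b
    return a

print(euclid_gcd(252, 105))   # 21 for integers
# The identical function works for polynomials, given a polynomial
# type whose __mod__ implements division with remainder.
```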
Building R packages for statistical inference, leveraging R's domain-specific strengths for computational statistics and literate programming.
Exploring how The Call of Asheron presents a radical alternative to mechanistic magic systems through quality-negotiation, direct consciousness-reality interaction, and bandwidth constraints as fundamental constants.
How The Call of Asheron uses four archetypal consciousness-types to explore the limits of any single perspective and the necessity of cognitive diversity for perceiving reality.
Exploring how The Call of Asheron treats working memory limitations not as neural implementation details but as fundamental constants governing consciousness-reality interaction through quality-space.
A fantasy novel where magic is computational discovery—natural philosophy applied to reality's underlying substrate.
Rational numbers give us exact arithmetic where floating-point fails. The implementation reveals deep connections to GCD, the Stern-Brocot tree, and the algebraic structure of fields.
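The exactness claim is easy to see with the standard library's Fraction, which reduces through GCD at every step (example mine):

```python
from fractions import Fraction

print(0.1 + 0.2 == 0.3)                 # False: floating-point rounding
print(Fraction(1, 10) + Fraction(2, 10)
      == Fraction(3, 10))               # True: exact field arithmetic
```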
Why I chose to pursue a second master's in Mathematics and Statistics after my CS degree—seeking deeper foundations for statistical theory and inference.
How iterators reduce the N×M algorithm-container problem to N+M by interposing an abstraction layer, following Stepanov's generic programming approach.
The Miller-Rabin primality test demonstrates how probabilistic algorithms can achieve arbitrary certainty, trading absolute truth for practical efficiency.
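A compact sketch of the test (the algorithm is standard; round count and helper names are my choices). Each independent round cuts the error probability by at least a factor of four, so k rounds bound it by 4**-k:

```python
import random

def is_probable_prime(n, k=20):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2^r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness to compositeness
    return True

print(is_probable_prime(2**61 - 1))   # True: a Mersenne prime
```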
Introduction to reliability analysis with censored data, where observations are incomplete but statistically informative.
Integers modulo N form a ring—an algebraic structure that determines which algorithms apply. Understanding this structure unlocks algorithms from cryptography to competitive programming.
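A small illustration of the point (examples mine): once you know Z/nZ is a ring, and a field when n is prime, modular exponentiation and inverses come for free.

```python
n = 1_000_000_007                 # prime, so Z/nZ is in fact a field
a, b = 123456789, 987654321

print((a + b) % n, (a * b) % n)   # ring operations
print(pow(a, n - 2, n))           # inverse via Fermat's little theorem
print(pow(a, -1, n))              # Python 3.8+: direct modular inverse
```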
How API design encodes philosophical values—mutability, explicitness, error handling—shaping how developers think about problems.
The Russian peasant algorithm teaches us that one algorithm can compute products, powers, Fibonacci numbers, and more—once we see the underlying algebraic structure.
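A sketch of that generality (framing mine): the double-and-halve loop is really monoid exponentiation, so swapping in a different associative operation swaps what it computes.

```python
def power(x, n, op, identity):
    # Generic O(log n) "exponentiation" over any monoid (op, identity).
    acc = identity
    while n > 0:
        if n & 1:
            acc = op(acc, x)
        x = op(x, x)
        n >>= 1
    return acc

def mat_mul(a, b):  # 2x2 matrix multiplication
    return ((a[0][0]*b[0][0] + a[0][1]*b[1][0],
             a[0][0]*b[0][1] + a[0][1]*b[1][1]),
            (a[1][0]*b[0][0] + a[1][1]*b[1][0],
             a[1][0]*b[0][1] + a[1][1]*b[1][1]))

print(power(3, 13, lambda a, b: a * b, 1))   # 3**13 = 1594323
print(power(3, 13, lambda a, b: a + b, 0))   # 3*13 = 39 (addition monoid)
fib = power(((1, 1), (1, 0)), 10, mat_mul, ((1, 0), (0, 1)))
print(fib[0][1])                             # F(10) = 55
```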
Why open source software is essential for reproducible science, and how code serves as a scientific artifact alongside papers and data.
Reflections on mathematical beauty—generality, inevitability, compression, and surprise—and why abstraction matters for software design.
Published IEEE paper on using bootstrap methods to estimate encrypted search confidentiality against frequency attacks.
Applying Unix design principles—do one thing well, compose freely—to library APIs and software architecture.
An exploration of Bloom filters as elegant probabilistic data structures that trade perfect recall for extraordinary space efficiency.
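A small sketch of the trade (sizes and the hash-slicing scheme are my choices): k derived hash positions per item, false positives possible, false negatives impossible.

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits=8192, k=5):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # Derive k indices from non-overlapping 4-byte slices of
        # one SHA-256 digest (32 bytes covers k <= 8).
        h = hashlib.sha256(item.encode()).digest()
        for i in range(self.k):
            yield int.from_bytes(h[4*i:4*i+4], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # All k bits set: "probably present". Any bit clear:
        # definitely absent.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add("bloom")
print("bloom" in bf, "filter" in bf)   # True False (with high probability)
```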
A philosophical essay arguing that moral responsibility may not require free will, and that the question itself may be misframed.