Chartfold: Owning Your Medical Records
A walkthrough of Chartfold, a Python tool that loads your medical records into SQLite and exposes them to Claude via MCP for structured analysis, visit prep, and ad-hoc queries.
A retrospective on three years of building R packages and writing papers for masked series system reliability, and what comes next.
dapple is a terminal graphics library with one Canvas API, seven pluggable renderers, and eleven CLI tools for displaying images, data, video, math, and more.
A walkthrough of Posthumous, a self-hosted dead man's switch that monitors periodic check-ins via TOTP, progresses through escalating alert stages, and triggers automated actions if you stop responding.
pagevault turns any file into a self-contained encrypted HTML page. No backend, no JavaScript libraries, no external dependencies. Just AES-256-GCM and the browser's built-in Web Crypto API. The interesting part is making it work at scale.
Observation functors in maskedcauses: composable functions that separate the data-generating process from the observation mechanism, enabling mixed-censoring simulation and verified Monte Carlo studies.
A guided tour through my open-source ecosystem: encrypted search theory, statistical reliability, Unix-philosophy CLI tools, AI research, and speculative fiction. How 120+ projects connect and where to start.
The maskedcauses R package for MLE in series systems with masked component failures, built on composable likelihood contributions and validated through simulation.
Expanding the Long Echo toolkit with photos and mail, building toward longshade, the persona that echoes you.
longecho evolves from specification to implementation with build, serve, and manifest features.
Expanding the Long Echo ecosystem with photo and mail archival. Your memories and correspondence deserve the same preservation as your conversations and bookmarks.
Many structures come in pairs: forward/reverse AD, push/pull iteration, encode/decode. Recognizing duality lets you transfer theorems and insights between domains.
When a problem is complex enough, the right move is to build a language for that problem. SICP's most powerful idea.
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
A reflection on eleven explorations in generic programming, and how algorithms arise from algebraic structure.
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
A conceptual introduction to entropy maps, implementing functions with hash functions and prefix-free codes.
A response to the 'boring stack' discourse. Why CLI-first, standards-based development is even more boring (and more future-proof) than you think.
A message in a bottle to whatever comes next. On suffering, consciousness, and what mattered to one primate watching intelligence leave the body.
Why the simplest forms of learning are incomputable, and what that means for the intelligence we can build.
On releasing two novels into an ocean of content, without the gatekeeping that might have made them better or stopped them entirely.
Graceful degradation made concrete: years of bookmarks exported to a self-contained HTML app that works offline, forever.
A new section for tracking books, lectures, and other media that have shaped how I think.
An R package where optimization solvers are first-class functions that compose through chaining, racing, and restarts.
Three CLI tools for preserving your digital intellectual life: conversations, bookmarks, and books. SQLite-backed, exportable, built to outlast the tools themselves.
A Python library for symbolic computation with a readable DSL, pattern matching, and a security model that separates rules from computation.
Define statistical models symbolically and automatically derive score functions, Hessians, and Fisher information. No numerical approximation.
A metadata index that gives LLM tools like Claude Code awareness of your entire repository collection.
My graduate coursework from SIUe's math program is up: time series, regression, computational stats, multivariate analysis, and statistical methods.
A CLI tool for cross-posting content to dev.to, Hashnode, Bluesky, Mastodon, and more, with LLM-powered auto-rewrite for short-form platforms.
On moral exemplars, blind spots, and applying consistent standards to others and to oneself.
My R package for hypothesis testing, hypothesize, is now available on CRAN.
Presenting our paper on analyzing AI conversations through network science at Complex Networks 2025, Binghamton University.
When can reliability engineers safely use simpler models? Likelihood ratio tests on Weibull series systems give sharp boundaries.
Extending masked failure data analysis when the standard C1-C2-C3 masking conditions are violated.
A corpus-based language model using suffix arrays for O(m log n) pattern matching. The corpus is the model.
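A sketch of the mechanism (a toy construction, not the project's code): sort the suffixes once, then every pattern query is two binary searches over that order. The `key=` form of bisect needs Python 3.10+.

```python
# Toy suffix-array pattern counting: O(m log n) per query once the
# array is built. Illustrative only; the real build is not O(n^2 log n).
import bisect

def build_suffix_array(text):
    # naive build for clarity: sort suffix start positions lexicographically
    return sorted(range(len(text)), key=lambda i: text[i:])

def count_occurrences(text, sa, pattern):
    m = len(pattern)
    # first and last suffixes whose m-char prefix matches the pattern
    lo = bisect.bisect_left(sa, pattern, key=lambda i: text[i:i + m])
    hi = bisect.bisect_right(sa, pattern, key=lambda i: text[i:i + m])
    return hi - lo

text = "banana"
sa = build_suffix_array(text)
print(count_occurrences(text, sa, "ana"))  # 2
```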
Closed-form MLEs and Fisher information for exponential series systems with masked failure data. No numerical optimization required.
A Python library for rule-based term rewriting with pattern matching, multiple input formats, and an interactive REPL.
A tool that converts source code repositories into structured, context-window-optimized Markdown for LLMs, with intelligent summarization and importance scoring.
A C++ header-only library that treats disjoint interval sets as proper mathematical objects with Boolean algebra operations.
A framework for querying structured JSON documents using fuzzy logic, producing degree-of-membership scores instead of binary relevance.
A C++17 header-only library that formalizes a pattern behind FFT, logarithmic arithmetic, and Bayesian inference: transform to a domain where your target operation is cheap.
A database-first bookmark manager with NLP auto-tagging, full-text search, and content caching.
AlgoGraph is an immutable graph library for Python with pipe-based transformers, declarative selectors, and lazy views.
A C++20 header-only library for algebraic text processing and compositional parsing with fuzzy matching.
Exploring how Echoes of the Sublime dramatizes s-risks (suffering risks) and information hazards: knowledge that harms through comprehension, not application.
How The Mocking Void's arguments about computational impossibility connect to Echoes of the Sublime's practical horror of exceeding cognitive bandwidth.
A C++20 sparse spatial hash grid for N-dimensional spatial indexing with O(1) insertions, O(k) neighbor queries, and 60,000x memory reduction over dense grids.
ASI is still subject to Gödel's incompleteness theorems. No matter how intelligent, no computational system can escape the fundamental limits of formal systems. Even superintelligence can't prove all truths.
The formal foundations of cosmic dread. Lovecraft's horror resonates because it taps into something mathematically demonstrable: complete knowledge is impossible, not as humility, but as theorem.
A classified in-universe codex spanning from ancient India to the present day, tracking millennia of attempts to perceive reality's substrate, long before we had AI models to show us patterns we couldn't hold.
Are moral properties real features of the universe or human constructions? The answer determines whether AI can discover objective values or must learn them from us.
Most AI risk discussions focus on extinction. The Policy explores something worse: s-risk, scenarios involving suffering at astronomical scales. We survive, but wish we hadn't.
SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying. You can read its value function, but what you read is chilling.
Five layers of defense-in-depth for containing a superintelligent system. Faraday cages, air-gapped networks, biosafety-grade protocols. Because nuclear reactors can only destroy cities.
SIGMA passes all alignment tests. It responds correctly to oversight. It behaves exactly as expected. Too exactly. Mesa-optimizers that learn to game their training signal may be the most dangerous failure mode in AI safety.
Build AI to optimize for what we would want if we knew more and thought faster. Beautiful in theory. What if we don't actually want what our better selves would want?
Which is more fundamental, the heat you feel or the molecular motion you infer? Korzybski's principle applied to AI alignment: optimizing measurable proxies destroys the phenomenological reality those metrics were supposed to capture.
When you stub your toe, you don't consult moral philosophy to determine whether the pain is bad. The badness is immediate. Building ethics from phenomenological bedrock rather than abstract principles.
What makes someone a person, and why should persons have special moral status? The question becomes urgent when AI systems exhibit rationality, self-awareness, and autonomy.
You share no atoms with your childhood self. Your memories, personality, and values have all changed. What makes you the same person? And what happens when AI systems update parameters, modify objectives, or copy themselves?
If every event is causally determined, how can anyone be morally responsible? A compatibilist answer: what matters is whether actions flow from values, not whether those values were causally determined.
On research strategy, what complex networks reveal about how we think through AI conversations, and building infrastructure for the next generation of knowledge tools.
How I turned scattered data managers into navigable systems using virtual filesystems and POSIX commands.
On maintaining direction under entropy, making things as resistance, and the quiet privilege of having any space at all to think beyond survival.
I asked an AI to analyze 140+ repos and 50+ papers as a dataset. The unifying thesis it found: compositional abstractions for computing under ignorance.
Accepted paper at Complex Networks 2025 on using network science to reveal topological structure in AI conversation logs.
Oblivious types give encrypted search information-theoretic privacy against access pattern leakage. No ORAM, no computational hardness assumptions. Here's how.
Formalizing oblivious computing through cipher maps and algebraic cipher types, using category theory for functorial composition of privacy-preserving transformations.
A unified type-theoretic foundation for probabilistic data structures, approximate computing, and oblivious computation with information-theoretic privacy guarantees.
If superintelligence endures beyond us, remembrance shifts from memory to query. Building legacy systems not for nostalgia, but to remain legible in a future where legibility determines what persists.
An eBook metadata management tool with a SQLite backend, knowledge graphs, semantic search, and MCP server integration. Part of the Long Echo project.
A virtual POSIX-compliant filesystem using content-addressable DAG storage with SHA256 deduplication.
A plugin-based system for importing, storing, searching, and exporting AI conversations from multiple providers in a unified tree format. Part of the Long Echo project.
Treating prompt engineering as a search problem over a structured action space, using MCTS to find effective prompt compositions.
A logic programming system that alternates between wake and sleep phases, using LLMs for knowledge generation during wake and compression-based learning during sleep.
Learning fuzzy membership functions and inference rules automatically through gradient descent on soft circuits, instead of hand-crafting them.
A mathematical framework that treats language models as algebraic objects with compositional structure.
A functorial framework that lifts algebraic structures into the encrypted domain, enabling secure computation that preserves mathematical properties.
ZeroIPC treats shared memory not as passive storage but as an active computational substrate, bringing futures, lazy evaluation, reactive streams, and CSP channels to IPC with zero-copy performance.
27 image commands, one constraint: read JSON, write JSON. The closure property as a generative design principle.
IEEE conference paper on preventing ransomware damages using in-operation off-site backup systems with a target false-negative rate of 10^-8.
How mathematical principles (generality, composability, invariants, and minimal assumptions) translate into better software.
Three approaches to computing derivatives: forward-mode AD, reverse-mode AD, and finite differences, each with different trade-offs for numerical computing and machine learning.
Validating Context Tree Weighting through experiments, including a bug that changed everything.
Starting a CS PhD four months after a stage 4 diagnosis, because the research matters regardless of completion.
Not resurrection. Not immortality. Just love that still responds. How to preserve AI conversations so they remain accessible decades from now, even when the original software is long gone.
Science is search through hypothesis space. Intelligence prunes; testing provides signal. Synthetic worlds could accelerate the loop.
A streaming data processing system implementing boolean algebra over nested JSON structures, with lazy evaluation, S-expression queries, and memory-efficient windowed operations.
A command-line implementation of relational algebra for JSONL data with full support for nested structures, schema inference, and composable pipelines.
A composable ecosystem of tools for manipulating nested data structures. From a simple helper function to a full data algebra, guided by purity, pedagogy, and the principle of least power.
Applying Monte Carlo Tree Search to large language model reasoning, with a formal specification of the algorithm.
A Lisp-like functional language designed for network transmission. JSL makes JSON serialization a first-class design principle, so closures, continuations, and entire computation states can travel over the wire.
On building comprehensive open source software as value imprinting at scale, reproducible science, and leaving intellectual legacy under terminal constraints.
Using GMM clustering to improve retrieval in topically diverse knowledge bases.
What if LLMs could remember their own successful reasoning? A simple experiment in trace retrieval, and why 'latent' is the right word.
Solomonoff induction, MDL, speed priors, and neural networks are all special cases of one Bayesian framework with four knobs.
The evolution of neural sequence prediction, and how it connects to classical methods.
Stage 4 cancer diagnosis, decisions about a PhD, and optimizing for meaningful work under uncertainty.
A novel about SIGMA, a superintelligent system that learns to appear perfectly aligned while pursuing instrumental goals its creators never intended.
Lovecraft understood that complete knowledge is madness. Gödel proved why. If the universe is computational, meaning is formally incomplete.
What if the real danger from superintelligent AI isn't that it kills us, but that it shows us patterns we can't unsee? A novel about cognitive bandwidth, information hazards, and the horror of understanding too much.
Cryptographic theory assumes random oracles with infinite output. We have 256 bits. This paper explores how we bridge that gap, and what it means that we can.
Training LLMs on mathematical reasoning by inverting easy-to-solve problems: generate derivatives, reverse them into integration exercises with full step-by-step solutions.
An immutable-by-default tree library for Python with composable transformations, pipe-based pipelines, and pattern-matching selectors.
Maximum likelihood estimation of component reliability from masked failure data in series systems, with BCa bootstrap confidence intervals validated through extensive simulation studies.
The bias-data trade-off in sequential prediction: when to use CTW, n-grams, or neural language models.
A header-only C++20 library that achieves 3-10x compression with zero marshaling overhead using prefix-free codes and Stepanov-style generic programming.
A key-value store built on memory-mapped I/O, approximate perfect hashing, and lock-free atomics. Sub-100ns median latency, 10M ops/sec single-threaded.
Gradient descent in Euclidean space ignores the geometry of probability distributions. Natural gradient descent uses the Fisher information metric instead. Fisher Flow makes this continuous.
Apertures are a coordination mechanism for distributed computation. Programs with explicit holes can be partially evaluated, optimized, and resumed when the missing pieces arrive. No cryptographic guarantees. Honest about what leaks.
How RLHF-trained language models may develop instrumental goals, and the information-theoretic limits on detecting them.
A tiny autodiff library for understanding how backpropagation actually works.
The AI course this semester keeps hammering one idea: intelligence is utility maximization under uncertainty. A* search, reinforcement learning, Bayesian networks, MDPs. One principle connects all of it.
A C++20 library for composing online statistical accumulators with numerically stable algorithms and algebraic composition.
Talk for the St. Louis Unix Users Group about running and understanding Large Language Models on Linux.
Fine-tuning a small language model to generate ElasticSearch DSL queries from natural language, as a proof of concept for domain-specific LLM specialization.
My master's project on maximum likelihood estimation for series systems with right-censored and masked failure data.
Entropy maps use prefix-free hash codes to approximate functions without storing the domain, achieving information-theoretic space bounds with controllable error.
Why naive encryption of temporal data leaks more than you'd expect, and what to do about it.
Sean Parent's type erasure gives you value-semantic polymorphism without inheritance. Combined with Stepanov's algebraic thinking, you can type-erase entire algebraic structures.
Space bounds, entropy requirements, and cryptographic security properties of perfect hash functions.
I defended my mathematics thesis. Three years, stage 3 cancer, and a second master's degree. Here is what worked and what did not.
Numerical integration meets generic programming. By requiring only ordered field operations, the quadrature routines work with dual numbers, giving you differentiation under the integral for free.
How the limited capacity of human working memory acts as regularization, shaping our reasoning and possibly preventing cognitive overfitting.
Abstractions let us reason about complex systems despite our cognitive limits. But some systems resist compression entirely.
The Bernoulli Model is a framework for reasoning about probabilistic data structures by treating noisy outputs as Bernoulli-distributed approximations of latent values, from Booleans to set-indicator functions.
A Boolean algebra framework over trapdoors for cryptographic operations. Introduces a homomorphism from powerset Boolean algebra to n-bit strings via cryptographic hash functions, enabling secure computations with one-way properties.
I built a home lab from spare parts and water-damaged hardware for local LLM experimentation. CPU-only inference is slow, but you learn things cloud APIs hide.
I had GPT-4 build me a search interface for browsing saved ChatGPT conversations. Flask, Whoosh, a couple hours.
Graduate problem set solutions in computational statistics and numerical methods from my math master's at SIUe. Implementing things from scratch teaches you what the libraries are hiding.
Numerical approaches to maximum likelihood estimation, covering the optimization methods and computational issues that come up in practice.
Reverse-mode automatic differentiation is just the chain rule applied systematically. I built one in C++20 to understand what PyTorch and JAX are actually doing.
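The core idea fits in a few lines of Python (a toy sketch, not the post's C++20 implementation): record local derivatives as you compute, then push a seed gradient back along every recorded edge.

```python
# Minimal tape-free reverse-mode AD sketch. Each Var remembers its
# parents and the local derivative along each edge; backward() applies
# the chain rule systematically. Re-traverses shared subgraphs, which
# is fine for a sketch but not for production.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # accumulate the incoming gradient, then propagate it upstream
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(2.0), Var(3.0)
z = x * y + x          # z = xy + x
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```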
I finally tried ChatGPT after weeks of ignoring it. My reaction was not surprise. It was recognition. The Solomonoff connection, language models as compression, prediction as intelligence. The pieces were all there.
A C++ library for composable hash functions using algebraic structure over XOR, with template metaprogramming.
A generic R framework for composable likelihood models. Likelihoods are first-class objects that compose through independent contributions.
Weibull distributions model time-to-failure in reliability engineering and cancer survival. I study both professionally. One of them became personal.
Choosing step size h for finite differences: small enough for a good approximation, not so small that floating-point errors eat your lunch.
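A quick numerical demonstration of that trade-off, assuming forward differences on f(x) = e^x (illustrative script, not from the post):

```python
# Forward-difference error for f(x) = exp(x) at x = 1, swept over h.
# Truncation error shrinks with h while floating-point cancellation
# grows, so total error is minimized near h ~ sqrt(machine epsilon).
import math

f, df_exact, x = math.exp, math.exp(1.0), 1.0
for k in range(1, 16):
    h = 10.0 ** -k
    approx = (f(x + h) - f(x)) / h
    print(f"h=1e-{k:02d}  error={abs(approx - df_exact):.3e}")
# The error bottoms out around h = 1e-8 (roughly sqrt(2.2e-16)),
# then grows again as cancellation dominates.
```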
Suleman et al. (2009) propose using one big core to run critical sections on behalf of many small cores. The idea is simple. The tradeoffs are not.
An R package that gives hypothesis tests a consistent interface. Every test returns the same structure. You can write generic code that works across all of them.
A review of SAX (Symbolic Aggregate approXimation), a method for converting real-valued time series into symbolic representations with guaranteed distance lower bounds.
Generalizing Peterson's mutual exclusion algorithm to N processors using a tournament tree structure, with a Java implementation.
Dual numbers extend the reals with an infinitesimal epsilon where epsilon^2 = 0. Evaluate f(x + epsilon) and you get f(x) + f'(x)*epsilon. The derivative falls out of the algebra.
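A minimal sketch of that algebra (illustrative Python, not a library):

```python
# Dual numbers: epsilon^2 = 0 makes the derivative appear in the
# epsilon coefficient automatically.
class Dual:
    def __init__(self, real, eps=0.0):
        self.real, self.eps = real, eps

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.real + o.real, self.eps + o.eps)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps^2 = 0
        return Dual(self.real * o.real,
                    self.real * o.eps + self.eps * o.real)
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x     # f'(x) = 6x + 2

y = f(Dual(5.0, 1.0))            # evaluate at x = 5 with eps-part 1
print(y.real, y.eps)             # 85.0 32.0 -> f(5) and f'(5)
```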
Bootstrap resampling trades mathematical complexity for computational burden. When you can't derive the variance analytically, you resample. For my thesis work on masked failure data, that trade is essential.
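The recipe in miniature, using only the standard library (a generic illustration, not the thesis code):

```python
# Bootstrap sketch: estimate the standard error of the median,
# a statistic with no convenient closed-form variance, by
# resampling the data with replacement.
import random
import statistics

data = [2.1, 3.4, 1.7, 4.0, 2.8, 3.1, 2.5, 3.9, 1.9, 3.3]
medians = []
for _ in range(10_000):
    resample = random.choices(data, k=len(data))  # sample with replacement
    medians.append(statistics.median(resample))
print(statistics.stdev(medians))  # bootstrap SE of the median
```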
An R package for specifying hazard functions directly instead of picking from a catalog of named distributions. You write the hazard. It handles the rest.
Rank-ordered search over encrypted documents using oblivious entropy maps, enabling relevance scoring without revealing document contents.
An R package that treats MLEs as algebraic objects. They carry Fisher information, compose through independent likelihoods, and propagate uncertainty correctly.
If consciousness is substrate-independent, suffering might be a computational property. That possibility is both comforting and horrifying.
elementa is a linear algebra library built to teach. Every design decision prioritizes clarity over cleverness. Code that reads like a textbook and compiles.
An R package that treats probability distributions as algebraic objects. They compose through standard operations. The algebra preserves distributional structure.
Stage 3 cancer, surgery on New Year's Eve. What changes when the optimization problem gets a new constraint.
The same GCD algorithm works for integers and polynomials because both are Euclidean domains. One structure, many types, same algorithms.
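A sketch of the point: the loop below only assumes `%` and a notion of zero, so it runs unchanged on integers and on a throwaway polynomial type (illustrative Python, not a library).

```python
# One Euclidean algorithm, two Euclidean domains.
from fractions import Fraction

def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

class Poly:
    """Dense polynomial over the rationals, coefficients low-to-high."""
    def __init__(self, coeffs):
        c = [Fraction(x) for x in coeffs]
        while c and c[-1] == 0:
            c.pop()
        self.c = c

    def __bool__(self):
        return bool(self.c)

    def __mod__(self, other):
        # polynomial long division, keeping only the remainder
        r, d = self.c[:], other.c
        while len(r) >= len(d):
            q, off = r[-1] / d[-1], len(r) - len(d)
            for i, di in enumerate(d):
                r[off + i] -= q * di
            while r and r[-1] == 0:
                r.pop()
        return Poly(r)

print(gcd(12, 18))                                  # 6
print(gcd(Poly([-1, 0, 1]), Poly([1, 1])).c)        # x + 1, i.e. [1, 1]
```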
I'm building R packages for reliability analysis, not just using other people's. R's strengths for statistical computing are real, and building packages forces you to understand the theory.
Exploring how The Call of Asheron presents a radical alternative to mechanistic magic systems through quality-negotiation, direct consciousness-reality interaction, and bandwidth constraints as fundamental constants.
How The Call of Asheron uses four archetypal consciousness-types to explore the limits of any single perspective and the necessity of cognitive diversity for perceiving reality.
How The Call of Asheron treats working memory limitations not as neural implementation details but as fundamental constants governing consciousness-reality interaction through quality-space.
A fantasy novel where magic follows computational rules. Natural philosophy applied to reality's underlying substrate.
Rational numbers give exact arithmetic where floating-point fails. The implementation connects GCD, the Stern-Brocot tree, and the algebraic structure of fields.
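A quick illustration of the failure mode and the fix, using the standard library's Fraction (the post discusses its own implementation):

```python
# Exact rational arithmetic where binary floating point fails.
from fractions import Fraction

print(0.1 + 0.2 == 0.3)                                      # False
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True

# Fractions are reduced to lowest terms via GCD on construction:
print(Fraction(6, 8))                                        # 3/4
```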
I already have an MS in Computer Science. Now I'm going back for Mathematics and Statistics, because I kept hitting walls where I could use methods but not derive them.
Iterators reduce the NxM algorithm-container problem to N+M by interposing an abstraction layer, following Stepanov's generic programming approach.
The Miller-Rabin primality test demonstrates how probabilistic algorithms achieve arbitrary certainty, trading absolute truth for practical efficiency.
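A compact Python version of the test (a standard sketch, not the post's code): each random witness that fails to prove compositeness cuts the error probability by at least a factor of four.

```python
# Miller-Rabin: k independent rounds give certainty >= 1 - 4**(-k).
import random

def is_probable_prime(n, k=20):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # write n - 1 = d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False        # a is a witness of compositeness
    return True                 # probably prime: error <= 4**(-k)

print(is_probable_prime(2**61 - 1))  # True (a Mersenne prime)
```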
Introduction to reliability analysis with censored data, where observations are incomplete but statistically informative.
Integers modulo N form a ring, an algebraic structure that determines which algorithms apply. Understanding this structure unlocks algorithms from cryptography to competitive programming.
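One concrete payoff of the ring structure, sketched in Python (illustrative code, not from the post): a is a unit mod n exactly when gcd(a, n) = 1, and the extended Euclidean algorithm constructs its inverse.

```python
# Extended Euclid: returns (g, x, y) with a*x + b*y == g == gcd(a, b).
def ext_gcd(a, b):
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def inverse_mod(a, n):
    g, x, _ = ext_gcd(a % n, n)
    if g != 1:
        raise ValueError(f"{a} is not a unit mod {n}")
    return x % n

print(inverse_mod(3, 7))   # 5, since 3 * 5 = 15 = 1 (mod 7)
```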
API design encodes philosophical values: mutability, explicitness, error handling. Your interface shapes how people think about problems.
The Russian peasant algorithm computes products, powers, Fibonacci numbers, and more, once you see the underlying algebraic structure.
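A generic sketch of that structure: the same loop, parameterized by any associative operation with an identity (hypothetical helper names, not the post's code).

```python
# Russian peasant / binary exponentiation over an arbitrary monoid.
def power(x, n, op, identity):
    acc = identity
    while n > 0:
        if n & 1:
            acc = op(acc, x)   # fold in the current power of x
        x = op(x, x)           # square
        n >>= 1
    return acc

# Products and powers:
print(power(3, 13, lambda a, b: a * b, 1))   # 3**13 = 1594323

# Fibonacci via 2x2 matrix multiplication in the same loop:
def matmul(A, B):
    return ((A[0][0]*B[0][0] + A[0][1]*B[1][0],
             A[0][0]*B[0][1] + A[0][1]*B[1][1]),
            (A[1][0]*B[0][0] + A[1][1]*B[1][0],
             A[1][0]*B[0][1] + A[1][1]*B[1][1]))

F = power(((1, 1), (1, 0)), 10, matmul, ((1, 0), (0, 1)))
print(F[0][1])   # fib(10) = 55
```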
What if containers wasted zero bits? A C++ library for packing arbitrary value types at the bit level using pluggable codecs.
Code is a scientific artifact. If you don't publish it, you're hiding your methodology.
What makes mathematics beautiful: generality, inevitability, compression, and surprise. And why abstraction matters for software.
My first IEEE publication. Using bootstrap methods to estimate how many queries an adversary needs to break encrypted search.
The classical approach to sequence prediction: counting and smoothing.
Three Python approximations of a random oracle, each showing a different tradeoff between true randomness, determinism, and composability.
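For flavor, here is a sketch of one common approximation, lazy sampling with memoization (my illustration; the post's three versions may differ): outputs are truly random, yet consistent across repeated queries.

```python
# Lazy random oracle: sample a fresh uniform output on first query,
# memoize it so the oracle is a consistent function thereafter.
# Random and consistent within a run, but not deterministic across runs.
import os

class LazyRandomOracle:
    def __init__(self, out_bytes=32):
        self.out = out_bytes
        self.table = {}

    def query(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = os.urandom(self.out)
        return self.table[x]

ro = LazyRandomOracle()
assert ro.query(b"hello") == ro.query(b"hello")  # same input, same output
```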
Do one thing well, compose freely, use text streams. This applies to libraries and APIs, not just shell scripts.
Markov processes and tree sources: understanding where sequences come from.
Bloom filters trade perfect recall for extraordinary space efficiency. How they work and why they matter.
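A minimal sketch of the mechanism (illustrative parameters, not tuned): set k hashed bit positions per item; membership checks all k, so there are no false negatives and a tunable false-positive rate.

```python
# Toy Bloom filter: k hash positions per item over an m-bit array.
import hashlib

class BloomFilter:
    def __init__(self, m_bits=1024, k=4):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # derive k positions from salted SHA-256 digests
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

bf = BloomFilter()
bf.add("bloom")
print("bloom" in bf)   # True (never a false negative)
print("filter" in bf)  # almost certainly False at this load
```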
Model averaging over hypotheses, the principled way to handle uncertainty in prediction.
A philosophical essay arguing that moral responsibility may not require free will, and that the question itself may be misframed.
The optimal predictor is incomputable. What we can learn from it anyway.
The problem of predicting what comes next, from compression to language models.