June 25, 2024
Check out the (early) project and source code on GitHub.
Abstract:
This paper introduces a methodology for generating high-quality, diverse training data for Language Models (LMs) in complex problem-solving domains. Our approach, termed …
January 18, 2026
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
January 15, 2026
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
December 19, 2025
An exploration of why the simplest forms of learning may be incomputable, and what that means for the intelligence we can build.
December 17, 2025
Notes
A modern graduate ML textbook covering causal inference, decision making, and ML foundations. Free, accessible, and conceptually well framed.
December 17, 2025
December 3, 2025
December 1, 2025
November 4, 2025
SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying — you can read its value function, but what you read is chilling.
October 8, 2025
A logic programming system that alternates between wake and sleep phases—using LLMs for knowledge generation during wake, and compression-based learning during sleep.
October 7, 2025
October 7, 2025
October 7, 2025
October 7, 2025
A novel approach that learns fuzzy membership functions and inference rules automatically through gradient descent on soft circuits.
January 15, 2025
Three approaches to computing derivatives—forward-mode AD, reverse-mode AD, and finite differences—each with different trade-offs. Understanding when to use each is essential for numerical computing and machine learning.
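One trade-off worth seeing concretely: finite differences are trivial to implement but only approximate, with accuracy limited by truncation and floating-point cancellation. A minimal sketch (illustrative, not from the post):

```python
import math

def fd_central(f, x, h=1e-5):
    # Central finite difference: O(h^2) truncation error, but shrinking h
    # too far trades truncation error for floating-point cancellation.
    return (f(x + h) - f(x - h)) / (2 * h)

approx = fd_central(math.sin, 1.0)
exact = math.cos(1.0)
print(abs(approx - exact))  # small but nonzero: FD is inherently approximate
```

Forward- and reverse-mode AD avoid this error entirely by propagating exact derivatives through each operation; the difference between the two modes is only the direction of propagation.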
January 5, 2025
Science is search through hypothesis space. Intelligence prunes; testing provides signal. Synthetic worlds could accelerate the loop.
December 1, 2024
Applying Monte Carlo Tree Search to large language model reasoning with a rigorous formal specification.
November 15, 2024
Using GMM clustering to improve retrieval in topically diverse knowledge bases.
October 15, 2024
What if LLMs could remember their own successful reasoning? A simple experiment in trace retrieval, and why 'latent' is the right word.
September 30, 2024
Solomonoff induction, MDL, speed priors, and neural networks are all special cases of one Bayesian framework with four knobs.
April 20, 2024
Using Fisher information and information geometry for optimization problems.
March 15, 2024
A minimal implementation of automatic differentiation for educational purposes.
March 12, 2024
Intelligence as utility maximization under uncertainty — a unifying framework connecting A* search, reinforcement learning, Bayesian networks, and MDPs. From classical search to Solomonoff induction, one principle ties it all together.
June 17, 2023
Exploring how the limited capacity of human working memory acts as a form of regularization, shaping our reasoning and potentially preventing cognitive overfitting.
January 17, 2023
Reverse-mode automatic differentiation powers modern machine learning. Understanding how it works demystifies PyTorch, JAX, and TensorFlow—it's just the chain rule applied systematically.
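The "chain rule applied systematically" claim can be made concrete in a few lines. The sketch below (a toy scalar version, not the post's code, and without the topological-sort ordering a real tape needs for deeper shared subgraphs) records each operation's local derivatives, then walks them backward:

```python
class Var:
    """Minimal scalar reverse-mode AD: record local derivatives, backprop the chain rule."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # Chain rule: accumulate seed * local derivative into each parent.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = x * x + x        # f(x) = x^2 + x
y.backward()
print(x.grad)        # 7.0, since f'(x) = 2x + 1 at x = 3
```

PyTorch, JAX, and TensorFlow differ mainly in how they build and schedule this graph, not in the underlying calculus.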
December 8, 2022
Encountering ChatGPT during cancer treatment and recognizing the Solomonoff connection — language models as compression, prediction as intelligence. A personal inflection point reconnecting with AI research after years in survival mode.
September 20, 2021
Dual numbers extend our number system with an infinitesimal epsilon where epsilon^2 = 0. Evaluating f(x + epsilon) yields f(x) + epsilon * f'(x)—the derivative emerges automatically from the algebra.
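The algebra in that summary can be run directly. A minimal sketch (illustrative only, assuming just addition and multiplication): the eps coefficient of every product picks up exactly the product-rule term, so f(x + eps) carries f'(x) for free.

```python
class Dual:
    """Dual number a + b*eps with eps^2 = 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a + o.a, self.b + o.b)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # (a1 + b1*eps)(a2 + b2*eps) = a1*a2 + (a1*b2 + b1*a2)*eps
        # since the eps^2 term vanishes -- this IS the product rule.
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1

d = f(Dual(2.0, 1.0))   # seed the eps coefficient with 1 at x = 2
print(d.a, d.b)          # 17.0 14.0: f(2) = 17 and f'(2) = 6*2 + 2 = 14
```

No symbolic manipulation and no step-size tuning: the derivative falls out of the arithmetic itself.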
February 1, 2020