Advancing Mathematical Reasoning in AI: Introducing Reverse-Process Synthetic Data Generation
A reverse-process approach to synthetic data generation for training LLMs on mathematical reasoning, producing step-by-step solutions from worked examples.
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
The classical AI curriculum teaches rational agents as utility maximizers. The progression from search to RL to LLMs is really about one thing: finding representations that make decision-making tractable.
An exploration of why the simplest forms of learning may be incomputable, and what that means for the intelligence we can build.
Modern graduate ML text with causal inference, decision making, and ML foundations. Accessible free textbook with strong conceptual framing.
SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying — you can read its value function, but what you read is chilling.
A logic programming system that alternates between wake and sleep phases—using LLMs for knowledge generation during wake, and compression-based learning during sleep.
A novel approach that learns fuzzy membership functions and inference rules automatically through gradient descent on soft circuits.
Three approaches to computing derivatives—forward-mode AD, reverse-mode AD, and finite differences—each with different trade-offs. Understanding when to use each is essential for numerical computing and machine learning.
Science is search through hypothesis space. Intelligence prunes; testing provides signal. Synthetic worlds could accelerate the loop.
Applying Monte Carlo Tree Search to large language model reasoning with a rigorous formal specification.
Using GMM clustering to improve retrieval in topically diverse knowledge bases
What if LLMs could remember their own successful reasoning? A simple experiment in trace retrieval, and why 'latent' is the right word.
Solomonoff induction, MDL, speed priors, and neural networks are all special cases of one Bayesian framework with four knobs.
Using Fisher information and information geometry for optimization problems.
A minimal implementation of automatic differentiation for educational purposes.
Intelligence as utility maximization under uncertainty — a unifying framework connecting A* search, reinforcement learning, Bayesian networks, and MDPs. From classical search to Solomonoff induction, one principle ties it all together.
Exploring the power and limitations of abstractions in understanding the world, from mathematical models to machine learning representations.
Exploring how the limited capacity of human working memory acts as a form of regularization, shaping our reasoning and potentially preventing cognitive overfitting.
Reverse-mode automatic differentiation powers modern machine learning. Understanding how it works demystifies PyTorch, JAX, and TensorFlow—it's just the chain rule applied systematically.
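To make "the chain rule applied systematically" concrete, here is a minimal sketch of a reverse-mode node in Python. The `Value` class and the tiny expression are illustrative assumptions, not the post's actual implementation or any framework's API:

```python
class Value:
    """Minimal reverse-mode autodiff node: record local derivatives, replay the chain rule backward."""
    def __init__(self, data, parents=(), locals_=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents   # nodes this value was computed from
        self._locals = locals_    # partial derivative w.r.t. each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self, upstream=1.0):
        # Chain rule: accumulate the upstream gradient, push local * upstream to parents.
        self.grad += upstream
        for parent, local in zip(self._parents, self._locals):
            parent.backward(upstream * local)

x, y = Value(2.0), Value(3.0)
z = x * y + x          # z = xy + x, so dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

This naive recursive walk revisits shared nodes once per path, which is correct but inefficient; real frameworks do a single backward pass over a topologically sorted graph.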
Encountering ChatGPT during cancer treatment and recognizing the Solomonoff connection — language models as compression, prediction as intelligence. A personal inflection point reconnecting with AI research after years in survival mode.
Dual numbers extend our number system with an infinitesimal epsilon where epsilon^2 = 0. Evaluating f(x + epsilon) yields f(x) + epsilon * f'(x)—the derivative emerges automatically from the algebra.
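A minimal sketch of how that algebra might look in code, assuming a small illustrative `Dual` class and polynomial `f` (not taken from the post itself):

```python
from dataclasses import dataclass

@dataclass
class Dual:
    """A dual number val + eps*d, where eps**2 == 0."""
    val: float        # real part, f(x)
    eps: float = 0.0  # infinitesimal part, carries f'(x)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)

    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1   # f'(x) = 6x + 2

# Seed the infinitesimal part with 1.0; the eps part of the result is f'(2) = 14.
result = f(Dual(2.0, 1.0))
print(result.val, result.eps)  # 17.0 14.0
```

No symbolic manipulation and no finite-difference step size: the derivative falls out of carrying the eps coefficient through ordinary arithmetic.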
The fundamental problem of predicting what comes next—from compression to language models