Reverse-Process Synthetic Data Generation for Math Reasoning
Training LLMs on mathematical reasoning by inverting easy-to-solve problems: differentiate known functions, then reverse each result into an integration exercise with a full step-by-step solution.
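A minimal sketch of the reverse-process idea, restricted to integer polynomials so it needs only the standard library. The coefficient-list representation and the names `differentiate`, `fmt`, and `reverse_to_integration` are assumptions made for illustration, not the post's actual pipeline.

```python
def differentiate(coeffs):
    """coeffs[i] is the coefficient of x**i; return the derivative's coefficients."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [0]


def fmt_term(c, power):
    """Render a single term c*x^power as text."""
    if power == 0:
        return str(c)
    if power == 1:
        return f"{c}*x"
    return f"{c}*x^{power}"


def fmt(coeffs):
    """Render a coefficient list as a polynomial, highest power first."""
    terms = [fmt_term(c, p) for p, c in reversed(list(enumerate(coeffs))) if c != 0]
    return " + ".join(terms).replace("+ -", "- ") if terms else "0"


def reverse_to_integration(F):
    """Start from a known antiderivative F, take the easy direction
    (differentiation), then emit the hard direction as a training example."""
    f = differentiate(F)
    steps = []
    for k, c in enumerate(f):
        if c == 0:
            continue
        # c == (k + 1) * F[k + 1], so the integer division below is exact.
        steps.append(
            f"Power rule: integral of {fmt_term(c, k)} dx = {fmt_term(c // (k + 1), k + 1)}"
        )
    # The constant term of F is lost by differentiation, so the answer uses + C.
    answer = fmt([0] + list(F[1:])) + " + C"
    return {"problem": f"Integrate {fmt(f)} dx", "steps": steps, "answer": answer}


example = reverse_to_integration([5, 2, 0, 4])  # F(x) = 4x^3 + 2x + 5
print(example["problem"])
for step in example["steps"]:
    print(" ", step)
print(example["answer"])
```

Because every exercise is built from an antiderivative that is already known, the solution is correct by construction and no symbolic integrator is required at generation time.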
Step-by-step reasoning via prompting. Unlocked a new capability class.
Interleaving reasoning traces and actions. The prompting pattern behind most LLM agents.
The most dramatic possibility in AI might arrive through the most mundane mechanism. Not a beam of sacred light. A sufficiently good build system.
What if reasoning traces could learn their own usefulness? A simple RL framing for trace memory, and why one reward signal is enough.
Applying Monte Carlo Tree Search to large language model reasoning, with a formal specification of the algorithm.
What if LLMs could remember their own successful reasoning? A simple experiment in trace retrieval, and why 'latent' is the right word.