Reinforcement Learning

Browse posts by tag

Reinforcement Learning: An Introduction

Notes

Mathematical RL fundamentals (MDPs, value functions, dynamic programming, approximate methods). RL foundational text that bridges theory and practice.

The Policy: Q-Learning vs Policy Learning

SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying — you can read its value function, but what you read is chilling.

AI Fiction

The Policy

The Policy

A speculative fiction novel exploring AI alignment, existential risk, and the fundamental tension between optimization and ethics. When a research team develops SIGMA, an advanced AI system designed to optimize human welfare, they must confront an …

The AI Course: Everything is Utility Maximization

Intelligence as utility maximization under uncertainty — a unifying framework connecting A* search, reinforcement learning, Bayesian networks, and MDPs. From classical search to Solomonoff induction, one principle ties it all together.