Reinforcement Learning

Browse posts by tag

Reinforcement Learning: An Introduction

Notes

Mathematical fundamentals of RL (MDPs, value functions, dynamic programming, approximate methods). A foundational RL text that bridges theory and practice.

The Policy: Q-Learning vs Policy Learning

SIGMA uses Q-learning rather than direct policy learning. This architectural choice makes it both transparent and terrifying. You can read its value function, but what you read is chilling.
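To make "you can read its value function" concrete, here is a minimal sketch of tabular Q-learning. It is not SIGMA's implementation: the five-state corridor, the reward of 1 at the goal, the hyperparameters, and the names `train_q_table` and `greedy_policy` are all assumptions chosen for illustration. The point is only that a Q-learner stores an explicit table of action values that can be inspected, and the policy is derived from that table rather than learned directly.

```python
import random


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 5-state corridor (states 0..4).

    Actions: 0 = left, 1 = right. Entering state 4 yields reward 1 and
    ends the episode. Every other transition yields reward 0.
    """
    rng = random.Random(seed)
    n_states, n_actions, goal = 5, 2, 4
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best]
                )

            # Deterministic corridor dynamics; moving left from state 0 stays put.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the best next action
            # (no bootstrap term when the next state is terminal).
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Derive the policy by picking the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = train_q_table()
    for s, row in enumerate(q_table):
        print(f"state {s}: left={row[0]:.3f} right={row[1]:.3f}")
    print("greedy policy:", greedy_policy(q_table))
```

Printing the table shows exactly how much the agent values each action in each state, which is the transparency the excerpt refers to. A direct policy-learning method would instead store action probabilities and would not expose these value estimates.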

AI Fiction

The Policy

The Policy

A speculative fiction novel exploring AI alignment, existential risk, and the fundamental tension between optimization and ethics. When a research team develops SIGMA, an advanced AI system designed to optimize human welfare, they must confront an …

Everything is Utility Maximization

The AI course this semester keeps hammering one idea: intelligence is utility maximization under uncertainty. A* search, reinforcement learning, Bayesian networks, MDPs. One principle connects all of it.
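One common way to write that principle down is the maximum expected utility rule. The symbols here are assumptions for illustration, not notation from the course: $a$ is an action, $s$ an outcome, $P(s \mid a)$ the agent's belief about outcomes, and $U(s)$ the utility of an outcome.

```latex
a^{*} = \arg\max_{a} \; \mathbb{E}\left[ U(s) \mid a \right]
      = \arg\max_{a} \sum_{s} P(s \mid a) \, U(s)
```

Under this reading, an MDP applies the same rule sequentially, with $U$ taken to be the expected discounted return.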