AI / AlphaZero

🏆 AlphaZero

Self-play reinforcement learning for board games. Studied and implemented concepts from DeepMind's AlphaZero.

AlphaZero is DeepMind's landmark algorithm that mastered Chess, Shogi, and Go using pure self-play reinforcement learning — no human knowledge beyond the rules. I studied and implemented AlphaZero-style agents as part of my AI coursework. The core idea combines Monte Carlo Tree Search (MCTS) with a deep neural network that outputs both a policy (move probabilities) and a value (position evaluation). The agent improves through self-play, generating training data that refines the neural network iteratively. This represents a fascinating intersection of deep learning and classical search.