AI / AlphaZero
๐ AlphaZero
Self-play reinforcement learning for board games. Studied and implemented concepts from DeepMind's AlphaZero.
AlphaZero is DeepMind's landmark algorithm that mastered Chess, Shogi, and Go using pure self-play reinforcement learning โ no human knowledge beyond the rules. I studied and implemented AlphaZero-style agents as part of my AI coursework. The core idea combines Monte Carlo Tree Search (MCTS) with a deep neural network that outputs both a policy (move probabilities) and a value (position evaluation). The agent improves through self-play, generating training data that refines the neural network iteratively. This represents a fascinating intersection of deep learning and classical search.