Articles
- 2022/11/04 Human-level Atari 200x faster
- 2020/12/24 TLeague Framework
- 2020/12/24 Ideas and Mechanisms of AlphaStar System
- 2020/11/09 VIM Macro: A First Look
- 2020/09/22 Agent57: Outperforming the Atari Human Benchmark
- 2020/09/20 Multi-Armed Bandit Problem: A Brief Introduction
- 2020/09/16 Retrace: Safe and Efficient Off-Policy Reinforcement Learning
- 2020/09/14 Never Give Up: Learning Directed Exploration Strategies
- 2020/09/10 R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
- 2020/09/08 Observe and Look Further: Achieving Consistent Performance on Atari
- 2020/09/04 Ape-X: Distributed Prioritized Experience Replay
- 2020/09/01 DRQN: Deep Recurrent Q-Learning for Partially Observable MDPs
- 2020/08/21 Dueling DQN: Dueling Network Architectures for Deep Reinforcement Learning
- 2020/08/20 Noisy Networks for Exploration
- 2020/08/17 Categorical DQN: A Distributional Perspective on Reinforcement Learning
- 2018/06/24 Use GANs to Generate Pokemon (PyTorch)
- 2018/02/25 Maximum Entropy Inverse Reinforcement Learning