Using Q-learning to play Super Mario Bros. for the NES.

Both the neural networks and the replay buffers are based on tutorials from Adventures in Machine Learning.

Requirements

gym: pip install gym

nes-py: pip install nes-py

gym-super-mario-bros: pip install gym-super-mario-bros

numpy: pip install numpy

tensorflow: pip install tensorflow

Visual Studio: on Windows, install the “Desktop development with C++” workload
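Once the packages are installed, a quick way to confirm that the emulator and environment work is to run a few random steps. This is only a sketch based on the standard gym-super-mario-bros usage, not part of main.py:

```python
from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Create the level used as the default --env and restrict the action space.
env = gym_super_mario_bros.make('SuperMarioBros-1-1-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for _ in range(500):
    if done:
        state = env.reset()
    # Random actions only; the trained agent would pick argmax Q(s, a) instead.
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()
env.close()
```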

Training the Agent

python main.py

| Arg | Default | Description |
| --- | --- | --- |
| --env | SuperMarioBros-1-1-v0 | Gym environment and level |
| --dueling | | Use a Dueling DQN instead of a DDQN (sketched below) |
| --per | | Use Prioritized Experience Replay instead of the standard replay buffer |
| --frame_size | (84, 84) | Resolution frames are resized to; lower it to decrease the computational load |
| --max_timesteps | 5e5 | Total timesteps the environment will play |
| --delay_timesteps | 1e4 | Initial timesteps used to fill the replay buffer |
| --min_epsilon | 0.01 | The lowest value for epsilon |
| --epsilon_decay | 45e4 | Timesteps required to reach the minimum epsilon |
| --beta_decay | 45e4 | Timesteps required to reach the maximum beta (used by PER) |
| --learning_rate | 1e-3 | Learning rate for the optimizer |
| --gamma | 0.99 | Discount factor applied to Q(s', a') |
| --tau | 0.05 | Rate for the soft update of the target network |
| --memory_size | 131072 | Size of the replay buffer (PER requires a power of 2) |
| --batch_size | 64 | Size of the mini-batches for Q-learning |
| --debug_summary | | Save TensorBoard summaries of training |
| --save_model | | Store the models at set intervals |
| --save_freq | 5e4 | Interval (in timesteps) at which the models' weights are saved |
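The --dueling flag switches the network to the Dueling DQN architecture, which splits the final layers into a state-value stream and an advantage stream. The snippet below is only an illustrative Keras sketch of that idea; the layer sizes and names are assumptions, not the layers used in main.py:

```python
import tensorflow as tf
from tensorflow.keras import layers

def dueling_q_values(features, num_actions):
    """Combine value and advantage streams into Q-values (illustrative only)."""
    # State-value stream: V(s)
    value = layers.Dense(1)(layers.Dense(256, activation="relu")(features))
    # Advantage stream: A(s, a)
    advantage = layers.Dense(num_actions)(layers.Dense(256, activation="relu")(features))
    # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), so the two streams stay identifiable
    return value + advantage - tf.reduce_mean(advantage, axis=1, keepdims=True)
```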

For example, to train a DDQN model with a PER memory buffer using mini-batches of 32 samples:

python main.py --per --batch_size 32
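The exploration and target-network arguments in the table work together: epsilon is annealed down to --min_epsilon over --epsilon_decay timesteps, and the target network is moved toward the online network at rate --tau. The sketch below assumes a linear schedule starting at 1.0 and a Polyak-style soft update; those details are assumptions about the common pattern, not read from main.py:

```python
def epsilon_at(t, min_epsilon=0.01, epsilon_decay=45e4, start=1.0):
    # Linear anneal from `start` to `min_epsilon` over `epsilon_decay` timesteps
    # (the start value and the linear schedule are assumptions).
    return max(min_epsilon, start - (start - min_epsilon) * t / epsilon_decay)

def soft_update(target_weights, online_weights, tau=0.05):
    # Polyak averaging: move each target weight a fraction tau toward the online weight.
    return [tau * w + (1.0 - tau) * tw for w, tw in zip(online_weights, target_weights)]
```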