Hacky research code that trains policies for the CodeCraft real-time strategy game with proximal policy optimization.
Accompanying blog post: *Mastering Real-Time Strategy Games with Deep Reinforcement Learning: Mere Mortal Edition*
- Python >= 3.7, pip
- CodeCraft Server
Install dependencies with

```bash
pip install -r requirements.txt
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.6.0+${CUDA}.html
```

where `${CUDA}` should be replaced by either `cpu`, `cu92`, `cu101`, or `cu102`, depending on your PyTorch installation.
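If you are unsure which variant applies, your local PyTorch build reports its CUDA version; a quick check (assuming PyTorch is already installed):

```python
import torch

print(torch.__version__)   # e.g. "1.6.0"
print(torch.version.cuda)  # e.g. "10.2" -> use cu102; None -> use cpu
```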
If you want the training code to record metrics to Weights & Biases, run `wandb login`.
The first step is to set up and run the CodeCraft Server.
To train a policy with the default set of hyperparameters, run:

```bash
EVAL_MODELS_PATH=/path/to/golden-models python main.py --hpset=standard --out-dir=${OUT_DIR}
```

Logs and model checkpoints will be written to the `${OUT_DIR}` directory.
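For example, a concrete invocation might look like this (both paths are placeholders):

```bash
EVAL_MODELS_PATH=$HOME/golden-models python main.py --hpset=standard --out-dir=$HOME/runs/dcc
```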
If you want policies to be evaluated against a set of fixed opponents during training, download the required checkpoints (available here) into the corresponding subfolder of the directory specified by `EVAL_MODELS_PATH`. For evaluations with the standard config, you need `standard/curious-galaxy-40M.pt` and `standard/graceful-frog-100M.pt`. To disable evaluation of the policy during training, set `--eval_envs=0`.
To see additional options, run `python main.py --help` and consult `hyperparams.py`.
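Individual hyperparameters can be set on the command line; for example (assuming `lr` and `steps` from `hyperparams.py` are exposed as flags, which is not verified here):

```bash
python main.py --hpset=standard --lr=0.0003 --steps=300e6 --out-dir=${OUT_DIR}
```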
To run games with already-trained policies, run:

```bash
python showmatch.py /path/to/policy1.pt /path/to/policy2.pt --task=STANDARD --num_envs=64
```

You can then watch the games at http://localhost:9000/observe?autorestart=true&autozoom=true.
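For example, to pit the two evaluation checkpoints mentioned above against each other (assuming they have been downloaded to the current directory):

```bash
python showmatch.py standard/curious-galaxy-40M.pt standard/graceful-frog-100M.pt --task=STANDARD --num_envs=64
```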
The job runner allows you to schedule and execute many runs in parallel. The command

```bash
python runner.py --jobfile-dir=${JOB_DIR} --out-dir=${OUT_DIR} --concurrency=${CONCURRENCY}
```

starts a job runner that watches the `${JOB_DIR}` directory for new jobs, writes results to folders created in `${OUT_DIR}`, and runs up to `${CONCURRENCY}` experiments in parallel.
You can then schedule jobs with

```bash
python schedule.py --repo-path=https://github.com/cswinter/DeepCodeCraft.git --queue-dir=${JOB_DIR} --params-file=params.yaml
```

where `params.yaml` is a file that specifies the set of hyperparameters to use, for example:

```yaml
- hpset: standard
  adr_variety: [0.5, 0.3]
  lr: [0.001, 0.0003]
- hpset: standard
  repeat: 4
  steps: 300e6
```
The `repeat` parameter tells the job runner to spawn multiple runs with the same configuration. When a hyperparameter is set to a list of values, one experiment is spawned for each combination. The `params.yaml` above therefore spawns a total of 8 experiment runs: 4 runs for 300 million samples with the default set of hyperparameters, and one run for each of the 4 combinations of the `adr_variety` and `lr` values.
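A minimal sketch (not the actual runner code) of this expansion logic, assuming list-valued entries are swept over and `repeat` duplicates each resulting configuration:

```python
import itertools

def expand(entry):
    """Expand one params.yaml entry into a list of run configurations."""
    entry = dict(entry)
    repeat = entry.pop("repeat", 1)
    # Keys whose value is a list are swept over; scalars stay fixed.
    sweep = {k: v for k, v in entry.items() if isinstance(v, list)}
    fixed = {k: v for k, v in entry.items() if not isinstance(v, list)}
    runs = []
    for combo in itertools.product(*sweep.values()):
        run = {**fixed, **dict(zip(sweep, combo))}
        runs.extend(dict(run) for _ in range(repeat))
    return runs

entries = [
    {"hpset": "standard", "adr_variety": [0.5, 0.3], "lr": [0.001, 0.0003]},
    {"hpset": "standard", "repeat": 4, "steps": "300e6"},
]
runs = [run for entry in entries for run in expand(entry)]
print(len(runs))  # 8: 4 combinations of adr_variety x lr, plus 4 repeats
```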
The `${JOB_DIR}` may be on a remote machine that you can access via ssh/rsync, e.g. `--queue-dir=192.168.0.101:/home/clemens/xprun/queue`.
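Putting the two pieces together, scheduling jobs to a remote queue might look like:

```bash
python schedule.py --repo-path=https://github.com/cswinter/DeepCodeCraft.git \
  --queue-dir=192.168.0.101:/home/clemens/xprun/queue \
  --params-file=params.yaml
```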
```bibtex
@misc{DeepCodeCraft2020,
  author = {Winter, Clemens},
  title = {Deep CodeCraft},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/cswinter/DeepCodeCraft}}
}
```