data
ref
results
README.md
hw5c_instructions.pdf
logz.py
plot.py
plot_3.py
point_mass.py
point_mass_observed.py
replay_buffer.py
requirements.txt
run_1.sh
run_21.sh
run_22.sh
run_23.sh
run_24.sh
run_25.sh
run_31.sh
run_32.sh
train_policy.py

README.md

CS294-112 HW 5c: Meta-Learning

Usage

To run all experiments and plot the figures for the report, run:

bash run_1.sh
bash run_21.sh
bash run_22.sh
bash run_23.sh
bash run_24.sh
bash run_25.sh
bash run_31.sh
bash run_32.sh
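
The commands above can also be driven from a single Python helper that stops at the first failing script. This is only a convenience sketch: the names `SCRIPTS`, `build_commands`, and `run_all` are hypothetical and not part of this repository, and it assumes the scripts are invoked from the repository root.

```python
import subprocess

# Script names taken from the list above, in the same order.
SCRIPTS = [
    "run_1.sh", "run_21.sh", "run_22.sh", "run_23.sh",
    "run_24.sh", "run_25.sh", "run_31.sh", "run_32.sh",
]


def build_commands(scripts=SCRIPTS):
    """Turn each script name into a `bash <script>` argument list."""
    return [["bash", script] for script in scripts]


def run_all(commands, runner=subprocess.run):
    """Run the commands in order; check=True raises on the first failure."""
    for command in commands:
        runner(command, check=True)
```

Calling `run_all(build_commands())` runs the eight scripts in sequence. The `runner` parameter exists only so the helper can be exercised without launching the experiments.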

Results

5.1 Problem 1

5.2 Problem 2

None of the feed-forward architectures achieves good performance. Among the recurrent architectures, history lengths of at least 15 achieve similar performance to one another and significantly outperform a history length of 1.

The recurrent architectures outperform the feed-forward architectures in all cases.
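
To make "history length" concrete, here is a minimal sketch, assuming it means the number of most recent time steps passed to the policy. The function names and the zero-padding at the start of an episode are assumptions for illustration, not code from `replay_buffer.py`.

```python
def build_history(meta_obs, t, history):
    """Return the `history` most recent entries ending at step t.

    Steps before the start of the episode are filled with zero vectors
    (an assumed padding scheme).
    """
    dim = len(meta_obs[0])
    start = max(t - history + 1, 0)
    window = [list(meta_obs[i]) for i in range(start, t + 1)]
    padding = [[0.0] * dim for _ in range(history - len(window))]
    return padding + window


def flatten(window):
    """Feed-forward input: concatenate the window into one flat vector."""
    return [value for step in window for value in step]


# Toy episode with 2-dimensional entries.
episode = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
window = build_history(episode, t=1, history=3)
flat = flatten(window)
```

Under these assumptions, a feed-forward policy would consume the flattened vector in one shot, while a recurrent policy would read the window one step at a time. With a history length of 1, the window holds only the current step, so neither architecture sees any past context.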

5.3 Problem 3

The policy performs similarly on the training and testing goals for grain sizes of 1 and 5. When I increase the grain size to 10, the algorithm overfits, presumably because the testing distribution deviates substantially from the training distribution.

Performance on the training goals improves significantly when the grain size is 10, presumably because the smaller number of possible directions among the training goals makes it easier for the algorithm to learn a good policy. However, testing performance is much worse, since the directions of the testing goals differ substantially from those of the training goals.
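
The effect of grain size can be illustrated with a small sketch of a checkerboard split between training and testing goals. Everything here is an assumption made for illustration: that goals lie in the square [-10, 10) x [-10, 10), that the grain size is the side length of one checkerboard square, and that squares with even index parity are the training squares. The actual rule is the one implemented in `point_mass.py`.

```python
import math
import random


def is_train_goal(goal, grain):
    """True if the goal lies on a square with even index parity (assumed rule)."""
    x, y = goal
    return (math.floor(x / grain) + math.floor(y / grain)) % 2 == 0


def sample_goal(rng, grain, train=True, low=-10.0, high=10.0):
    """Rejection-sample a 2D goal from the training or the testing squares."""
    while True:
        goal = (rng.uniform(low, high), rng.uniform(low, high))
        if is_train_goal(goal, grain) == train:
            return goal


rng = random.Random(0)
train_goals = [sample_goal(rng, grain=10, train=True) for _ in range(200)]
test_goals = [sample_goal(rng, grain=10, train=False) for _ in range(200)]
```

Under these assumptions, a grain size of 10 splits the goal region into four squares, so training goals fall in two opposite quadrants and testing goals in the other two. With a grain size of 1, training and testing squares are finely interleaved, so the two sets of goals cover nearly the same directions.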

Original README

Dependencies:

Python 3.5
Numpy version 1.14.5
TensorFlow version 1.10.5
MuJoCo version 1.50 and mujoco-py 1.50.1.56
OpenAI Gym version 0.10.5
seaborn
Box2D==2.3.2

See the HW5c PDF for further instructions.
