The feed-forward architectures fail to achieve good performance in any of the cases. For the recurrent architectures, history lengths of at least 15 achieve similar performance to one another and significantly outperform a history length of 1.
The recurrent architectures outperform the feed-forward architectures in all cases.
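To make the comparison concrete, the sketch below contrasts how a feed-forward policy and a recurrent policy consume the same observation history. This is a minimal illustration assuming PyTorch; the dimensions and module names are hypothetical and not taken from the experiments above.

```python
# Minimal sketch (not the experiment code): feed-forward vs. recurrent policy inputs.
import torch
import torch.nn as nn

obs_dim, act_dim, history_len = 10, 2, 15  # hypothetical sizes

# Feed-forward policy: the history is flattened into one fixed-size vector.
ff_policy = nn.Sequential(
    nn.Linear(obs_dim * history_len, 64),
    nn.ReLU(),
    nn.Linear(64, act_dim),
)

# Recurrent policy: the history is consumed step by step, so the hidden state
# summarizes the past observations regardless of the history length.
class RecurrentPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(obs_dim, 64, batch_first=True)
        self.head = nn.Linear(64, act_dim)

    def forward(self, history):        # history: (batch, history_len, obs_dim)
        _, h = self.gru(history)       # h: (1, batch, 64), final hidden state
        return self.head(h.squeeze(0)) # action output: (batch, act_dim)

history = torch.randn(4, history_len, obs_dim)
print(ff_policy(history.flatten(1)).shape)  # torch.Size([4, 2])
print(RecurrentPolicy()(history).shape)     # torch.Size([4, 2])
```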
5.3 Problem 3
The policy performs similarly on training and testing goals for grain sizes of 1 and 5. The algorithm overfits when I increase the grain size to 10, presumably because the testing distribution then deviates substantially from the training distribution.
Performance on training goals improves significantly when the grain size is 10, presumably because the smaller number of possible directions among the training goals makes it easier for the algorithm to learn a good policy. However, testing performance is much worse, since the directions of the testing goals differ substantially from those of the training goals.
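As a rough illustration of why a larger grain size leaves fewer distinct goal directions to learn from, the sketch below counts directions on a grid whose spacing is set by the grain size. The grid extent and the sampling scheme are assumptions for illustration only, not the assignment's actual goal-sampling code.

```python
# Sketch: coarser grain size -> fewer goals and fewer distinct directions.
import numpy as np

def goal_grid(grain_size, extent=10):
    """Candidate goals on a square grid with spacing `grain_size` (hypothetical)."""
    coords = np.arange(-extent, extent + 1, grain_size)
    xs, ys = np.meshgrid(coords, coords)
    goals = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    return goals[np.any(goals != 0, axis=1)]  # drop the origin

for grain in (1, 5, 10):
    goals = goal_grid(grain)
    dirs = goals / np.linalg.norm(goals, axis=1, keepdims=True)
    n_directions = len(np.unique(np.round(dirs, 3), axis=0))
    print(f"grain={grain:2d}: {len(goals):4d} goals, {n_directions:3d} distinct directions")
```

Under these assumptions, a grain size of 10 leaves only a handful of distinct directions, which is consistent with the training task becoming easier while generalization to held-out directions suffers.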