This repository has been archived by the owner on Feb 13, 2025. It is now read-only.

[BUG] Inconsistent environment seeding #53

Closed
jlin816 opened this issue May 10, 2022 · 2 comments · Fixed by #68
Labels: bug (Something isn't working)

Comments

jlin816 commented May 10, 2022

🐛 Bug

Seeding doesn't consistently generate the same environment.

To Reproduce

Steps to reproduce the behavior:

  1. Run this snippet repeatedly:
import gym
import minihack  # registers the MiniHack environments with gym
env = gym.make("MiniHack-KeyRoom-Fixed-S5-v0",
    observation_keys=("pixel", "colors", "chars", "glyphs", "tty_chars"),
    seeds=(42, 42, False))
env.seed(42, 42, False)
obs = env.reset()
env.render()
print(env.get_seeds())

Sometimes this prints

Hello Agent, welcome to NetHack!  You are a chaotic male human Rogue.           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                       ----                                     
                                       |..|                                     
                                       +(.|                                     
                                    ----..|                                     
                                    |.....|                                     
                                    |...@.|                                     
                                    -------                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
Agent the Footpad              St:18/02 Dx:18 Co:13 In:8 Wi:9 Ch:7 Chaotic S:0  
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0                                        
(42, 42, False)

But also occasionally prints (note the printed seeds are (0, 0, False)):

Hello Agent, welcome to NetHack!  You are a chaotic male human Rogue.           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                       ----                                     
                                       |@.|                                     
                                       +..|                                     
                                       -..|                                     
                                        ..|                                     
                                        ..|                                     
                                       ----                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
Agent the Footpad              St:14 Dx:18 Co:14 In:11 Wi:11 Ch:8 Chaotic S:0   
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0                                        
(0, 0, False)

Expected behavior

Same positions of agent/key, and same seeds being printed by env.get_seeds()

Environment


MiniHack version: 0.1.3+57ca418
NLE version: 0.8.1
Gym version: 0.21.0
PyTorch version: 1.11.0+cu102
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 20.04.3 LTS
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CMake version: version 3.23.1

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 3080
GPU 1: NVIDIA GeForce RTX 3080

Nvidia driver version: 510.47.03
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.11.0
[conda] torch                     1.11.0                   pypi_0    pypi
jlin816 added the bug (Something isn't working) label May 10, 2022
jlin816 changed the title from "[BUG] Seeding environment" to "[BUG] Inconsistent environment seeding" May 10, 2022
samvelyan self-assigned this May 10, 2022
samvelyan (Contributor) commented

Hi @jlin816

Thanks for pointing this out, and apologies for the late reply.

I see where the issue is. Firstly, MiniHack's seeding is slightly different from NLE's. The seeds argument of a MiniHack environment expects a list of integers, which is used as the training distribution for an agent, e.g. [1, 3, 9, 27]. (Perhaps the documentation should make this clearer.)

minihack/minihack/base.py

Lines 175 to 177 in 2054e7f

seeds (list or None):
    A list of random seeds for sampling episodes. If none, the
    entire level distribution is used. Defaults to None.

Specifically, when reset() is called, MiniHack randomly samples one of the seeds, e.g. 27 (they are now stored as self._level_seeds, since we treat them as levels of the same environment), and sets it using nle.seed(27, 27, False), like this:

minihack/minihack/base.py

Lines 325 to 327 in 2054e7f

if self._level_seeds is not None:
    seed = random.choice(self._level_seeds)
    self.seed(seed, seed, reseed=False)
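
One plausible reading of the original report, sketched here under stated assumptions: because seeds=(42, 42, False) is treated as a collection of level seeds, the sampling step quoted above can pick False as well as 42, and False is numerically 0. That would account for the occasional (0, 0, False). The helper sample_level_seed below is hypothetical and only mimics the quoted lines; it is not MiniHack code, and the int() conversion is an assumption about how the boolean ends up being reported as 0.

```python
import random

# What the reporter passed as `seeds`; MiniHack treats it as a list of level seeds.
level_seeds = (42, 42, False)


def sample_level_seed(level_seeds, rng):
    """Hypothetical helper mimicking the reset-time sampling quoted above."""
    seed = rng.choice(level_seeds)
    # Assumption: a boolean seed is reported back as its integer value.
    return (int(seed), int(seed), False)


rng = random.Random(0)  # fixed RNG so the sketch itself is repeatable
observed = {sample_level_seed(level_seeds, rng) for _ in range(1000)}
print(sorted(observed))
```

Under these assumptions, roughly one reset in three would land on the seed-0 level, which is consistent with the "occasionally" in the report.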

I understand that this caused seed() to be ignored whenever seeds was passed to the environment. Therefore, I added a new parameter to MiniHack's reset() function called sample_seed (defaults to True). If True, reset() randomly samples a level from the original list. If False, it does not, so manually setting the level seed with NLE's seed() function works as desired.
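
The behaviour described above can be illustrated with a toy class. This is a minimal sketch, not the code from the PR: ToyLevelEnv is a hypothetical stand-in that only tracks seeds, and it assumes the sample_seed flag simply gates the sampling step.

```python
import random


class ToyLevelEnv:
    """Hypothetical stand-in that only tracks seeds; not MiniHack code."""

    def __init__(self, seeds=None):
        self._level_seeds = seeds
        self._seeds = None

    def seed(self, core, disp, reseed=False):
        self._seeds = (core, disp, reseed)

    def get_seeds(self):
        return self._seeds

    def reset(self, sample_seed=True):
        # Assumption: sampling happens only when sample_seed is True.
        if sample_seed and self._level_seeds is not None:
            seed = random.choice(self._level_seeds)
            self.seed(seed, seed, reseed=False)
        return self.get_seeds()


env = ToyLevelEnv(seeds=[1, 3, 9, 27])
env.seed(42, 42, False)
kept = env.reset(sample_seed=False)  # the manual seed survives the reset
sampled = env.reset()                # default: a level seed is sampled
print(kept, sampled)
```

With sample_seed=False the manually set seed is left untouched, while the default call overwrites it with one of the level seeds.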

Here is the PR #68. Please let me know if it works for you.

samvelyan (Contributor) commented

With this new PR, here is how one would use the seeding functionality:

import minihack, gym
env = gym.make("MiniHack-KeyRoom-Fixed-S5-v0",
    observation_keys=("pixel", "colors", "chars", "glyphs", "tty_chars"),
    seeds=[1, 3, 9, 27])

For now, let's sample random episodes a few times:

obs = env.reset()
env.render()
print(env.get_seeds())

This outputs

You are lucky!  Full moon tonight.










                                          |
                                    ----..|
                                    |.....|
                                    |@.(..|
                                    -------






Agent the Footpad              St:13 Dx:17 Co:13 In:13 Wi:13 Ch:9 Chaotic S:0
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0

(27, 27, False)

or perhaps

You are lucky!  Full moon tonight.








                                       ----
                                       |..|
                                       +..|
                                    ----..|
                                    |.....|
                                    |.(..@|
                                    -------






Agent the Footpad              St:14 Dx:18 Co:13 In:9 Wi:12 Ch:9 Chaotic S:0
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0

(3, 3, False)

Now, when we manually set the seed and pass sample_seed=False to reset(), we get exactly the level we want:

env.seed(42, 42, False)
obs = env.reset(sample_seed=False)
env.render()
print(env.get_seeds())

This results in the following:

You are lucky!  Full moon tonight.








                                       ----
                                       |..|
                                       +(.|
                                    ----..|
                                    |.....|
                                    |...@.|
                                    -------






Agent the Footpad              St:18/02 Dx:18 Co:13 In:8 Wi:9 Ch:7 Chaotic S:0
Dlvl:1 $:0 HP:12(12) Pw:2(2) AC:7 Xp:1/0

(42, 42, False)
