[BUG] `reward_lose` not correctly set at timeout #3

samvelyan · 2025-02-12T21:33:25Z

This is a migration copy of the original bug report here: facebookresearch#103.

When max_episode_steps is set, gym correctly ends the episode at the step limit and triggers a reset, but MiniHack (and perhaps even NLE) does not set StepStatus.ABORTED.

As a consequence, the reward on the timeout step is not reward_lose.

To Reproduce

import gym
import gym.vector
import minihack

MAX_STEPS = 20

env = gym.make(
    "MiniHack-KeyRoom-S5-v0",
    reward_lose=-1.0,
    max_episode_steps=MAX_STEPS,
)
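timestep = env.reset()
env.render()

# Take exactly MAX_STEPS steps so that the last one hits the time limit.
for i in range(MAX_STEPS):
    timestep = env.step(3)
    reward = timestep[1]
    info = timestep[-1]
    print(reward, info["end_status"])

# Fails: end_status is not -1 (StepStatus.ABORTED) on the timeout step.
assert int(info["end_status"]) == -1

Expected behavior

int(info["end_status"]) should be -1 (StepStatus.ABORTED) on the step that hits the time limit.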

Potential reasons

gym.make itself accepts max_episode_steps as an argument and consumes it.
As a result, max_episode_steps is not included in kwargs and is not passed through to the MiniHack constructor.
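
The suspected mechanism can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: ToyEnv, ToyTimeLimit and toy_make are stand-ins written for this illustration and are not the real gym, NLE or MiniHack classes. It only assumes, as described above, that the factory keeps max_episode_steps for its own time-limit wrapper and that the inner environment pays reward_lose when it aborts on its own step limit.

```python
class StepStatus:
    # Values mirror the ones used in this issue (-1 means ABORTED).
    ABORTED = -1
    RUNNING = 0


class ToyEnv:
    """Hypothetical inner env with its own step limit (default: very large)."""

    def __init__(self, reward_lose=0.0, max_episode_steps=1000):
        self.reward_lose = reward_lose
        self._max_episode_steps = max_episode_steps
        self._steps = 0

    def step(self, action):
        self._steps += 1
        aborted = self._steps >= self._max_episode_steps
        status = StepStatus.ABORTED if aborted else StepStatus.RUNNING
        reward = self.reward_lose if aborted else 0.0
        return None, reward, aborted, {"end_status": status}


class ToyTimeLimit:
    """Hypothetical outer wrapper: ends the episode, but knows nothing of rewards."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self._max_episode_steps = max_episode_steps
        self._elapsed = 0

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self._elapsed >= self._max_episode_steps:
            done = True  # episode ends, reward and end_status are untouched
        return obs, reward, done, info


def toy_make(max_episode_steps=None, **kwargs):
    # max_episode_steps is a named parameter, so it never appears in kwargs
    # and the inner env keeps its default limit.
    env = ToyEnv(**kwargs)
    if max_episode_steps is not None:
        env = ToyTimeLimit(env, max_episode_steps)
    return env


env = toy_make(reward_lose=-1.0, max_episode_steps=20)
for _ in range(20):
    obs, reward, done, info = env.step(3)

print(done, reward, info["end_status"])  # prints: True 0.0 0
```

In this toy model the outer wrapper ends the episode after 20 steps, while the inner environment still believes it has 1000 steps left, so it reports RUNNING and never pays reward_lose. Whether the real code path matches this exactly is an assumption based on the description above.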

samvelyan · 2025-02-12T21:34:24Z

Copying a relevant comment from the same Issue.

You can temporarily patch it with this wrapper:

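class PatchTimeoutWrapper(gym.Wrapper):
    def __init__(self, env: gym.Env):
        super().__init__(env)
        # Copy the step limit held by gym's time-limit wrapper onto the
        # underlying env, which never received max_episode_steps.
        self.unwrapped._max_episode_steps = self._max_episode_steps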

And use:

env = gym.make(
    "MiniHack-KeyRoom-S5-v0",
    reward_lose=-1.0,
    max_episode_steps=MAX_STEPS,
)
env = PatchTimeoutWrapper(env)
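
To check whether the patch has the intended effect, a small helper along these lines may be useful. It is only a sketch: run_until_done is a hypothetical name, not part of gym or MiniHack, and it assumes the env returns either the 4-tuple or the 5-tuple step result with an "end_status" entry in info, as in the reproduction script above.

```python
def run_until_done(env, max_steps, action=3):
    """Step env with a fixed action until it reports done.

    Returns (reward, end_status, steps_taken) for the final step.
    Hypothetical helper; assumes info contains "end_status".
    """
    env.reset()
    reward, info, steps = None, {}, 0
    for steps in range(1, max_steps + 1):
        result = env.step(action)
        reward, info = result[1], result[-1]
        # One done flag in the 4-tuple API, two (terminated, truncated)
        # in the 5-tuple API; either way they sit between reward and info.
        if any(result[2:-1]):
            break
    return reward, int(info["end_status"]), steps
```

With the wrapper applied, run_until_done(env, MAX_STEPS) would be expected to return reward_lose and an end_status of -1, provided the episode does not end earlier for another reason.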

samvelyan added the bug label Feb 12, 2025

mahnerak mentioned this issue Feb 12, 2025

[BUG] reward_lose not correctly set at timeout facebookresearch/minihack#103

Closed
