Commit

Merge branch 'rt1'
Shreyas-S-Raman committed Jun 20, 2024
2 parents 78c0f63 + a6e92a2 commit c3c4f99
Showing 510 changed files with 61,638 additions and 0 deletions.
139 changes: 139 additions & 0 deletions .gitignore
@@ -12,3 +12,142 @@ __pycache__/
!/models/main_models/alfred/data/vis_feats/.gitkeep
/dataset/*
!/dataset/.gitkeep

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Extras
models/main_models/checkpoints/*
models/main_models/datasets/*
models/main_models/wandb/
models/main_models/rt1_dataset/*
models/main_models/traj_rollouts/*
models/main_models/run_rt1_main.sh
models/main_models/rt1_ft_env
21 changes: 21 additions & 0 deletions models/main_models/rt1/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2022 Phil Wang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
2,303 changes: 2,303 additions & 0 deletions models/main_models/rt1/Open_X_Embodiment_Datasets.ipynb

66 changes: 66 additions & 0 deletions models/main_models/rt1/README.md
@@ -0,0 +1,66 @@
# NPM-Dataset
A comprehensive robotics dataset that includes navigation, perception, and manipulation data for every data point.

# RT-1 (Robotic Transformer) PyTorch Implementation
<img src="./figures/rt1.png" width="450px"/>

A forked implementation of <a href="https://github.com/Rohan138/rt1-pytorch.git">RT-1 (Robotic Transformer)</a>, originally inspired by the <a href="https://ai.googleblog.com/2022/12/rt-1-robotics-transformer-for-real.html">Google Research</a> paper.

This implementation of RT-1 was pretrained on the <a href="https://sites.google.com/view/bridgedata">Bridge</a> dataset and further fine-tuned on our LaNMP dataset for evaluation. Details of the repository are given below.

## Setup Instructions

```bash
git clone https://github.com/h2r/NPM-Dataset.git
cd NPM-Dataset
git checkout rt1
pip install -e .
```

## Overview of files

This repository has seven critical files/folders whose use cases are described below:

1) ```main.py```: used to pretrain RT-1 on the Bridge dataset. Modifying this file to accommodate different datasets requires changing the ```observation_space``` and ```action_space``` according to the dataset being loaded, as well as changing the dataset keys in ```rt1_pytorch/tokenizers/action_tokenizer.py```. Running this file saves a series of checkpoints and logs losses using Weights & Biases.
2) ```main_ft.py```: used to finetune RT-1 on the LaNMP dataset. This file already has the ```observation_space```, ```action_space```, and PyTorch ```DataLoader``` modified to accommodate finetuning on the LaNMP dataset (AI2Thor). Running this file saves a series of checkpoints and logs losses using Weights & Biases.
3) ```main_ft_eval.py```: used to run RT-1 in inference mode on the LaNMP dataset. This file already has the ```observation_space```, ```action_space```, and PyTorch ```DataLoader``` modified to accommodate the LaNMP dataset (AI2Thor). The file iterates over all checkpoints saved during finetuning, runs RT-1 in inference mode on the validation dataset for each checkpoint, and logs the test losses using Weights & Biases.
4) ```ai2thor_env.py```: contains a Gym-style environment class used to load and take steps in the AI2Thor environment. This file is used to generate real-time trajectories based on the action tokens generated by a finetuned RT-1 model (specific to AI2Thor). The main ```step()``` function executes the action generated by RT-1 and returns a success message along with information about the environment state (e.g. object or agent metadata), which can be saved to capture the trajectory taken by the agent for a given task.
5) ```rollout_ai2thor.py```: interfaces between the finetuned RT-1 model (loaded from a checkpoint after finetuning on LaNMP) and the ```ai2thor_env.py``` Gym environment, sending observations from the AI2Thor environment to RT-1 and executing the action tokens proposed by RT-1 in AI2Thor. Note that this file should not be run on a headless machine, since it requires the AI2Thor simulator GUI.
6) ```rt1_pytorch/rt1_policy.py```: contains the RT-1 model implementation in PyTorch. The ```loss()``` function performs the forward pass of RT-1 for training, and the ```act()``` function performs the forward pass during inference.
7) ```lanmp_dataloader/rt1_dataloader.py```: contains the ```DatasetManager``` class that extracts trajectories from the LaNMP ```sim_data.hdf5``` dataset file. The script automatically separates train and validation subsets according to different splits, e.g. k-fold by scene, task-wise, or for the diversity ablation. The ```DatasetManager``` also handles tokenizing/detokenizing the raw trajectory data into 256 discrete buckets, while chunking trajectories into non-overlapping windows of 6 steps (see the sketch after this list).
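
As a rough illustration of the bucketing and windowing described in item 7, here is a minimal sketch assuming uniform binning over a known value range; the function names and the binning scheme are illustrative assumptions, not the repository's exact code:

```python
import numpy as np

NUM_BINS = 256       # discrete buckets used when tokenizing actions
WINDOW_LENGTH = 6    # non-overlapping chunk size for trajectories

def tokenize(value: float, low: float, high: float) -> int:
    """Map a continuous action value in [low, high] to one of NUM_BINS buckets."""
    value = float(np.clip(value, low, high))
    return int(round((value - low) / (high - low) * (NUM_BINS - 1)))

def detokenize(token: int, low: float, high: float) -> float:
    """Invert tokenize: map a bucket index back to a continuous value."""
    return low + token / (NUM_BINS - 1) * (high - low)

def chunk_trajectory(steps: list) -> list:
    """Split a trajectory into non-overlapping windows of WINDOW_LENGTH steps;
    the final window may be shorter than WINDOW_LENGTH."""
    return [steps[i:i + WINDOW_LENGTH] for i in range(0, len(steps), WINDOW_LENGTH)]
```

For example, ```tokenize(0.0, -1.0, 1.0)``` lands in bucket 128, and ```detokenize(128, -1.0, 1.0)``` returns roughly 0.004, i.e. the original value is recovered to within one bin width.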

## Details about file arguments

Most relevant files in this repository accept the same set of arguments, which are detailed below (an ```argparse``` sketch follows the list):
* ```dataset```: only for the ```main.py``` file, specifies the dataset on which the RT-1 model should be pretrained
* ```train-split```: specifies what fraction of the loaded dataset should be used for training vs. evaluation
* ```eval-split```: specifies what fraction of the loaded dataset should be used for evaluation vs. training
* ```epochs```: total number of passes over all batches of the training set
* ```lr```: learning rate for the cross-entropy loss of RT-1
* ```train-batch-size```: the number of trajectories from which to sample data for the current training batch
* ```eval-batch-size```: the number of trajectories from which to sample data for the current evaluation batch
* ```trajectory-length```: the window size (context history of ```trajectory-length``` previous images) used for each trajectory when feeding data to the RT-1 model; this is set to 6 following the RT-1 implementation
* ```sentence-transformer```: the language embedding to apply on the language-specified task
* ```device```: the device to load the model/data onto during training/inference
* ```eval-freq```: the interval of batches at which to run evaluation/inference on the validation dataset (currently set to 0 in ```main_ft.py```)
* ```checkpoint-freq```: the interval of batches at which to save a checkpoint during training
* ```checkpoint-dir```: the directory path at which to save a checkpoint during training
* ```load-checkpoint```: (optional) path of the pretrained checkpoint to load for further fine-tuning
* ```wandb```: boolean determining whether to log losses to Weights & Biases
* ```eval-scene```: the AI2Thor scene number in the dataset that is held out of the training set for evaluation during k-fold cross validation across scenes
* ```split-type```: determines the split type (i.e. k-fold by scene, task-wise, or diversity ablation) between train and evaluation used by the ```DatasetManager``` in ```rt1_dataloader.py```
* ```num-diversity-scenes```: only if ```split-type``` is ```diversity-ablation```; determines the total number of scenes over which to perform the diversity ablation (a maximum of 4 for the LaNMP simulation data)
* ```max-diversity-trajectories```: only if ```split-type``` is ```diversity-ablation```; determines the total number of trajectories that are divided evenly across the ```num-diversity-scenes``` scenes
* ```train-subbatch```: the batch size to use during training/finetuning
* ```eval-subbatch```: the batch size to use during evaluation
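
To make the flag types concrete, here is a minimal ```argparse``` sketch covering a subset of the flags above; the defaults shown are illustrative assumptions, not the repository's actual values:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Illustrative subset of the flags documented above; defaults are assumptions.
    parser = argparse.ArgumentParser(description="Finetune RT-1 on LaNMP")
    parser.add_argument("--train-split", type=float, default=0.9,
                        help="fraction of the dataset used for training")
    parser.add_argument("--eval-split", type=float, default=0.1,
                        help="fraction of the dataset used for evaluation")
    parser.add_argument("--epochs", type=int, default=1,
                        help="passes over all batches of the training set")
    parser.add_argument("--lr", type=float, default=1e-4,
                        help="learning rate for RT-1's cross-entropy loss")
    parser.add_argument("--trajectory-length", type=int, default=6,
                        help="context window of previous frames fed to RT-1")
    parser.add_argument("--checkpoint-freq", type=int, default=100,
                        help="batch interval at which to save checkpoints")
    parser.add_argument("--checkpoint-dir", type=str, default="checkpoints/",
                        help="directory in which checkpoints are saved")
    parser.add_argument("--load-checkpoint", type=str, default=None,
                        help="optional pretrained checkpoint to resume from")
    parser.add_argument("--wandb", action="store_true",
                        help="log losses to Weights & Biases")
    return parser

if __name__ == "__main__":
    print(build_parser().parse_args())
```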

## Checkpoint samples

Please find the following checkpoint samples that can be loaded into the RT-1 model; they are available on the supplementary Google Drive associated with this project (a loading sketch follows the list).
* ```sample_checkpoints/pretrained_bridge```: the final checkpoint saved when pretraining the RT-1 model on the Bridge dataset
* ```sample_checkpoints/task_gen```: the final checkpoint saved after finetuning the RT-1 model on the task-wise split for the task generalization experiment
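
A minimal sketch of restoring one of these checkpoints with PyTorch follows; the file name is a placeholder, and the assumption that each checkpoint stores a plain ```state_dict``` for the policy should be checked against ```main_ft_eval.py```:

```python
import torch

# Placeholder path: the actual checkpoint files live on the project's
# supplementary Google Drive, not in the repository itself.
state = torch.load("sample_checkpoints/task_gen/checkpoint.pt", map_location="cpu")

# Assumed usage: `policy` is the RT-1 policy from rt1_pytorch/rt1_policy.py,
# constructed with the same configuration used during finetuning.
# policy.load_state_dict(state)
```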

## Additional notes

When running any of the finetuning or pretraining scripts, please ensure the following modules are loaded:
```module load cuda/11.8.0-lpttyok```
```module load cudnn/8.7.0.84-11.8-lg2dpd5```