Transformers are often the go-to architecture to build foundation models that ingest a large amount of training data. But these models do not estimate the probability density distribution when trained on regression problems, yet obtaining full probabilistic outputs is crucial to many fields of science, where the probability distribution of the answer can be non-Gaussian and multimodal. In this work, we demonstrate that training a probabilistic model using a denoising diffusion head on top of the Transformer provides reasonable probability density estimation even for high-dimensional inputs. The combined Transformer+Denoising Diffusion model allows conditioning the output probability density on arbitrary combinations of inputs and it is thus a highly flexible density function emulator of all possible input/output combinations. We illustrate our Transformer+Denoising Diffusion model by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy and we apply it to a variety of inference tasks to show that the model can infer labels accurately with reasonable distributions.
This repository ensures that all figures and results in this paper can be easily reproduced by anyone 🤗.
If GitHub has issues loading the Jupyter Notebooks (or is too slow), you can view them at http://nbviewer.jupyter.org/github/henrysky/stars_foundation_diffusion/tree/main/
Python dependencies are listed in requirements.txt.
⚠️ You have to set magicnumber = nan in the astroNN configuration file for the data reduction code to work properly.
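As a minimal sketch, the setting could be applied with Python's standard configparser module. The section name Basics is an assumption about the layout of the astroNN configuration file, and set_magic_number is a hypothetical helper, not part of astroNN; check your own installation before relying on either.

```python
import configparser
from pathlib import Path


def set_magic_number(config_path, value="nan", section="Basics"):
    """Set `magicnumber = value` in an INI-style config file and return the stored value.

    The default section name "Basics" is an assumption about astroNN's config layout.
    """
    config_path = Path(config_path)
    parser = configparser.ConfigParser()
    parser.read(config_path)  # yields an empty config if the file does not exist

    if not parser.has_section(section):
        parser.add_section(section)
    # configparser lowercases option names by default, so "MagicNumber"
    # and "magicnumber" refer to the same key.
    parser.set(section, "magicnumber", value)

    with config_path.open("w") as handle:
        parser.write(handle)
    return parser.get(section, "magicnumber")
```

Calling set_magic_number(Path.home() / ".astroNN" / "config.ini") would apply the change, assuming the configuration file lives at that path. Rewriting the file this way drops any comments and lowercases option names, so editing the file by hand is just as valid.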
Datasets are available on Zenodo and should be placed in a folder named data_files under the root directory of this repository.
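Before running the notebooks, it can help to confirm that the data folder is where the code expects it. The sketch below only checks for the data_files folder and lists what it contains; list_data_files is a hypothetical helper, and no specific file names are assumed.

```python
from pathlib import Path


def list_data_files(repo_root="."):
    """Return the sorted names of the files found in <repo_root>/data_files.

    Raises FileNotFoundError if the folder does not exist.
    """
    data_dir = Path(repo_root) / "data_files"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Expected a folder at {data_dir}")
    return sorted(p.name for p in data_dir.iterdir() if p.is_file())
```

Running list_data_files() from the root directory of this repository should print the names of the files downloaded from Zenodo.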
If you are planning to use the Docker image, the data files are already downloaded and placed in the correct folder in the container.
If you have Docker installed, you can use the Dockerfile to build a Docker image on top of the PyTorch container from the NVIDIA NGC Catalog, with all dependencies installed and data files downloaded.
To build the Docker image called stars_foundation_diffusion, run the following command in the root directory of this repository:
docker build -t stars_foundation_diffusion .
To run a Docker container named testing123 with all GPUs available to it, run the following command:
docker run --gpus all --name testing123 -it -e SHELL=/bin/bash --entrypoint bash stars_foundation_diffusion
Then you can attach to the container by running:
docker exec -it testing123 bash
Now you can run all notebooks or the training script inside the container.
- Dataset_Reduction.ipynb (External Repository): The notebook contains code to generate the dataset used by this paper. Terabytes of (mostly Gaia) data need to be downloaded in the process of constructing the datasets.
- The notebook contains code to train a simple denoising diffusion model
- The notebook contains code to train a simple conditional denoising diffusion model
- The notebook contains code to do inference
- The notebook contains code to train a model on the California housing dataset for demonstration purposes.
If you use this training script to train your own model, please note that details of your system will be saved automatically in the model folder as training_system_info.txt, so that developers can debug should anything go wrong.
Delete the file before sharing your model with others if you are concerned about privacy.
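The clean-up step can be sketched as a small helper. remove_system_info is a hypothetical name introduced here for illustration; the only detail taken from this repository is the file name training_system_info.txt.

```python
from pathlib import Path


def remove_system_info(model_folder, filename="training_system_info.txt"):
    """Delete the system-info file from a model folder.

    Returns True if a file was removed and False if there was nothing to delete.
    """
    info_file = Path(model_folder) / filename
    if info_file.is_file():
        info_file.unlink()
        return True
    return False
```

For example, remove_system_info("path/to/your_model") would strip the file from a model folder at that placeholder path before you share it.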
- Python script to train the model.
To train the model with mixed precision and torch.compile(), run the following command in the root directory of this repository:
python training.py --mixed_precision --compile_model
To see all available arguments, run:
python training.py --help
- model_torch is a trained PyTorch model. The model has ~3.7 million parameters and is the one used for the paper.
- trained_california_model is a trained PyTorch model. The model has 20640 parameters and was trained on the California housing dataset for demonstration purposes.
All these graphics can be opened and edited with draw.io.
- Source for Figure 1 in the paper,
- Henry Leung - henrysky
  Department of Astronomy and Astrophysics, University of Toronto
  Contact Henry: henrysky.leung [at] utoronto.ca
- Jo Bovy - jobovy
  Department of Astronomy and Astrophysics, University of Toronto
- Josh Speagle - joshspeagle
  Department of Astronomy and Astrophysics, University of Toronto
This project is licensed under the MIT License - see the LICENSE file for details