-
Notifications
You must be signed in to change notification settings - Fork 27
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
🔀 Merge pull request #63 from RasmussenLab/developer
♻️ 🎨 📝 Developer
- Loading branch information
Showing
30 changed files
with
1,019 additions
and
146 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
version: 2 | ||
|
||
build: | ||
os: ubuntu-20.04 | ||
tools: | ||
python: "3.9" | ||
|
||
sphinx: | ||
configuration: docs/source/conf.py | ||
|
||
python: | ||
install: | ||
- requirements: docs/requirements.txt | ||
- requirements: requirements.txt | ||
- method: pip | ||
path: . |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,4 @@ | ||
sphinx==5.3.0 | ||
sphinx_rtd_theme=1.1.1 | ||
sphinx-rtd-theme | ||
sphinx-autodoc-typehints | ||
sphinxemoji |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,15 +1,22 @@ | ||
.. move-dl documentation master file, created by | ||
sphinx-quickstart on Sat Nov 5 15:48:56 2022. | ||
You can adapt this file completely to your liking, but it should at least | ||
contain the root `toctree` directive. | ||
Welcome to move-dl's documentation! | ||
=================================== | ||
Welcome to MOVE's documentation! | ||
================================ | ||
|
||
.. toctree:: | ||
:hidden: | ||
:maxdepth: 1 | ||
:caption: Contents: | ||
|
||
pages/installation | ||
pages/tutorial | ||
pages/api/API | ||
install | ||
method | ||
tutorial/index | ||
|
||
MOVE (**m**\ ulti-\ **o**\ mics **v**\ ariational auto\ **e**\ ncoder) is a | ||
framework for integration of omics and other data modalities (including both | ||
categorical and continuous data). Our approach consists of training an ensemble | ||
of VAE (variational autoencoder) models and performing *in silico* perturbation | ||
experiments to identify associations across the different omics datasets. | ||
|
||
We invite you to read `our publication`_ presenting this method, or read | ||
about the method :doc:`here</method>`. | ||
|
||
.. _`our publication`: https://www.nature.com/articles/s41587-022-01520-x |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
Install | ||
======= | ||
|
||
MOVE is distributed as ``move-dl``, a Python package. | ||
|
||
It requires Python 3.9 (or later) and third-party libraries, such as `PyTorch`_ | ||
and `Hydra`_. These dependencies will be installed automatically when you | ||
install with ``pip``. | ||
|
||
Install the stable version | ||
-------------------------- | ||
|
||
We recommend installing ``move-dl`` in a fresh virtual environment. If you wish | ||
to learn how to create and manage virtual environments with Conda, please | ||
follow `these instructions`_. | ||
|
||
The latest stable version of ``move-dl`` can be installed with ``pip``. | ||
|
||
.. code-block:: bash | ||
>>> pip install move-dl | ||
Install the development version | ||
------------------------------- | ||
|
||
If you wish to install the development of ``move-dl``, create a new virtual | ||
environment, and do: | ||
|
||
.. code-block:: bash | ||
>>> pip install git+https://github.com/RasmussenLab/MOVE@developer | ||
Alternatively, you can clone ``move-dl`` from `GitHub`_ and install by | ||
running the following command from the top-level source directory: | ||
|
||
.. code-block:: bash | ||
>>> pip install -e . | ||
The ``-e`` flag installs the project in "editable" mode, so you can follow the | ||
development branch and update your installation by pulling from GitHub. | ||
|
||
.. _PyTorch: https://pytorch.org/ | ||
.. _Hydra: https://hydra.cc/ | ||
.. _GitHub: https://github.com/RasmussenLab/MOVE | ||
|
||
.. _these instructions: https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
About the method | ||
================ | ||
|
||
MOVE is based on the VAE (variational autoencoder) model, a deep learning model | ||
that transforms high-dimensional data into a lower-dimensional space (so-called | ||
latent representation). The autoencoder is made up of two neural networks: an | ||
encoder, which compresses the input variables; and a decoder, which tries to | ||
reconstruct the original input from the compressed representation. In doing so, | ||
the model learns the structure and associations between the input variables. | ||
|
||
In `our publication`_, we used this type of model to integrate different data | ||
modalities, including: genomics, transcriptomics, proteomics, metabolomics, | ||
microbiomes, medication data, diet questionnaires, and clinical measurements. | ||
Once we obtained a trained model, we exploited the decoder network to identify | ||
cross-omics associations. | ||
|
||
Our approach consists of performing *in silico* perturbations of the original | ||
data and using either univariate statistical methods or Bayesian decision | ||
theory to identify significant differences between the reconstruction with or | ||
without perturbation. Thus, we are able to detect associations between the | ||
input variables. | ||
|
||
.. _`our publication`: https://www.nature.com/articles/s41587-022-01520-x | ||
|
||
.. image:: method/fig1.svg | ||
|
||
VAE design | ||
----------- | ||
|
||
The VAE was designed to account for a variable number of fully-connected hidden | ||
layers in both encoder and decoder. Each hidden layer is followed by batch | ||
normalization, dropout, and a leaky rectified linear unit (leaky ReLU). | ||
|
||
To integrate different modalities, each dataset is reshaped and concatenated | ||
into an input matrix. Moreover, error calculation is done on a dataset | ||
basis: binary cross-entropy for binary and categorical datasets and mean squared | ||
error for continuous datasets. Each error :math:`E_i` is then multiplied by a | ||
given weight :math:`W_i` and added up to form the loss function: | ||
|
||
:math:`L = \sum_i W_i E_i + W_\textnormal{KL} D_\textnormal{KL}` | ||
|
||
Note that the :math:`D_\textnormal{KL}` (Kullback–Leibler divergence) penalizes | ||
deviance of the latent representation from the standard normal distribution. It | ||
is also subject to a weight :math:`W_\textnormal{KL}`, which warms up as the | ||
model is trained. | ||
|
||
Extracting associations | ||
----------------------- | ||
|
||
After determining the right set of hyperparameters, associations are extracted | ||
by perturbing the original input data and passing it through an ensemble of | ||
trained models. The reason behind using an ensemble is that VAE models are | ||
stochastic, so we need to ensure that the results we obtain are not a product | ||
of chance. | ||
|
||
We perturbed categorical data by changing its value from one category to | ||
another (e.g., drug status changed from "not received" to "received"). Then, we | ||
compare the change between the reconstruction generated from the original data | ||
and the perturbed data. To achieve this, we proposed two approaches: using | ||
*t*\ -test and Bayes factors. Both are described below: | ||
|
||
MOVE *t*\ -test | ||
^^^^^^^^^^^^^^^ | ||
|
||
#. Perturb a variable in one dataset. | ||
#. Repeat 10 times for 4 different latent space sizes: | ||
|
||
#. Train VAE model with original data. | ||
#. Obtain reconstruction of original data (baseline reconstruction). | ||
#. Obtain 10 additional reconstructions of original data and calculate | ||
difference from the first (baseline difference). | ||
#. Obtain reconstruction of perturbed data (perturbed reconstruction) and | ||
subtract from baseline reconstruction (perturbed difference). | ||
#. Compute p-value between baseline and perturbed differences with t-test. | ||
|
||
#. Correct p-values using Bonferroni method. | ||
#. Select features that are significant (p-value lower than 0.05). | ||
#. Select significant features that overlap in at least half of the refits and | ||
3 out of 4 architectures. These features are associated with the | ||
perturbed variable. | ||
|
||
MOVE Bayes | ||
^^^^^^^^^^ | ||
|
||
#. Perturb a variable in one dataset. | ||
#. Repeat 30 times: | ||
|
||
#. Train VAE model with original data. | ||
#. Obtain reconstruction of original data (baseline reconstruction). | ||
#. Obtain reconstruction of perturbed data (perturbed reconstruction). | ||
#. Record difference between baseline and perturbed reconstruction. | ||
|
||
#. Compute probability of difference being greater than 0. | ||
#. Compute Bayes factor from probability: :math:`K = \log p - \log (1 - p)`. | ||
#. Sort probabilities by Bayes factor, from highest to lowest. | ||
#. Compute false discovery rate (FDR) as cumulative evidence. | ||
#. Select features whose FDR is above desired threshold (e.g., 0.05). These | ||
features are associated with the perturbed variable. |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
Oops, something went wrong.