
EPFL Machine Learning (CS411) Project 2: Machine Learning Forcefields (ML-FFs) from Spatial Equivariant Descriptors

Team members

Method

Figure: method (fig/method.png)

The training set for our ML-FF is generated from the results of ab initio methods or state-of-the-art DFT calculations. For a given system configuration, we first build its atom density function, from which the potential function V(r) is computed. Projecting the potential onto basis functions constructed from radial basis functions R(r) and spherical harmonics yields a series of coefficients, and these coefficients are combined into the invariants I. The model is trained on the invariant descriptors, with energies as labels. To predict the energy of a new system, we convert it into invariant descriptors and feed them to the model. The negative derivative of the predicted energy with respect to the atomic coordinates gives the forces on the atoms, which can be used to update the coordinates and run simulations.
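The last two steps (energy from invariant descriptors, forces from the energy gradient) can be illustrated with a minimal pure-Python sketch. It is not the project's pipeline: the descriptor here is a toy sum of Gaussian radial basis functions over pair distances, standing in for the rascaline-based invariants, and the function names, the three-atom configuration, and the weights are all hypothetical. Forces are taken as the negative gradient of the energy, approximated by central finite differences.

```python
import math

def radial_invariants(positions, centers=(3.0, 4.0, 5.0), width=0.5):
    """Toy rotation- and translation-invariant descriptor (an assumption,
    not the project's descriptor): for each Gaussian centre c, sum
    exp(-(r_ij - c)^2 / (2 * width^2)) over all atom pairs (i, j)."""
    features = [0.0] * len(centers)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            for k, c in enumerate(centers):
                features[k] += math.exp(-((r - c) ** 2) / (2.0 * width ** 2))
    return features

def predict_energy(positions, weights):
    """Linear model on the invariants (stand-in for a trained regressor)."""
    return sum(w * f for w, f in zip(weights, radial_invariants(positions)))

def forces(positions, weights, h=1e-5):
    """Forces as the negative energy gradient, via central differences."""
    result = []
    for a in range(len(positions)):
        f_atom = []
        for d in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[a][d] += h
            minus[a][d] -= h
            grad = (predict_energy(plus, weights)
                    - predict_energy(minus, weights)) / (2.0 * h)
            f_atom.append(-grad)
        result.append(f_atom)
    return result

# Hypothetical three-atom configuration and made-up model weights.
xe3 = [(0.0, 0.0, 0.0), (4.2, 0.0, 0.0), (0.0, 4.5, 0.0)]
weights = [0.3, -1.0, 0.2]
energy = predict_energy(xe3, weights)
atom_forces = forces(xe3, weights)
```

Because the descriptor depends only on interatomic distances, the predicted energy is unchanged when the whole system is rotated or translated, and the forces on all atoms sum to zero. In a simulation loop, `atom_forces` would feed an integrator that updates the coordinates.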

Quickstart

Requirements

  • Python 3.10
  • Rust (rascaline requires Rust to build)
git clone git@github.com:CS-433/ml-project-2-cross-entropy.git
cd ml-project-2-cross-entropy
pip install --upgrade pip
pip install -r requirements.txt
pip install --extra-index-url https://luthaf.fr/temporary-wheels/ metatensor
pip install git+https://github.com/lab-cosmo/equisolve
pip install git+https://github.com/Luthaf/rascaline

Datasets

You can access the dataset here. Download all .xyz and .npz files and put them into the ./dataset folder.

Because the datasets are small, all of them are also included in the repository.
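For readers who want to inspect the .xyz files by hand, here is a minimal sketch of a plain-XYZ reader. It assumes the basic layout (an atom-count line, a comment line, then one `symbol x y z` line per atom) and reads only the first four fields of each atom line, so extra columns are ignored. The function name `parse_xyz` and the sample frame are hypothetical; the project's own loading code lives in data/dataset.py, and the contents of the .npz files are not covered here.

```python
def parse_xyz(text):
    """Parse plain XYZ text into a list of (comment, symbols, positions).

    Assumes each frame is: atom count, comment line, then one
    'symbol x y z' line per atom.
    """
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between frames
            continue
        n_atoms = int(lines[i].strip())
        comment = lines[i + 1].strip()
        symbols, positions = [], []
        for line in lines[i + 2 : i + 2 + n_atoms]:
            parts = line.split()
            symbols.append(parts[0])
            positions.append(tuple(float(v) for v in parts[1:4]))
        frames.append((comment, symbols, positions))
        i += 2 + n_atoms
    return frames

# Hypothetical single-frame sample, not taken from the dataset.
sample = """2
hypothetical Xe dimer frame
Xe 0.0 0.0 0.0
Xe 4.4 0.0 0.0
"""
frames = parse_xyz(sample)
```

To read a real file, pass its contents to the function, for example `parse_xyz(open("dataset/xe2_50.xyz").read())`.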

Run

python run.py

Results

Figure: results (fig/results.png)

Project Organization

.
├── README.md
├── config.py
├── configs
│   └── ridge.txt
├── data
│   ├── __init__.py
│   ├── dataloader.py
│   ├── dataset.py
│   └── feature
│       ├── __init__.py
│       ├── coordinate.py
│       ├── descriptor.py
│       └── feature_base.py
├── dataset
│   ├── xe2_50.xyz
│   ├── xe2_50_x.npz
│   ├── xe2_50_y.npz
│   ├── xe3_50.xyz
│   ├── xe3_50_x.npz
│   ├── xe3_50_y.npz
│   ├── xe3_dataset_dft.xyz
│   ├── xe3_dataset_dft_x.npz
│   └── xe3_dataset_dft_y.npz
├── fig
│   ├── method.png
│   └── results.png
├── methods
│   ├── __init__.py
│   ├── base_method.py
│   ├── bayesian_method.py
│   ├── coord_based_1.ipynb
│   ├── coord_based_2.ipynb
│   ├── decision_tree_method.py
│   ├── elasticnet_method.py
│   ├── knn_method.py
│   ├── lasso_lars_method.py
│   ├── lasso_method.py
│   ├── mlp_method.py
│   ├── pca_method.py
│   ├── preprocessing
│   │   ├── __init__.py
│   │   ├── base_lining.py
│   │   ├── base_method.py
│   │   ├── identity.py
│   │   ├── methods_list.py
│   │   ├── normalization.py
│   │   ├── pca.py
│   │   ├── shift.py
│   │   └── standardization.py
│   ├── random_forest_method.py
│   └── ridge_method.py
├── radial_basis.py
├── requirements.txt
├── run.ipynb
├── run.py
└── utils
    ├── __init__.py
    ├── energy_util.py
    └── visualize.py

Acknowledgements

The authors thank Philip Robin Loche and Kevin Kazuki Huguenin-Dumittan for their guidance and useful discussions.