Framework for AE reconstruction and feature-based anomaly detection
This repository implements the reconstruction-based anomaly detection method presented in the article Toward phytoplankton parasite detection using autoencoders. The proposed technique uses an autoencoder trained on non-anomalous (OK) data, which is then used to reconstruct a dataset containing both OK and anomalous (NOK) data. Various features are extracted by comparing the original and reconstructed data, and these features are classified under the assumption that the difference will be more significant for the NOK data. The scheme is shown in the figure below:
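As a rough illustration of the feature-extraction step described above, the sketch below compares a batch of original images with their reconstructions and computes a few per-image error statistics. The function name `reconstruction_features` and the chosen statistics (mean squared error, mean absolute error, maximum absolute error) are assumptions made for this example; the features actually used by the framework are defined in the article and in the repository code.

```python
import numpy as np


def reconstruction_features(original, reconstructed):
    """Return one feature vector per image from the reconstruction error.

    Both inputs are float arrays of shape (n_images, height, width, channels).
    The three statistics below are illustrative, not the framework's own set.
    """
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    if original.shape != reconstructed.shape:
        raise ValueError("original and reconstructed must have the same shape")

    # Flatten every image so the statistics are computed per sample.
    diff = (original - reconstructed).reshape(original.shape[0], -1)

    mse = np.mean(diff ** 2, axis=1)        # mean squared error
    mae = np.mean(np.abs(diff), axis=1)     # mean absolute error
    max_err = np.max(np.abs(diff), axis=1)  # largest single-pixel error
    return np.stack([mse, mae, max_err], axis=1)


# Toy data: the first image is reconstructed almost perfectly, the second is not.
rng = np.random.default_rng(0)
images = rng.random((2, 8, 8, 1))
reconstructions = images.copy()
reconstructions[0] += 0.01  # small error, as expected for OK data
reconstructions[1] += 0.30  # large error, as expected for NOK data

features = reconstruction_features(images, reconstructions)
print(features.shape)  # (2, 3)
```

Under the assumption stated in the text, the second row (the poorly reconstructed sample) has larger values in every column, which is what a downstream one-class classifier would pick up on.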
One-class classification is performed using the standard scikit-learn library, based on this example. The anomaly detection threshold for each technique is derived from the equal error rate on the ROC curve, as shown in the figure below:
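The equal error rate (EER) is the operating point on the ROC curve where the false positive rate equals the false negative rate (1 minus the true positive rate). The following is a minimal sketch of how such a threshold could be picked from anomaly scores. The function name `eer_threshold`, the toy scores, and the convention that a higher score means "more anomalous" are assumptions for this example, and the framework itself may obtain the ROC points differently (for instance with scikit-learn).

```python
import numpy as np


def eer_threshold(scores, labels):
    """Return (threshold, eer) where the false positive and false negative rates are closest.

    scores: anomaly scores, higher means more anomalous (assumed convention).
    labels: 1 for anomalous (NOK) samples, 0 for normal (OK) samples.
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels).astype(bool)
    if labels.all() or not labels.any():
        raise ValueError("both OK and NOK samples are required")

    best_gap, best_threshold, best_eer = np.inf, None, None
    # Every observed score is a candidate threshold;
    # a sample is flagged as NOK if its score is >= the threshold.
    for threshold in np.unique(scores):
        predicted_nok = scores >= threshold
        fpr = np.mean(predicted_nok[~labels])  # OK samples flagged as NOK
        fnr = np.mean(~predicted_nok[labels])  # NOK samples that were missed
        gap = abs(fpr - fnr)
        if gap < best_gap:
            best_gap, best_threshold, best_eer = gap, threshold, (fpr + fnr) / 2
    return float(best_threshold), float(best_eer)


# Toy example: one OK sample (0.6) scores higher than one NOK sample (0.4).
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, eer = eer_threshold(scores, labels)
print(threshold, eer)  # 0.6 0.25
```

With a finite test set the two error rates rarely match exactly, so this sketch takes the threshold with the smallest gap between them and reports their mean as the EER.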
The framework implements five autoencoder cores and six convolutional encoder-decoder pairs of various complexity and depth. The autoencoder cores are:
- Basic core 1 (BAE1): a direct connection of the encoder-decoder pair.
- Basic core 2 (BAE2): the basic scheme with inserted fully-connected layers.
- Variational core 1 (VAE1): a basic variational autoencoder derived from a Keras example.
- Variational core 2 (VAE2): the VAE1 scheme with inserted fully-connected layers.
- Vector-quantised core (VQVAE1): a vector-quantised autoencoder derived from a Keras example.
Modifications of the BAE2 and VAE2 cores are illustrated in the figure below:
- Run pip install -r requirements.txt in a fresh conda Python 3.9 environment. Then reinstall TensorFlow 2 (TF2) following the instructions at: https://www.tensorflow.org/install/pip
- Download the V2 dataset at: Industry Biscuit (Cookie) dataset and run the attached script DatasetFolder.py to get a folder-structured dataset. For the experiments, we used a training dataset of 1000 OK samples, a validation dataset of 500 OK samples, and a test dataset of 200 OK and 200 NOK samples.
- Set the evaluation flag in the script FrameworkTrain.py and run it to train the selected autoencoders. Models have to be trained before evaluation, and each model must have a corresponding ini file in the init directory. All of these files are processed by the main function, and the script automatically creates a subdirectory in data to store the model weights and evaluation results.
- Use the script FrameworkEvaluate.py to perform evaluation on unknown data. Set the saveImgToFile argument to True if you would like to have your data sorted into OK / NOK.
- Create custom models in the modules ModelSaved.py and ModelLayers.py, and create the corresponding ini files in the init directory.
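The split sizes mentioned in the dataset step (1000 OK for training, 500 OK for validation, 200 OK plus 200 NOK for testing) can be sketched as follows. This is not a reproduction of DatasetFolder.py, whose behaviour is not described here; the function `split_dataset`, the dictionary keys, and the placeholder file names are all hypothetical.

```python
import random


def split_dataset(ok_files, nok_files, n_train=1000, n_valid=500, n_test=200, seed=0):
    """Split file lists into OK-only train/validation sets and a mixed test set."""
    if len(ok_files) < n_train + n_valid + n_test:
        raise ValueError("not enough OK samples for the requested split")
    if len(nok_files) < n_test:
        raise ValueError("not enough NOK samples for the requested split")

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    ok = rng.sample(ok_files, n_train + n_valid + n_test)
    nok = rng.sample(nok_files, n_test)

    return {
        "train": ok[:n_train],                   # OK samples only
        "valid": ok[n_train:n_train + n_valid],  # OK samples only
        "test_ok": ok[n_train + n_valid:],
        "test_nok": nok,
    }


# Placeholder file names standing in for the downloaded images.
ok_files = [f"ok_{i:04d}.jpg" for i in range(2000)]
nok_files = [f"nok_{i:04d}.jpg" for i in range(300)]
splits = split_dataset(ok_files, nok_files)
print({name: len(files) for name, files in splits.items()})
# {'train': 1000, 'valid': 500, 'test_ok': 200, 'test_nok': 200}
```

Sampling without replacement guarantees that no OK image appears in more than one subset, which matters because the autoencoder must not see test images during training.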
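The training step relies on one ini file per model in the init directory and on an automatically created subdirectory in data. A minimal sketch of that pattern, using only the Python standard library, might look like the code below. The function `prepare_model_dirs`, the section name `Training`, the key `epochs`, and the rule that the output directory is named after the ini file are assumptions for illustration; the real file format and naming are defined by the repository scripts.

```python
import configparser
import tempfile
from pathlib import Path


def prepare_model_dirs(init_dir, data_dir):
    """Read every ini file in init_dir and create a matching subdirectory in data_dir."""
    prepared = {}
    for ini_path in sorted(Path(init_dir).glob("*.ini")):
        config = configparser.ConfigParser()
        config.read(ini_path)
        # Hypothetical convention: the output directory is named after the ini file.
        model_dir = Path(data_dir) / ini_path.stem
        model_dir.mkdir(parents=True, exist_ok=True)
        prepared[ini_path.stem] = {s: dict(config[s]) for s in config.sections()}
    return prepared


# Demonstration in a temporary directory with two made-up configuration files.
with tempfile.TemporaryDirectory() as root:
    init_dir = Path(root) / "init"
    init_dir.mkdir()
    (init_dir / "BAE1.ini").write_text("[Training]\nepochs = 50\n")
    (init_dir / "VAE1.ini").write_text("[Training]\nepochs = 80\n")

    models = prepare_model_dirs(init_dir, Path(root) / "data")
    created = sorted(p.name for p in (Path(root) / "data").iterdir())

print(created)  # ['BAE1', 'VAE1']
```

Note that `configparser` returns all values as strings, so numeric settings such as the epoch count would need an explicit conversion (for example with `config.getint`) before use.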
This repository is still under development and is based on the research presented in the following repositories:
Please cite the article Toward phytoplankton parasite detection using autoencoders in your further work:
@Article{Bilik2023,
author={Bilik, Simon
and Batrakhanov, Daniel
and Eerola, Tuomas
and Haraguchi, Lumi
and Kraft, Kaisa
and Van den Wyngaert, Silke
and Kangas, Jonna
and Sj{\"o}qvist, Conny
and Madsen, Karin
and Lensu, Lasse
and K{\"a}lvi{\"a}inen, Heikki
and Horak, Karel},
title={Toward phytoplankton parasite detection using autoencoders},
journal={Machine Vision and Applications},
year={2023},
month={Sep},
day={13},
volume={34},
number={6},
pages={101},
issn={1432-1769},
doi={10.1007/s00138-023-01450-x},
url={https://doi.org/10.1007/s00138-023-01450-x}
}