Implementation of "Calibrated One-class classification-based Unsupervised Time series Anomaly
detection" (COUTA for short).
The full paper is available at link.
Please consider citing our paper if you use this repository. 😉
@article{xu2022deep,
title={Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection},
author={Xu, Hongzuo and Wang, Yijie and Jian, Songlei and Liao, Qing and Wang, Yongjun and Pang, Guansong},
journal={arXiv preprint arXiv:2207.12201},
year={2022}
}
The main packages are:
torch==1.10.1+cu113
numpy==1.20.3
pandas==1.3.3
scipy==1.4.1
scikit-learn==1.1.1
We provide a requirements.txt in our repository.
COUTA provides easy-to-use APIs in the sklearn/pyod style. First, instantiate the model class with its parameters:
from src.algorithms.couta_algo import COUTA
model_configs = {'sequence_length': 50, 'stride': 1}
model = COUTA(**model_configs)
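To give a feel for what sequence_length and stride control, here is a minimal sketch of sliding-window segmentation. It assumes the two parameters mean the window size and the sliding step, which is the usual reading for window-based detectors; make_windows is a hypothetical helper written for illustration, not a function from this repository.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(values, sequence_length=50, stride=1):
    """Cut a (n_timesteps, n_features) array into overlapping windows.

    Hypothetical helper: assumes sequence_length is the window size
    and stride is the step between consecutive window starts.
    """
    values = np.asarray(values)
    # sliding_window_view puts the window axis last:
    # (n_windows, n_features, sequence_length)
    windows = sliding_window_view(values, sequence_length, axis=0)[::stride]
    # Reorder to (n_windows, sequence_length, n_features)
    return np.transpose(windows, (0, 2, 1))


# Synthetic series with 200 time steps and 3 features
series = np.arange(200 * 3, dtype=float).reshape(200, 3)
windows = make_windows(series, sequence_length=50, stride=1)
print(windows.shape)
```

Under this reading, a larger stride yields fewer, less overlapping windows, which trades some training data for speed.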
The instantiated model can then be used to fit and predict data. Please pass pandas DataFrames as input:
model.fit(train_df)
score_dic = model.predict(test_df)
score = score_dic['score_t']
We return a dictionary as the prediction output for consistency with an evaluation work on time series anomaly detection link.
score_t is a vector of anomaly scores, one per time observation in the testing dataframe. A higher value indicates a higher likelihood of being an anomaly.
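Since score_t holds continuous scores rather than labels, a threshold is needed to obtain binary decisions. The sketch below flags the highest-scoring observations using a quantile threshold. The quantile value and the scores_to_labels helper are assumptions made for illustration; the scores are synthetic and do not come from COUTA.

```python
import numpy as np


def scores_to_labels(score_t, quantile=0.99):
    """Turn anomaly scores into 0/1 labels with a quantile threshold.

    Hypothetical helper: the 0.99 default is an arbitrary choice,
    not a value recommended by the paper or the repository.
    """
    score_t = np.asarray(score_t, dtype=float)
    threshold = np.quantile(score_t, quantile)
    # Observations scoring strictly above the threshold are flagged
    return (score_t > threshold).astype(int), threshold


# Synthetic scores standing in for score_dic['score_t']
rng = np.random.default_rng(0)
score_t = rng.random(1000)
score_t[500:505] += 5.0  # five injected high-score observations

labels, threshold = scores_to_labels(score_t, quantile=0.99)
print(int(labels.sum()), "observations flagged")
```

In practice the threshold is often chosen from validation data or an expected anomaly rate, so the quantile here should be treated as a tunable setting.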
If the save_model_path parameter is given during training, the model will be saved to this path:
from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'sequence_length': 50, 'stride': 1, 'save_model_path': path}
model = COUTA(**model_configs)
model.fit(train_df)
Then, COUTA can be used without fitting by loading the saved model:
from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'load_model_path': path}
model = COUTA(**model_configs)
model.predict(test_df)
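The save and load steps can be combined so that a script trains only when no saved model exists yet. The following is a minimal sketch; couta_configs is a hypothetical helper, and it only builds the configuration dictionaries shown above without importing or calling COUTA itself.

```python
import os


def couta_configs(path, sequence_length=50, stride=1):
    """Pick load or train-and-save settings depending on whether `path` exists.

    Hypothetical helper: it only reuses the parameter names shown above.
    """
    if os.path.isfile(path):
        # A saved model is present, so request loading it
        return {'load_model_path': path}
    # No saved model yet, so train and save to this path
    return {'sequence_length': sequence_length,
            'stride': stride,
            'save_model_path': path}


configs = couta_configs('saved_models/couta.pth')
needs_fit = 'load_model_path' not in configs
print(configs, needs_fit)
# Assumed usage with the API above:
#   model = COUTA(**configs)
#   if needs_fit:
#       model.fit(train_df)
```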
Due to licensing restrictions on these datasets, we provide download links here. We also offer a preprocessing script in data_preprocessing.ipynb. After downloading the original data, run this notebook to generate processed datasets that can be fed directly into our pipeline.
The used datasets can be downloaded from:
- ASD https://github.com/zhhlee/InterFusion
- SMD https://github.com/NetManAIOps/OmniAnomaly
- SWaT https://itrust.sutd.edu.sg/itrust-labs_datasets
- WaQ https://www.spotseven.de/gecco/gecco-challenge
- DSADS https://github.com/zhangyuxin621/AMSL
- Epilepsy https://github.com/boschresearch/NeuTraL-AD/
After preparing the datasets, you can use main.py to run COUTA on different time series datasets. We use six datasets in our paper, and --data can be chosen from [ASD, SMD, SWaT, WaQ, Epilepsy, DSADS].
For example, run COUTA on the ASD dataset with:
python main.py --data ASD --algo COUTA
Alternatively, you can directly use script_effectiveness.sh.
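To run all six datasets from Python instead of a shell script, the commands can be generated in a loop. This is a minimal sketch that only builds and prints the command lines, using the --data and --algo flags shown above. build_commands is a hypothetical helper and is not a description of what the provided shell script does.

```python
import shlex

# Dataset names accepted by --data, as listed above
DATASETS = ['ASD', 'SMD', 'SWaT', 'WaQ', 'Epilepsy', 'DSADS']


def build_commands(algo='COUTA', datasets=DATASETS):
    """Return one `python main.py` argument list per dataset."""
    return [['python', 'main.py', '--data', name, '--algo', algo]
            for name in datasets]


commands = build_commands()
for cmd in commands:
    # Print only; to execute, pass cmd to subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```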
We include the synthetic datasets used in data_processed/.
python main_showcase.py --type point
python main_showcase.py --type pattern
Two anomaly score npy files are generated. You can use experiment_generalization_ability.ipynb to visualize the data and our results.
Use src/experiments/data_contaminated_generator_dsads.py and src/experiments/data_contaminated_generator_ep.py to generate datasets with various contamination ratios.
Then use main.py to run COUTA on these datasets, or directly execute script_robustness.sh.
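For readers unfamiliar with contamination experiments, the sketch below shows one common way to mix anomalous samples into a normal training set at a target ratio. It is an illustration under stated assumptions, not the logic of the generator scripts: contaminate is a hypothetical helper, the data is synthetic, and the ratio is taken to mean the fraction of anomalies in the final training set.

```python
import numpy as np


def contaminate(normal, anomalies, ratio, seed=0):
    """Mix anomalous samples into a normal set at a target ratio.

    Hypothetical helper: `ratio` is the fraction of anomalies
    in the returned set, and must lie in [0, 1).
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    rng = np.random.default_rng(seed)
    # Solve n_anom / (n_normal + n_anom) = ratio for n_anom
    n_anom = int(round(ratio * len(normal) / (1.0 - ratio)))
    if n_anom > len(anomalies):
        raise ValueError("not enough anomalous samples for this ratio")
    picked = rng.choice(len(anomalies), size=n_anom, replace=False)
    data = np.concatenate([normal, anomalies[picked]], axis=0)
    labels = np.concatenate([np.zeros(len(normal), dtype=int),
                             np.ones(n_anom, dtype=int)])
    # Shuffle so anomalies are not grouped at the end
    order = rng.permutation(len(data))
    return data[order], labels[order]


# Synthetic segments: (n_samples, n_timesteps, n_features)
normal = np.zeros((90, 20, 3))
anomalies = np.ones((30, 20, 3))
data, labels = contaminate(normal, anomalies, ratio=0.1)
print(data.shape, int(labels.sum()))
```

The labels are returned only to verify the achieved ratio; an unsupervised detector would be trained on the data alone.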
Change the --algo argument to COUTA_wto_umc, COUTA_wto_nac, or Canonical, e.g.,
python main.py --algo COUTA_wto_umc --data ASD
Running script_effectiveness.sh also produces detection results for the ablated variants.
As for the sensitivity test (4.6), please adjust the parameters in the yaml file.
As for the scalability test (4.7), the produced result files also contain execution time.
All of the anomaly detectors in our paper are implemented in Python. We list their publicly available implementations below.
- OCSVM and ECOD: we directly use pyod (a Python library of anomaly detection approaches)
- GOAD: https://github.com/lironber/GOAD
- DSVDD: https://github.com/lukasruff/Deep-SVDD-PyTorch
- USAD: https://github.com/hoo2257/USAD-Anomaly-Detecting-Algorithm
- GDN: https://github.com/d-ailin/GDN
- NeuTraL: https://github.com/boschresearch/NeuTraL-AD
- TranAD: https://github.com/imperial-qore/TranAD
- LSTM-ED, Tcn-ED, MSCRED and Omni: https://github.com/astha-chem/mvts-ano-eval/