ETSC: Early Time Series Classification

ETSC is a Python Early Time-Series Classification library for public use, used in "A Framework to Evaluate Early Time-Series Classification Algorithms", Authors: Charilaos Akasiadis, Evgenios Kladis, Petro-Foti Kamberi, Evangelos Michelioudakis, Elias Alevizos, Alexander Artikis.

Cite as:

Akasiadis, C., Kladis, E., Kamberi, P. F., Michelioudakis, E., Alevizos, E., & Artikis, A. (2024). A Framework to Evaluate Early Time-Series Classification Algorithms. EDBT 2024: 27th International Conference on Extending Database Technology, Proceedings (pp. 623–635). ISBN 978-3-89318-095-0 on OpenProceedings.org

The aim of this work is to study and collect algorithms that conduct early time-series classification, in a user-friendly format, for researchers to use as a benchmark.

Currently, six algorithms are included in this repository. A Python CLI simplifies the execution of each algorithm. The predictions are evaluated with the following metrics: earliness, accuracy, F1-score, the harmonic mean between accuracy and earliness, and computation time for both training and testing.
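To make the metrics above concrete, here is a minimal Python sketch. It rests on assumptions that this README does not spell out: earliness is taken as the mean fraction of each time-series observed before a prediction is made, and the harmonic mean combines accuracy with (1 - earliness), as is common in the ETSC literature. The function names are illustrative and are not part of the ets code base.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def earliness(prefix_lengths: Sequence[int], series_lengths: Sequence[int]) -> float:
    """Mean fraction of each series observed before predicting (assumed definition)."""
    ratios = [p / n for p, n in zip(prefix_lengths, series_lengths)]
    return sum(ratios) / len(ratios)


def harmonic_mean(acc: float, earl: float) -> float:
    """Harmonic mean of accuracy and (1 - earliness); an assumed formulation."""
    timeliness = 1.0 - earl
    if acc + timeliness == 0:
        return 0.0
    return 2 * acc * timeliness / (acc + timeliness)


# Toy example: four series of length 40, predicted after 10, 20, 40 and 10 points.
y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
prefix_lengths = [10, 20, 40, 10]
series_lengths = [40, 40, 40, 40]

acc = accuracy(y_true, y_pred)
earl = earliness(prefix_lengths, series_lengths)
hm = harmonic_mean(acc, earl)
print(f"accuracy={acc:.2f} earliness={earl:.2f} harmonic_mean={hm:.2f}")
```

With these toy values the accuracy is 0.75, the earliness is 0.50 and the harmonic mean is 0.60. Lower earliness is better, which is why it enters the harmonic mean as its complement.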

Acknowledgments

We would like to thank the creators of the UCR/UEA repository for making the datasets openly available. Special thanks to Evangelos Michelioudakis ([email protected]) for the contribution to the development of this repository.

License

This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; see the GNU General Public License v3 for more details.

Requirements

Python 3 is required, along with the libraries listed in requirements.txt.

JVM >= 1.8 is required to run the algorithms that are implemented in Java.

Run in anaconda environment

  1. Create the environment:
     conda create -n py37 python=3.7
  2. Activate it:
     conda activate py37
  3. Install required packages:
     pip3 install -r requirements.txt
  4. Locally install timeline:
     pip install --editable .

Run in virtualenv

  1. Install the virtualenv package:
     pip3 install virtualenv
  2. Create a new virtual environment:
     virtualenv venv
  3. Activate the virtual environment:
     . venv/bin/activate
  4. Install required packages:
     pip3 install -r requirements.txt
  5. Locally install timeline:
     pip install --editable .

Downloading the data

To download the data, run the script download_data.sh found in the script folder. The downloaded data is placed in the data folder. Ten datasets are available, derived from the UCR_UEA library. Multivariate datasets from the Biological and Maritime fields are also provided.

Experimental Setup

Note that only ECTS was implemented by us, using the paper of the algorithm as a guide. The rest of the algorithms derive from the sources listed in the following table. All credit goes to the original authors of the algorithms and their papers.

| Algorithm | Parameters |
| --- | --- |
| ECTS [paper] | support = 0 |
| EDSC [paper] [code] | CHE k=3, min_length=5, max_length=len(time_series)/2 |
| TEASER [paper] [code] | S=20 (for the UCR_UEA), S=10 (for the biological and maritime) |
| ECEC [paper] [code] | training_times=20, length=len(time_series)/20, a=0.8 |
| MLSTM [paper] [code] | LSTM cells = [8, 64, 128], tested_lengths = [0.4, 0.5, 0.6] % |
| ECONOMY-K [paper] [code] | k = [1, 2, 3], λ = 100, cost = 0.001 |

Menu Guide

After completing the environment setup described above, run ets to display a menu with all program options. A command is constructed as follows:

ets <program commands> <algorithm> <algorithm commands>

If you want to see the algorithm's menu run:

ets <program commands> <algorithm> --help
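The command layout above can also be assembled programmatically, which helps when scripting many runs. The following is a minimal Python sketch; the helper build_ets_command and the file names train.csv and test.csv are hypothetical placeholders, and only flags that appear in this README are used.

```python
import shlex
from typing import List, Sequence


def build_ets_command(
    program_args: Sequence[str],
    algorithm: str,
    algorithm_args: Sequence[str] = (),
) -> List[str]:
    """Assemble: ets <program commands> <algorithm> <algorithm commands>."""
    return ["ets", *program_args, algorithm, *algorithm_args]


# Hypothetical file names; the flags mirror the ECTS test run in this README.
cmd = build_ets_command(
    ["-t", "train.csv", "-e", "test.csv", "--make-cv",
     "-h", "Class", "-c", "-1", "-g", "vote"],
    "ects",
    ["-u", "0.0"],
)

# Quote each argument so the printed line can be pasted into a shell.
command_line = " ".join(shlex.quote(arg) for arg in cmd)
print(command_line)
```

The resulting list can be passed to subprocess.run(cmd, check=True) once ets is installed. Keeping the arguments as a list avoids shell quoting issues with file paths that contain spaces.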

Quick commands rundown used for the experiments

-i <file path> : Only one file is given for cross validation with a given number of folds.

-t <file-path> : The training file used. A -e command is also required.

-e <file-path> : The testing file used. A -t command is also required.

-o <file-path> : The desired output stream file. The default output stream is the console.

-s <char>: The separator of each column in the file/s.

-d & -h: Commands that indicate the column of the classes in the input file/s. Use -d with the <int> index of the column, or -h with its <name>.

-v <int>: For multivariate input, the number of variables; it should always be followed by -g. In multivariate input files, each time-series should occupy -v consecutive lines, one per univariate variable, all bearing the same label.

-g <method>: The method used to handle multivariate time-series. We used vote, which conducts the voting as explained in the paper, and normal, which passes the whole multivariate input to the algorithm (currently possible only with MLSTM). MLSTM also requires -g normal for univariate time-series.

--java & --cplus: Required for non-Python implementations: --java for TEASER and ECEC, --cplus for EDSC.

-c <number>: The class for which the F1-score will be calculated. If -1 is passed then the F1-score of all classes is calculated (not supported for multivariate time-series yet).

--make-cv: Takes the training and testing file, merges them and conducts cross validation.

--folds : Used when there are premade folds available.

--trunc : Use STRUT approach to find the best time-point to perform ETSC.

--pyts-csv: Use pyts format for STRUT Weasel's input, when the dataset comes in csv format.
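The multivariate layout expected by -v and the idea behind -g vote can be illustrated with a short Python sketch. It assumes already-split rows with the class in column 0 (as with -d 0) and a plain majority vote over per-variable predictions. The voting scheme actually used by ets is the one described in the paper and may differ; all names below are illustrative.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def group_multivariate(
    rows: Sequence[Sequence[str]], n_vars: int, class_col: int = 0
) -> List[Tuple[str, List[List[float]]]]:
    """Group every n_vars consecutive rows into one multivariate instance."""
    if len(rows) % n_vars != 0:
        raise ValueError("number of rows must be a multiple of n_vars")
    instances = []
    for start in range(0, len(rows), n_vars):
        block = rows[start:start + n_vars]
        labels = {row[class_col] for row in block}
        if len(labels) != 1:
            raise ValueError("all variables of an instance must share one label")
        variables = [
            [float(x) for i, x in enumerate(row) if i != class_col]
            for row in block
        ]
        instances.append((labels.pop(), variables))
    return instances


def majority_vote(predictions: Sequence[str]) -> str:
    """Most frequent label among per-variable predictions (assumed scheme)."""
    return Counter(predictions).most_common(1)[0][0]


# Two instances with three variables each, class label in column 0.
rows = [
    ["1", "0.1", "0.2"], ["1", "0.3", "0.4"], ["1", "0.5", "0.6"],
    ["0", "1.0", "1.1"], ["0", "1.2", "1.3"], ["0", "1.4", "1.5"],
]
instances = group_multivariate(rows, n_vars=3)
print(len(instances), "instances;", len(instances[0][1]), "variables each")
print("vote:", majority_vote(["1", "0", "1"]))
```

Checking that all rows of an instance carry the same label catches misaligned files early, before any classifier is trained.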

Test Run for UCR_UEA

ects : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote ects -u 0.0

edsc : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --cplus -g vote edsccplus

ecec : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote ecec

teaser : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote teaser -s 20

mlstm : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g normal mlstm

eco-k : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote economy-k

strut - minirocket : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m minirocket -p 0 -s 2

strut - weasel : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m weasel -p 0 -s 2

strut - minirocket-fav : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m minirocket_fav -p 0 -s 2

strut - weasel-fav : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m weasel_fav -p 0 -s 2
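All of the commands above pass --make-cv, which merges the training and testing files and then conducts cross validation. A minimal Python sketch of that idea follows. The shuffling, the seed and the non-stratified fold assignment are assumptions made for illustration and may not match what ets does internally; merge_and_split is a hypothetical name.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def merge_and_split(
    train: Sequence[T], test: Sequence[T], n_folds: int, seed: int = 0
) -> List[Tuple[List[T], List[T]]]:
    """Merge two sets, then return (training, held_out) pairs, one per fold."""
    merged = list(train) + list(test)
    if not 2 <= n_folds <= len(merged):
        raise ValueError("n_folds must be between 2 and the number of instances")
    random.Random(seed).shuffle(merged)  # assumed: shuffle before splitting
    # Round-robin assignment keeps fold sizes within one instance of each other.
    folds = [merged[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, held_out in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((training, held_out))
    return splits


# Toy example: 6 training and 4 testing instances, 5 folds.
splits = merge_and_split(list(range(6)), list(range(6, 10)), n_folds=5)
for training, held_out in splits:
    print(len(training), "train /", len(held_out), "held out")
```

Each instance is held out exactly once, so every fold is evaluated on data the model did not see during training.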

Test Run for Maritime and Biological

ects : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 ects -u 0.0

edsc : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --cplus edsccplus

ecec : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --java ecec

teaser : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --java teaser -s 10

mlstm : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 -g normal mlstm

eco-k : ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 economy-k

strut - minirocket : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --trunc strut -m minirocket -p 0 -s 2

strut - weasel : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --pyts-csv --trunc strut -m weasel -p 0 -s 2

strut - minirocket-fav : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --trunc strut -m minirocket_fav -p 0 -s 2

strut - weasel-fav : ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --pyts-csv --trunc strut -m weasel_fav -p 0 -s 2

Disclaimer

Responsibility for any erroneous results or misuse of the included algorithms lies with the authors of the original papers. Please inform us if you detect any misconduct or misuse of the code/datasets used in this repository.