ETSC is a Python library for Early Time-Series Classification, released for public use. It is used in "A Framework to Evaluate Early Time-Series Classification Algorithms", authored by Charilaos Akasiadis, Evgenios Kladis, Petro-Foti Kamberi, Evangelos Michelioudakis, Elias Alevizos, and Alexander Artikis.
Cite as:
Akasiadis, C., Kladis, E., Kamberi, P. F., Michelioudakis, E., Alevizos, E., & Artikis, A. (2024). A Framework to Evaluate Early Time-Series Classification Algorithms. EDBT 2024: 27th International Conference on Extending Database Technology, Proceedings (pp. 623–635). ISBN 978-3-89318-095-0 on OpenProceedings.org
The aim of this work is to study and collect algorithms that conduct early time-series classification, in a user-friendly format, for researchers to use as a benchmark.
Currently, six algorithms are included in this directory. A Python CLI simplifies the execution of each algorithm. The predictions are evaluated through metrics such as earliness, accuracy, F1-score, the harmonic mean between accuracy and earliness, and computation time for both training and testing.
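As a rough illustration of the evaluation metrics listed above, the sketch below computes accuracy, earliness, and their harmonic mean for a handful of toy predictions. It assumes that earliness is the mean fraction of each time-series observed before a prediction is made (lower is better) and that the harmonic mean combines accuracy with `1 - earliness`; both conventions are assumptions here, and the function names are hypothetical rather than part of this library.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def earliness(observed: Sequence[int], full_lengths: Sequence[int]) -> float:
    """Mean fraction of each series seen before predicting (assumed definition)."""
    ratios = [o / f for o, f in zip(observed, full_lengths)]
    return sum(ratios) / len(ratios)


def harmonic_mean(acc: float, earl: float) -> float:
    """Harmonic mean of accuracy and (1 - earliness) (assumed convention)."""
    timeliness = 1.0 - earl
    if acc + timeliness == 0:
        return 0.0
    return 2 * acc * timeliness / (acc + timeliness)


# Toy data: four series of length 100, each classified after a prefix.
y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
observed = [10, 20, 40, 30]      # time-points consumed before each prediction
lengths = [100, 100, 100, 100]

acc = accuracy(y_true, y_pred)
earl = earliness(observed, lengths)
hm = harmonic_mean(acc, earl)
print(f"accuracy={acc:.2f} earliness={earl:.2f} harmonic_mean={hm:.2f}")
```

Using `1 - earliness` means that a method scoring well must be both accurate and early; a classifier that waits for the full series gets a harmonic mean of zero under this convention.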
We would like to thank the creators of the UCR/UEA repository for making the datasets openly available. Special thanks to Evangelos Michelioudakis ([email protected]) for the contribution to the development of this repository.
This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; See the GNU General Public License v3 for more details.
Python 3 is required to install the libraries listed in requirements.txt.
JVM >= 1.8 is required to run the algorithms that are implemented in Java.
- Create the environment
conda create -n py37 python=3.7
- Activate it
conda activate py37
- Install required packages:
pip3 install -r requirements.txt
- Locally install timeline:
pip install --editable .
- Install the virtualenv package:
pip3 install virtualenv
- Create a new virtual environment:
virtualenv venv
- Activate virtual environment:
. venv/bin/activate
- Install required packages:
pip3 install -r requirements.txt
- Locally install timeline:
pip install --editable .
To download the data, run the script download_data.sh found in the script folder. The downloaded data is placed inside the data folder.
Ten datasets are available, derived from the UCR_UEA library. Multivariate datasets from the biological and maritime fields are also provided.
Note that only ECTS was implemented by us, using the paper of the algorithm as a guide. The rest of the algorithms derive from the sources we provide in the following table. All credit goes to the original creators of the algorithms and their papers.
Algorithm | Parameters |
---|---|
ECTS [paper] | support = 0 |
EDSC [paper] [code] | CHE k=3, min_length=5, max_length=len(time_series)/2 |
TEASER [paper] [code] | S=20 (for the UCR_UEA), S=10 (for the biological and maritime) |
ECEC [paper] [code] | training_times=20, length=len(time_series)/20, a=0.8 |
MLSTM [paper] [code] | LSTM cells = [8, 64, 128], tested_lengths = [0.4,0.5,0.6] % |
ECONOMY-K [paper] [code] | k = [1, 2, 3], λ = 100, cost = 0.001 |
After running the virtual environment commands stated above, running ets displays a menu with all program options.
A running command is constructed as follows:
ets <program commands> <algorithm> <algorithm commands>
If you want to see the algorithm's menu run:
ets <program commands> <algorithm> --help
-i <file path>
: A single input file, used for cross validation with a given number of folds.
-t <file-path>
: The training file. The -e option is also required.
-e <file-path>
: The testing file. The -t option is also required.
-o <file-path>
: The desired output file. The default output stream is the console.
-s <char>
: The separator of the columns in the file(s).
-d & -h
: Options that indicate the column of the classes in the input file(s): either the <int> index of the column for -d or the <name> of the column for -h.
-v <int>
: For multivariate input, the number of variables; it should always be followed by -g. In multivariate input files, each time-series should take up -v consecutive lines, one for each univariate variable, all bearing the same label.
-g <method>
: The method used to deal with multivariate time-series. We used vote, which conducts the voting as explained in the paper, and normal, which passes the whole multivariate input to the algorithm (currently possible only with MLSTM). MLSTM also requires -g normal for univariate time-series.
--java & --cplus
: Options required for non-Python implementations: --java for TEASER and ECEC, --cplus for EDSC.
-c <number>
: The class for which the F1-score will be calculated. If -1 is passed then the F1-score of all classes is calculated (not supported for multivariate time-series yet).
--make-cv
: Takes the training and testing file, merges them and conducts cross validation.
--folds
: Used when there are premade folds available.
--trunc
: Use the STRUT approach to find the best time-point to perform ETSC.
--pyts-csv
: Use the pyts format for STRUT WEASEL's input when the dataset comes in CSV format.
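The multivariate layout described for -v and the vote method of -g can be illustrated with a small sketch. It assumes the rows of the input file have already been parsed into (label, values) pairs, and it uses a plain majority vote as a stand-in: the voting scheme actually used is the one explained in the paper and may differ, for example in how ties or prediction times are handled. The names `group_multivariate` and `majority_vote` are hypothetical and not part of this library.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[str, List[float]]                 # (label, one univariate variable)
Sample = Tuple[str, List[List[float]]]        # (label, all variables of a series)


def group_multivariate(rows: Sequence[Row], n_vars: int) -> List[Sample]:
    """Group every n_vars consecutive rows into one multivariate time-series."""
    if len(rows) % n_vars != 0:
        raise ValueError("number of rows must be a multiple of n_vars")
    samples: List[Sample] = []
    for start in range(0, len(rows), n_vars):
        block = rows[start:start + n_vars]
        labels = {label for label, _ in block}
        if len(labels) != 1:
            # All variables of one time-series must bear the same label.
            raise ValueError(f"rows {start}..{start + n_vars - 1} disagree on the label")
        samples.append((block[0][0], [values for _, values in block]))
    return samples


def majority_vote(per_variable_predictions: Sequence[str]) -> str:
    """Pick the most frequent prediction; ties go to the first label seen."""
    return Counter(per_variable_predictions).most_common(1)[0][0]


# Toy input equivalent to "-v 2": two variables per time-series.
rows = [
    ("A", [1.0, 2.0, 3.0]),
    ("A", [0.5, 0.7, 0.9]),
    ("B", [3.0, 1.0, 0.0]),
    ("B", [0.1, 0.2, 0.3]),
]
samples = group_multivariate(rows, n_vars=2)
print(len(samples), "multivariate samples")
print(majority_vote(["A", "B", "A"]))
```

With a voting approach of this kind, each univariate variable is classified separately and the per-variable predictions are then combined, which is why any univariate algorithm can be applied to multivariate data.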
ects
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote ects -u 0.0
edsc
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --cplus -g vote edsccplus
ecec
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote ecec
teaser
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote teaser -s 20
mlstm
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g normal mlstm
eco-k
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote economy-k
strut - minirocket
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m minirocket -p 0 -s 2
strut - weasel
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m weasel -p 0 -s 2
strut - minirocket-fav
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m minirocket_fav -p 0 -s 2
strut - weasel-fav
: ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --trunc strut -m weasel_fav -p 0 -s 2
ects
: ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 ects -u 0.0
edsc
: ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --cplus edsccplus
ecec
: ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --java ecec
teaser
: ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 --java teaser -s 10
mlstm
: ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 -g normal mlstm
eco-k
: ets -i "file location" -g vote -v (3 for Biological or 7 Maritime) -d 0 -c -1 economy-k
strut - minirocket
: ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --trunc strut -m minirocket -p 0 -s 2
strut - weasel
: ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --pyts-csv --trunc strut -m weasel -p 0 -s 2
strut - minirocket-fav
: ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --trunc strut -m minirocket_fav -p 0 -s 2
strut - weasel-fav
: ets -i "file location" -v (3 for Biological or 7 Maritime) -d 0 -c -1 --pyts-csv --trunc strut -m weasel_fav -p 0 -s 2
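When many runs have to be scripted, the command pattern `ets <program commands> <algorithm> <algorithm commands>` shown in the examples above can be assembled programmatically. The helper below is only a sketch: `build_ets_command` is a hypothetical name, the file names are placeholders, and the order in which program options are emitted simply mirrors the examples and is assumed not to matter otherwise. It only builds the argument list and does not run anything.

```python
import shlex
from typing import List, Optional, Sequence


def build_ets_command(
    algorithm: str,
    algorithm_args: Sequence[str] = (),
    *,
    train: Optional[str] = None,         # -t
    test: Optional[str] = None,          # -e
    input_file: Optional[str] = None,    # -i
    class_header: Optional[str] = None,  # -h
    class_index: Optional[int] = None,   # -d
    variables: Optional[int] = None,     # -v
    strategy: Optional[str] = None,      # -g
    f1_class: int = -1,                  # -c
    extra_flags: Sequence[str] = (),     # e.g. --make-cv, --java, --trunc
) -> List[str]:
    """Build an argument list for the ets CLI (illustrative sketch only)."""
    has_pair = train is not None and test is not None
    if input_file is None and not has_pair:
        raise ValueError("give either input_file (-i) or both train (-t) and test (-e)")
    if input_file is not None and (train is not None or test is not None):
        raise ValueError("input_file (-i) cannot be combined with train/test")
    if class_header is not None and class_index is not None:
        raise ValueError("give at most one of class_header (-h) and class_index (-d)")

    cmd: List[str] = ["ets"]
    if input_file is not None:
        cmd += ["-i", input_file]
    else:
        cmd += ["-t", str(train), "-e", str(test)]
    cmd += list(extra_flags)
    if class_header is not None:
        cmd += ["-h", class_header]
    if class_index is not None:
        cmd += ["-d", str(class_index)]
    cmd += ["-c", str(f1_class)]
    if variables is not None:
        cmd += ["-v", str(variables)]
    if strategy is not None:
        cmd += ["-g", strategy]
    cmd.append(algorithm)
    cmd += list(algorithm_args)
    return cmd


# Placeholder file names; mirrors the univariate ects example.
cmd = build_ets_command(
    "ects", ["-u", "0.0"],
    train="train.csv", test="test.csv",
    class_header="Class", strategy="vote", extra_flags=["--make-cv"],
)
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the `ets` entry point is installed in the active environment.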
Any false product and misuse of the used algorithms is on the authors of the original papers. Please inform us if you detect any misconduct or misuse of the code or datasets used in this repository.