
An Early Time-Series Classification Suite For Benchmarking made for the use of CER group, Institute of Informatics, NCSR "Demokritos"


Eukla/ETS


Newest version

The work in this repo is further developed and maintained in the repo https://github.com/xarakas/ETSC.

ETSC: Early Time Series Classification

ETSC is a Python library for early time-series classification, released for public use and based on the work "Evaluation of Early Time-Series Classification Algorithms" by Charilaos Akasiadis, Evgenios Kladis, Evangelos Michelioudakis, Elias Alevizos, and Alexander Artikis.

The aim of this work is to study and collect algorithms that conduct early time-series classification, in a user-friendly format that researchers can use in their own work.

Currently six algorithms are included in this directory. A Python CLI simplifies the execution of each algorithm. The predictions are evaluated through metrics such as earliness, accuracy, F1-score (if requested), and computation time for both training and testing.
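As a rough illustration of the evaluation metrics named above, the following sketch computes accuracy, earliness, and a per-class F1-score in plain Python. It is not taken from the ETSC code base: the function names are hypothetical, and earliness is assumed here to be the mean fraction of each series observed before a prediction is made (lower is better).

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of series whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def earliness(prefix_lengths: Sequence[int], full_lengths: Sequence[int]) -> float:
    """Mean fraction of each series consumed before predicting (assumed definition)."""
    ratios = [used / total for used, total in zip(prefix_lengths, full_lengths)]
    return sum(ratios) / len(ratios)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive_class: int) -> float:
    """F1-score of a single class, treating every other class as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive_class and p == positive_class for t, p in pairs)
    fp = sum(t != positive_class and p == positive_class for t, p in pairs)
    fn = sum(t == positive_class and p != positive_class for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: four series of length 40, classified after 10, 20, 40 and 30 points.
y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
acc = accuracy(y_true, y_pred)
earl = earliness([10, 20, 40, 30], [40, 40, 40, 40])
f1_class_1 = f1_score(y_true, y_pred, positive_class=1)
print(acc, earl, f1_class_1)
```

In this toy example three of four predictions are correct, and on average 62.5% of each series is read before a prediction is made.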

Acknowledgments

Special thanks to Evangelos Michelioudakis ([email protected]) for the contribution to the development of this repository.

License

This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions; See the GNU General Public License v3 for more details.

Requirements

Python 3 is required to install the libraries listed in requirements.txt.

JVM >= 1.8 is required to run the algorithms that are implemented in Java.

Installation

  1. Install the virtualenv package:
pip3 install virtualenv
  2. Create a new virtual environment:
virtualenv venv
  3. Activate virtual environment:
. venv/bin/activate
  4. Install required packages:
pip3 install -r requirements.txt
  5. Locally install timeline:
pip install --editable .

Downloading the data

To download the data, run the script download_data.sh found in the script folder. The downloaded data is placed in the data folder. Ten datasets derived from the UCR_UEA library are available. Multivariate datasets from the biological and maritime fields are also provided.

Experimental Setup

Note that only ECTS was implemented by us, using the algorithm's paper as a guide. The remaining algorithms come from the sources listed in the following table. All credit goes to the original authors of the algorithms' papers.

Algorithm | References | Parameters
ECTS | [paper] | support = 0
EDSC | [paper] [code] | CHE k=3, min_length=5, max_length=len(time_series)/2
TEASER | [paper] [code] | S=20 (for the UCR_UEA), S=10 (for the biological and maritime)
ECEC | [paper] [code] | training_times=20, length=len(time_series)/20, a=0.8
MLSTM | [paper] [code] | LSTM cells = [8, 64, 128], tested_lengths = [0.4, 0.5, 0.6] %
ECONOMY-K | [paper] [code] | k = [1, 2, 3], λ = 100, cost = 0.001

Menu Guide

After activating the virtual environment as described above, running ets displays a menu with all program options. A command is constructed as follows:

ets <program commands> <algorithm> <algorithm commands>

If you want to see the algorithm's menu run:

ets <program commands> <algorithm> --help
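The command layout above can also be assembled programmatically, for instance when scripting several runs. The sketch below only builds the argument list in the documented order (program commands, algorithm, algorithm commands). The helper name build_ets_command and the file names train.csv and test.csv are hypothetical, and the flags shown are the ones used in this README.

```python
import shlex
from typing import List, Sequence


def build_ets_command(
    program_args: Sequence[str],
    algorithm: str,
    algorithm_args: Sequence[str] = (),
) -> List[str]:
    """Return an argv list of the form: ets <program commands> <algorithm> <algorithm commands>."""
    return ["ets", *program_args, algorithm, *algorithm_args]


# Hypothetical file names; the flags mirror the ECTS test run in this README.
program_args = ["-t", "train.csv", "-e", "test.csv", "--make-cv",
                "-h", "Class", "-c", "-1", "-g", "vote"]
run_cmd = build_ets_command(program_args, "ects", ["-u", "0.0"])
help_cmd = build_ets_command(program_args, "ects", ["--help"])

# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(run_cmd))
print(shlex.join(help_cmd))
```

The resulting list can be passed to subprocess.run(run_cmd, check=True) once ets is installed in the active virtual environment.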

Quick commands rundown used for the experiments

-i <file-path> : A single input file, used for cross validation with a given number of folds.

-t <file-path> : The training file used. A -e command is also required.

-e <file-path> : The testing file used. A -t command is also required.

-o <file-path> : The desired output file. The default output stream is the console.

-s <char>: The separator of the columns in the file(s).

-d & -h: Indicate the column that holds the classes in the input file(s): -d takes the <int> index of the column, -h takes its <name>.

-v <int>: For multivariate input, the number of variables; it should always be accompanied by -g. In a multivariate input file, each time-series must occupy -v consecutive lines, one per univariate variable, all bearing the same label.

-g <method>: The method used to handle multivariate time-series. We used vote, which conducts the voting as explained in the paper, and normal, which passes the whole multivariate input to the algorithm and is currently supported only by MLSTM. MLSTM also requires -g normal for univariate time-series.

--java & --cplus: Required for non-Python implementations: --java for TEASER and ECEC, --cplus for EDSC.

-c <number>: The class for which the F1-score will be calculated. If -1 is passed then the F1-score of all classes is calculated (not supported for multivariate time-series yet).

--make-cv: Takes the training and testing file, merges them and conducts cross validation.

--folds : Used when there are premade folds available.
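To make the multivariate file layout expected by -v more concrete, the sketch below groups the lines of such a file into one record per time-series. It assumes a comma separator and the class label in column 0 (as with -s and -d 0). The function name and the sample values are hypothetical, and the parsing inside ets itself may differ.

```python
from typing import Iterable, List, Tuple


def group_multivariate_rows(
    lines: Iterable[str],
    n_variables: int,
    separator: str = ",",
    class_column: int = 0,
) -> List[Tuple[str, List[List[float]]]]:
    """Group every n_variables consecutive rows into one (label, variables) record."""
    rows = [line.strip().split(separator) for line in lines if line.strip()]
    if len(rows) % n_variables != 0:
        raise ValueError("number of rows is not a multiple of the number of variables")

    series = []
    for start in range(0, len(rows), n_variables):
        block = rows[start:start + n_variables]
        labels = {row[class_column] for row in block}
        if len(labels) != 1:
            raise ValueError(f"rows {start}..{start + n_variables - 1} carry different labels")
        # Drop the class column and keep the remaining cells as float observations.
        variables = [
            [float(cell) for i, cell in enumerate(row) if i != class_column]
            for row in block
        ]
        series.append((labels.pop(), variables))
    return series


# Two time-series with three variables each (made-up values), label in column 0.
sample = [
    "1,0.1,0.2,0.3",
    "1,5.0,5.1,5.2",
    "1,9.0,9.1,9.2",
    "0,0.4,0.5,0.6",
    "0,6.0,6.1,6.2",
    "0,8.0,8.1,8.2",
]
grouped = group_multivariate_rows(sample, n_variables=3)
print(len(grouped), grouped[0][0], grouped[0][1][1])
```

The label check mirrors the requirement that all -v lines of one time-series bear the same label.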

Test Run for UCR_UEA

ects : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote ects -u 0.0

edsc : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --cplus -g vote edsccplus

ecec : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote ecec

teaser : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 --java -g vote teaser -s 20

mlstm : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g normal mlstm

eco-k : ets -t "training file name" -e "testing file name" --make-cv -h Class -c -1 -g vote economy-k
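All of the runs above pass --make-cv, which merges the training and testing files and cross-validates on the result. The idea can be sketched as follows. The number of folds, the shuffling, and the function name are assumptions made for illustration; this is not the splitting logic of ets itself.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def make_cv_folds(
    train: Sequence[T],
    test: Sequence[T],
    n_folds: int = 5,
    seed: int = 0,
) -> List[Tuple[List[T], List[T]]]:
    """Merge train and test, then return one (train_part, test_part) pair per fold."""
    merged = list(train) + list(test)
    indices = list(range(len(merged)))
    random.Random(seed).shuffle(indices)  # seeded shuffle for reproducible folds

    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[k::n_folds] for k in range(n_folds)]

    splits = []
    for held_out in folds:
        held = set(held_out)
        train_part = [merged[i] for i in indices if i not in held]
        test_part = [merged[i] for i in held_out]
        splits.append((train_part, test_part))
    return splits


# Toy data: integers stand in for labelled time-series.
splits = make_cv_folds(train=list(range(6)), test=list(range(6, 10)), n_folds=5)
for train_part, test_part in splits:
    print(len(train_part), len(test_part))
```

Each item is held out exactly once across the folds, so every series from both original files is eventually used for testing.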

Test Run for Maritime and Biological

ects : ets -i "file location" -g vote -v (3 for Biological or 5 Maritime) -d 0 -c -1 ects -u 0.0

edsc : ets -i "file location" -g vote -v (3 for Biological or 5 Maritime) -d 0 -c -1 --cplus edsccplus

ecec : ets -i "file location" -g vote -v (3 for Biological or 5 Maritime) -d 0 -c -1 --java ecec

teaser : ets -i "file location" -g vote -v (3 for Biological or 5 Maritime) -d 0 -c -1 --java teaser -s 10

mlstm : ets -i "file location" -v (3 for Biological or 5 Maritime) -d 0 -c -1 -g normal mlstm

eco-k : ets -i "file location" -g vote -v (3 for Biological or 5 Maritime) -d 0 -c -1 economy-k

Disclaimer

Any faulty results or misuse of the included algorithms are the responsibility of the authors of the paper. Please inform us if you detect any misconduct or misuse of the code or datasets used in this repository.
