This project implements an activity recognition system trained on the UCF101 dataset (30 of its classes are used).
- A full pipeline from Dense Trajectory raw features to Fisher Vectors
- Raw features are piped to the Fisher Vector generator, eliminating the need to store the large raw features
- State-of-the-art performance for large action/event video datasets
- Rudimentary baseline experiment implemented in baseline.py
Tunable parameters:
- Number of modes used to construct Gaussian mixture models (120)
- Option to reduce dimensions of the 5 IDTF descriptors using PCA (500)
- Choice of classification model
-
Improved Dense Trajectories, built against OpenCV 2.4.13.
OpenCV 2.4.13 should be compiled from source by hand.
How to install OpenCV2.4.13 -
Yael library (v438): to compute the Fisher Vectors and GMMs.
Yael Intro
How to install Yael -
Scikit-learn (0.20.0): for classification models.
Sklearn Intro -
ffmpeg (version 2.8.15) must be installed.
sudo apt-get install ffmpeg
pipeline.sh lists the commands required to execute the pipeline. Before running it, update the following file paths in the code:
- classify_experiment.py: training_output, testing_output
- classify_library.py: class_index_file, training_output, testing_output
- computeFVstream.py: sys.path.append() for yael lib in line 4
- computeIDTF.py: ucf101_path, dtBin, tmpDir
- gmm.py: GMM_dir, sys.path.append() for yael lib in line 9
- pipeline.sh: all paths should be changed.
- train_classifier.py: class_index_file, training_output, testing_output
Description of files:
-
computeIDTF.py:
- Resizes the videos to 320x240 using ffmpeg.py.
- Executes the IDTF binary, whose path is hard-coded in the file.
- The IDTF binary can be downloaded.
- Saves temporary resized files in the ./tmp directory.
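The resize step can be sketched as follows. This is a minimal illustration, not the project's ffmpeg.py: the function names `build_resize_command` and `resize_video` are hypothetical, and the sketch assumes that ffmpeg is on the PATH and that the `scale` video filter is an acceptable way to resize.

```python
import subprocess


def build_resize_command(src, dst, width=320, height=240):
    """Build an ffmpeg command that rescales src to width x height."""
    return [
        "ffmpeg",
        "-y",        # overwrite dst if it already exists
        "-i", src,   # input video
        "-vf", "scale={}:{}".format(width, height),  # resize filter
        dst,
    ]


def resize_video(src, dst, width=320, height=240):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.check_call(build_resize_command(src, dst, width, height))
```

Keeping the command construction separate from its execution makes it possible to inspect or log the command before running it.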
-
ffmpeg.py
- Python wrapper for basic ffmpeg operations; used to resize the videos.
-
gmm.py
-
Computes the GMMs.
-
Calls the computeIDTF module to generate the .feature video files that will be used to create the GMMs. These .feature files are stored in the ./GMM_IDTFs directory. Each .feature file is on the order of 200MB.
-
Computes a GMM for each of the 5 IDTF descriptor types (trajs, hogs, hofs, mbhxs, mbhys). Saves each GMM to the specified file using Python's TemporaryFile.
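The idea of fitting one GMM per descriptor type can be sketched as below. Note that the project uses the Yael library for this step; the sketch swaps in scikit-learn's `GaussianMixture` purely for illustration. The function name `fit_descriptor_gmms`, the diagonal covariance choice, and the input layout (a dict of arrays) are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

DESCRIPTOR_TYPES = ("trajs", "hogs", "hofs", "mbhxs", "mbhys")


def fit_descriptor_gmms(descriptors, n_modes=120, seed=0):
    """Fit one diagonal-covariance GMM per descriptor type.

    descriptors maps each descriptor name to an (n_samples, dim) array.
    """
    gmms = {}
    for name in DESCRIPTOR_TYPES:
        gmm = GaussianMixture(n_components=n_modes,
                              covariance_type="diag",
                              random_state=seed)
        gmms[name] = gmm.fit(descriptors[name])
    return gmms


# Small synthetic example: 200 random 8-dimensional samples per descriptor.
rng = np.random.RandomState(0)
toy = {name: rng.randn(200, 8) for name in DESCRIPTOR_TYPES}
toy_gmms = fit_descriptor_gmms(toy, n_modes=3)
```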
-
-
IDT_feature.py
-
Handles an improved trajectory feature point. The .feature files contain IDT features for each video. A single IDTF contains several pieces of information, as described at http://lear.inrialpes.fr/people/wang/improved_trajectories
-
Most importantly, each IDTF contains each of the following descriptors.
- trajectory
- hog
- hof
- mbhx
- mbhy
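Splitting one line of a .feature file into these descriptors might look like the sketch below. It is not the project's IDT_feature.py. It assumes the default output layout of the improved trajectories release: 10 leading trajectory-info values followed by the trajectory (30), HOG (96), HOF (108), MBHx (96) and MBHy (96) descriptors. The name `parse_idtf_line` is hypothetical, and the dimensions would need adjusting if the binary was built with non-default settings.

```python
from collections import OrderedDict

N_INFO = 10  # assumed number of leading trajectory-info fields
# Assumed default descriptor lengths, in the order they appear on a line.
DESCRIPTOR_DIMS = OrderedDict([
    ("trajectory", 30), ("hog", 96), ("hof", 108), ("mbhx", 96), ("mbhy", 96),
])


def parse_idtf_line(line):
    """Split one whitespace-separated feature line into named descriptors."""
    values = [float(tok) for tok in line.split()]
    expected = N_INFO + sum(DESCRIPTOR_DIMS.values())
    if len(values) != expected:
        raise ValueError(
            "expected {} values, got {}".format(expected, len(values)))
    feature = {"info": values[:N_INFO]}
    start = N_INFO
    for name, dim in DESCRIPTOR_DIMS.items():
        feature[name] = values[start:start + dim]
        start += dim
    return feature
```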
-
-
computeFVs.py
- Extracts IDTFs and computes the Fisher Vectors (FVs) for each of the videos in the input list (vid_in).
- The training and testing file splits are included in text files.
- The Fisher Vectors are output in the ./UCF101_Fishers directory.
-
ThreadPool.py
- Python module I used to parallelize the computeFVs script.
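Parallelizing per-video work can be sketched with the standard library as shown below. The project ships its own ThreadPool.py; this example uses `concurrent.futures.ThreadPoolExecutor` as a stand-in, and `process_videos` and the `worker` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_videos(videos, worker, max_workers=4):
    """Apply worker to every video concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, videos))
```

Threads are a reasonable fit under the assumption that each worker spends most of its time waiting on an external process, such as the IDTF binary, rather than running Python code.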
-
computeFVstream.py
- Receives a stdin stream of IDTFs generated by the IDTF binary.
- Converts the IDTFs from a video into a Fisher vector.
- Prereq: GMM must have already been computed.
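For intuition, the encoding of a set of descriptors against a precomputed diagonal GMM can be sketched in pure Python as below. The project delegates this to Yael, whose exact output may differ; the sketch follows a commonly used formulation with gradients with respect to the means and standard deviations, and omits any power or L2 normalization. The function name `fisher_vector` and the list-based inputs are assumptions.

```python
import math


def fisher_vector(points, weights, means, sigmas):
    """Fisher vector (mean and std-dev gradients) for a diagonal GMM.

    points: list of T descriptors, each a list of D floats.
    weights, means, sigmas: GMM parameters for K components
    (sigmas are standard deviations, not variances).
    Returns a list of length 2 * K * D.
    """
    T, K, D = len(points), len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(K)]
    g_sigma = [[0.0] * D for _ in range(K)]
    for x in points:
        # Log of (weight * Gaussian density) for each component.
        logp = []
        for k in range(K):
            s = math.log(weights[k])
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                s -= 0.5 * z * z + math.log(sigmas[k][d])
                s -= 0.5 * math.log(2 * math.pi)
            logp.append(s)
        # Posterior (soft assignment), computed stably via the max trick.
        m = max(logp)
        norm = sum(math.exp(v - m) for v in logp)
        for k in range(K):
            gamma = math.exp(logp[k] - m) / norm
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                g_mu[k][d] += gamma * z
                g_sigma[k][d] += gamma * (z * z - 1.0)
    fv = []
    for k in range(K):
        fv.extend(v / (T * math.sqrt(weights[k])) for v in g_mu[k])
    for k in range(K):
        fv.extend(v / (T * math.sqrt(2.0 * weights[k])) for v in g_sigma[k])
    return fv
```

With K modes and D-dimensional descriptors the result has 2 * K * D entries, which is why reducing D with PCA before encoding keeps the vectors manageable.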
-
computeFV.py
- Encodes a Fisher vector. Requires IDTFs as input.
-
compute_UC101_class_index.py
- Builds a dictionary of UCF101 class names to class index.
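A minimal sketch of building such a dictionary is shown below. It assumes the input uses one `<index> <ClassName>` pair per line, as in the class index file distributed with the UCF101 train/test splits; the function name `build_class_index` is hypothetical.

```python
def build_class_index(lines):
    """Map class name -> integer index from '<index> <ClassName>' lines."""
    class_index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, name = line.split()
        class_index[name] = int(index)
    return class_index
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `build_class_index(open(class_index_file))`.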
-
classify_library.py
- Custom library of methods useful when optimizing and working with the classifiers.
-
classify_experiment.py
- Script used to experiment with different settings for hyperparameters for SVM classifiers.
-
train_classifier.py
- Python script to optimize the choice of hyperparameters for SVM classifiers.
- Outputs the results in the GridSearch_ouput file.
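A hyperparameter search of this kind can be sketched with scikit-learn's `GridSearchCV`. This is an illustration on synthetic data rather than the project's script: the function name `tune_linear_svm`, the candidate values for `C`, and the use of `LinearSVC` are assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC


def tune_linear_svm(X, y, c_values=(0.01, 0.1, 1.0, 10.0), folds=3):
    """Cross-validated search over the SVM regularization strength C."""
    search = GridSearchCV(LinearSVC(), {"C": list(c_values)}, cv=folds)
    search.fit(X, y)
    return search


# Synthetic stand-in for Fisher vectors: two well-separated classes.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(30, 5) + 3, rng.randn(30, 5) - 3])
y = np.array([0] * 30 + [1] * 30)
search = tune_linear_svm(X, y)
print(search.best_params_, search.best_score_)
```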
-
baseline.py (not yet tested)
- Implements the baseline experiment.
- The baseline result involves extracting the first frame from each video and using this single image as the content to perform classification.
- The first frame for every video is resized to 240 x 320.
- PCA is performed on each image in order to reduce the dimensions.
- A multiclass linear SVM is used to classify the videos.
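The classification part of the baseline can be sketched as a scikit-learn pipeline. This is not the contents of baseline.py: it assumes the first frames have already been extracted, resized, and flattened into one row per video, that PCA is fitted across the training frames, and the name `make_baseline_classifier` and the component count are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def make_baseline_classifier(n_components=50):
    """PCA for dimensionality reduction followed by a multiclass linear SVM."""
    return make_pipeline(PCA(n_components=n_components), LinearSVC())


# Random stand-in for flattened frames (tiny 24 x 32 images, 4 classes).
rng = np.random.RandomState(0)
frames = rng.rand(40, 24 * 32)
labels = np.arange(40) % 4
clf = make_baseline_classifier(n_components=10).fit(frames, labels)
predictions = clf.predict(frames)
```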
-
improved_trajectory_release/ (dir)
- Extract IDT features from videos, written in C++.
- Before running, compile it with the make command.
-
Khurram Soomro, Amir Roshan Zamir and Mubarak Shah, UCF101: A Dataset of 101 Human Action Classes From Videos in The Wild, CRCV-TR-12-01, November 2012.
-
Many thanks to the open-source project on GitHub: https://github.com/anenbergb/CS221_Project