kaiqiangh/IDT_Fisher_Vector

Improved Dense Trajectory with Fisher Vector

The project implements an activity recognition system trained on the UCF101 dataset (30 classes involved).

Key points

  1. A full pipeline from Dense Trajectory raw features to Fisher Vectors
  2. Raw features are piped to the Fisher Vector generator, eliminating the need to store the large raw features
  3. Built on IDT with Fisher Vector encoding, which achieved state-of-the-art results on large action/event video datasets when it was introduced
  4. Rudimentary baseline experiment implemented in baseline.py

Tunable parameters:

  1. Number of modes used to construct Gaussian mixture models (120)
  2. Option to reduce dimensions of the 5 IDTF descriptors using PCA (500)
  3. Choice of classification model
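To give a feel for how the number of GMM modes affects the size of the final representation, here is a rough sketch of the arithmetic. It assumes the default descriptor sizes of the IDT release (30, 96, 108, 96, 96), no PCA, and a Fisher Vector that stores both mean and variance gradients (2 x K x D values per descriptor). The function name and the dictionary are illustrative and are not part of this repository.

```python
# Default descriptor sizes of the IDT release (an assumption; they change
# if PCA is applied or the binary is run with non-default settings).
DESCRIPTOR_DIMS = {"traj": 30, "hog": 96, "hof": 108, "mbhx": 96, "mbhy": 96}


def fisher_vector_length(n_modes, dims=DESCRIPTOR_DIMS):
    """Length of the concatenated Fisher Vector over all descriptors.

    Assumes gradients w.r.t. means and variances: 2 * K * D per descriptor.
    """
    return sum(2 * n_modes * d for d in dims.values())


print(fisher_vector_length(120))  # 102240
```

With 120 modes the concatenated vector has roughly one hundred thousand dimensions under these assumptions, which is one reason the PCA option can be worth enabling.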

Requirements

  1. Improved Dense Trajectories, built against OpenCV 2.4.13.
    OpenCV 2.4.13 must be compiled from source.
    How to install OpenCV 2.4.13

  2. Yael library (v438): to compute the Fisher Vectors and GMMs.
    Yael Intro
    How to install Yael

  3. Scikit-learn (0.20.0): for classification models.
    Sklearn Intro

  4. ffmpeg (version 2.8.15) must be installed.
    sudo apt-get install ffmpeg

  5. UCF101 dataset
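Before running anything, it can help to confirm that the command-line tools are reachable. The following is a minimal sketch using only the Python standard library; the helper name `tool_available` is hypothetical and not part of this repository.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


# ffmpeg is needed for resizing; make is needed to build the IDT binary.
for tool in ("ffmpeg", "make"):
    print(tool, "found" if tool_available(tool) else "missing")
```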

Descriptions

pipeline.sh lists the commands required to run the pipeline. Before running it, update the following file paths in the code:

  1. classify_experiment.py: training_output, testing_output
  2. classify_library.py: class_index_file, training_output, testing_output
  3. computeFVstream.py: sys.path.append() for yael lib in line 4
  4. computeIDTF.py: ucf101_path, dtBin, tmpDir
  5. gmm.py: GMM_dir, sys.path.append() for yael lib in line 9
  6. pipeline.sh: all paths should be changed.
  7. train_classifier.py: class_index_file, training_output, testing_output
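Because the paths are spread over several files, a quick sanity check can catch a forgotten one early. The sketch below is only an illustration: the helper `missing_paths` and every path value shown are placeholders, not the values used in this repository.

```python
import os


def missing_paths(paths):
    """Return the names of configured paths that do not exist on disk."""
    return sorted(name for name, path in paths.items()
                  if not os.path.exists(path))


# Placeholder values; replace them with the paths set in the files above.
config = {
    "ucf101_path": "/data/UCF101",
    "tmpDir": "./tmp",
    "GMM_dir": "./GMM_IDTFs",
}

for name in missing_paths(config):
    print("Path not found for", name, "->", config[name])
```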

Description of files:

  1. computeIDTF.py:

    1. Resizes the videos to 320x240 using ffmpeg.py.
    2. Executes the IDTF binary, whose path is hard-coded in the file.
    3. The IDTF binary can be downloaded.
    4. Saves temporary resized files in the ./tmp directory.
  2. ffmpeg.py

    1. Python wrapper for basic ffmpeg operations; used here to resize the videos.
  3. gmm.py

    1. Computes the GMMs.

    2. Calls the computeIDTF module to generate the .feature video files that will be used to create the GMMs. These .feature files are stored in the ./GMM_IDTFs directory. Each .feature file is on the order of 200MB.

    3. Computes a GMM for each of the 5 IDTF descriptor types (trajs, hogs, hofs, mbhxs, mbhys). Saves each of the GMMs in the specified file using python temporaryFile.

  4. IDT_feature.py

    1. Handles an improved trajectory feature point. The .feature files contain the IDT features for each video. A single IDTF contains several pieces of information, as described in http://lear.inrialpes.fr/people/wang/improved_trajectories

    2. Most importantly, each IDTF contains all of the following descriptors:

      1. trajectory
      2. hog
      3. hof
      4. mbhx
      5. mbhy
  5. computeFVs.py

    1. Extracts IDTFs and computes the Fisher Vectors (FVs) for each of the videos in the input list (vid_in).
    2. The training and testing file splits are included in text files.
    3. The Fisher Vectors are output in the ./UCF101_Fishers directory.
  6. ThreadPool.py

    1. Python module I used to parallelize the computeFVs script.
  7. computeFVstream.py

    1. Receives a stdin stream of IDTFs generated by the IDTF binary.
    2. Converts the IDTFs from a video into a Fisher Vector.
    3. Prerequisite: the GMMs must already have been computed.
  8. computeFV.py

    1. Encodes a Fisher Vector. Requires IDTFs as input.
  9. compute_UC101_class_index.py

    1. Builds a dictionary mapping UCF101 class names to class indices.
  10. classify_library.py

    1. Custom library of methods useful when optimizing and working with the classifiers.
  11. classify_experiment.py

    1. Script used to experiment with different settings for hyperparameters for SVM classifiers.
  12. train_classify.py

    1. Python script to optimize the choice of hyperparameters for SVM classifiers.
    2. Outputs the results in the GridSearch_ouput file.
  13. baseline.py (not yet tested)

    1. Implements the baseline experiment.
    2. The baseline result involves extracting the first frame from each video and using this single image as the content to perform classification.
    3. The first frame for every video is resized to 240 x 320.
    4. PCA is performed on each image in order to reduce the dimensions.
    5. A multiclass linear SVM is used to classify the videos.
  14. improved_trajectory_release/ (dir)

    1. Extract IDT features from videos, written in C++.
    2. Before running, it must be compiled with the make command.
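The streaming step described for computeFVstream (IDTF lines in, Fisher Vector out) can be sketched roughly as follows. This is a generic NumPy illustration and not the code in this repository, which relies on Yael. It assumes the default IDT line layout (10 trajectory-info values followed by 30, 96, 108, 96 and 96 descriptor values), a diagonal-covariance GMM that has already been fitted, and the common "improved" Fisher Vector with mean and variance gradients plus power and L2 normalization. All function names are hypothetical.

```python
import io
import numpy as np

# Assumed default layout of one IDTF line: 10 info values, then 5 descriptors.
INFO_LEN = 10
DESC_DIMS = [("traj", 30), ("hog", 96), ("hof", 108), ("mbhx", 96), ("mbhy", 96)]


def parse_idtf_stream(stream):
    """Parse whitespace-separated IDTF lines into one (T, D) array per descriptor."""
    rows = [[float(v) for v in line.split()] for line in stream if line.strip()]
    data = np.array(rows, dtype=np.float64)
    expected = INFO_LEN + sum(d for _, d in DESC_DIMS)
    if data.ndim != 2 or data.shape[1] != expected:
        raise ValueError("unexpected IDTF line length")
    out, start = {}, INFO_LEN
    for name, dim in DESC_DIMS:
        out[name] = data[:, start:start + dim]
        start += dim
    return out


def fisher_vector(x, weights, means, variances):
    """Fisher Vector of descriptors x (T, D) under a diagonal GMM with K modes."""
    T = x.shape[0]
    diff = x[:, None, :] - means[None, :, :]                     # (T, K, D)
    # Log of w_k * N(x_t | mu_k, diag(var_k)) for every t and k.
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
             - 0.5 * np.sum(diff ** 2 / variances[None], axis=2))
    log_p -= log_p.max(axis=1, keepdims=True)                    # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)                    # posteriors (T, K)
    sigma = np.sqrt(variances)
    g_mu = np.einsum("tk,tkd->kd", gamma, diff / sigma[None])
    g_mu /= T * np.sqrt(weights)[:, None]
    g_sig = np.einsum("tk,tkd->kd", gamma, diff ** 2 / variances[None] - 1.0)
    g_sig /= T * np.sqrt(2 * weights)[:, None]
    fv = np.concatenate([g_mu.ravel(), g_sig.ravel()])           # length 2 * K * D
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                       # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv                         # L2 normalization
```

In a real run, `stream` would be `sys.stdin` fed by the IDTF binary, and `fisher_vector` would be called once per descriptor type with that descriptor's GMM parameters; the five results would then be concatenated.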

References

  1. Khurram Soomro, Amir Roshan Zamir, and Mubarak Shah. UCF101: A Dataset of 101 Human Action Classes From Videos in The Wild. CRCV-TR-12-01, November 2012.

  2. Many thanks to the open-source project on GitHub: https://github.com/anenbergb/CS221_Project
