A PyTorch implementation of ACRNet based on ICME 2023 paper "Weakly-supervised Temporal Action Localization with Adaptive Clustering and Refining Network"

leftthomas/ACRNet

ACRNet

A PyTorch implementation of ACRNet based on ICME 2023 paper Weakly-supervised Temporal Action Localization with Adaptive Clustering and Refining Network.

Network Architecture

Requirements

conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install openmim
mim install mmaction2 -f https://github.com/open-mmlab/mmaction2.git

Datasets

This repo uses the THUMOS 14 and ActivityNet datasets, which you should download from their official websites. The RGB and Flow features are extracted by dataset.py at 25 FPS. To run it, follow this link to install OpenCV4 with CUDA, then compile denseFlow_GPU and put the executable in this dir. The options are listed in dataset.py; note that feature extraction takes a long time. Finally, the I3D features are extracted with this repo, where the extract_features.py file should be replaced with extract.py; its options are listed in extract.py. To make the research easier to reproduce, we uploaded the I3D features to MEGA. You can download them from there. Make sure the data directory is organized as follows:

├── thumos14                                    |  ├── activitynet
  ├── features                                  |   ├── features
      ├── val                                   |       ├── training 
          ├── video_validation_0000051_flow.npy |           ├── v___c8enCfzqw_flow.npy
          ├── video_validation_0000051_rgb.npy  |           ├── v___c8enCfzqw_rgb.npy
          └── ...                               |           └── ...                           
      ├── test                                  |       ├── validation                 
          ├── video_test_0000004_flow.npy       |           ├── v__1vYKA7mNLI_flow.npy  
          ├── video_test_0000004_rgb.npy        |           ├── v__1vYKA7mNLI_rgb.npy 
          └── ...                               |           └── ...     
  ├── videos                                    |   ├── videos  
      ├── val                                   |       ├── training      
          ├── video_validation_0000051.mp4      |           ├── v___c8enCfzqw.mp4
          └──...                                |           └──...        
      ├── test                                  |       ├── validation           
          ├── video_test_0000004.mp4            |           ├── v__1vYKA7mNLI.mp4
          └──...                                |           └──...      
  annotations.json                              |    annotations_1.2.json, annotations_1.3.json
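
The layout above pairs every video with one `_rgb.npy` and one `_flow.npy` file in the same split directory. As a rough sanity check after downloading or extracting features, a sketch like the following can list videos whose pair is incomplete. The helper name `find_incomplete_pairs` is hypothetical and not part of this repo, and the path you pass in depends on where you keep the data.

```python
from pathlib import Path


def find_incomplete_pairs(features_dir):
    """Return 'split/video' names that lack either the _rgb.npy or the _flow.npy file.

    Hypothetical helper, not part of the repo: it only inspects file names
    and never opens the feature arrays.
    """
    seen = {}
    for path in Path(features_dir).rglob("*.npy"):
        stem = path.stem
        for suffix in ("_rgb", "_flow"):
            if stem.endswith(suffix):
                # Group files by (split directory, video name without suffix).
                key = (path.parent, stem[: -len(suffix)])
                seen.setdefault(key, set()).add(suffix)
    return sorted(
        f"{parent.name}/{name}"
        for (parent, name), kinds in seen.items()
        if kinds != {"_rgb", "_flow"}
    )
```

For example, `find_incomplete_pairs("thumos14/features")` would return an empty list when every video under `val` and `test` has both files, assuming the features directory sits at that relative path.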

Usage

You can train and test the model with the commands below. For other options, refer to utils.py.

Train Model

python main.py --data_name activitynet1.2 --num_segments 80 --seed 42

Test Model

python main.py --data_name thumos14 --model_file result/thumos14.pth

Benchmarks

The models are trained on one NVIDIA GeForce RTX 3090 GPU (24 GB). The seed is 42 for all datasets. For both ActivityNet 1.2 and 1.3, num_seg is 80, alpha is 0.8 and batch_size is 128. All other hyper-parameters keep their default values.

THUMOS14

| Method | mAP@0.1 | mAP@0.2 | mAP@0.3 | mAP@0.4 | mAP@0.5 | mAP@0.6 | mAP@0.7 | mAP@AVG | Download |
|--------|---------|---------|---------|---------|---------|---------|---------|---------|----------|
| ACRNet | 76.7    | 70.7    | 61.0    | 49.0    | 37.0    | 24.8    | 13.4    | 47.5    | MEGA     |

mAP@AVG is the average mAP under the thresholds [0.1:0.1:0.7].
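
As a quick check of how mAP@AVG relates to the per-threshold numbers, the sketch below averages the seven THUMOS14 values reported above. The function name `mean_map` is illustrative only. The threshold keys assume the columns run from 0.1 to 0.7 in steps of 0.1, as stated in the note above.

```python
def mean_map(per_threshold_map):
    """Average a collection of per-threshold mAP values."""
    values = list(per_threshold_map)
    return sum(values) / len(values)


# THUMOS14 results from the table, keyed by tIoU threshold (assumed 0.1 to 0.7).
thumos14_map = {
    0.1: 76.7,
    0.2: 70.7,
    0.3: 61.0,
    0.4: 49.0,
    0.5: 37.0,
    0.6: 24.8,
    0.7: 13.4,
}

print(round(mean_map(thumos14_map.values()), 1))  # prints 47.5
```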

ActivityNet

| Method | Dataset         | mAP@0.5 | mAP@0.75 | mAP@0.95 | mAP@AVG | Download |
|--------|-----------------|---------|----------|----------|---------|----------|
| ACRNet | ActivityNet 1.2 | 46.2    | 28.4     | 5.7      | 28.4    | MEGA     |
| ACRNet | ActivityNet 1.3 | 40.9    | 26.0     | 5.4      | 25.7    | MEGA     |

mAP@AVG is the average mAP under the thresholds [0.5:0.05:0.95].
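
The notation [0.5:0.05:0.95] denotes ten temporal IoU (tIoU) thresholds, so the ActivityNet mAP@AVG cannot be recomputed from the three columns shown. The sketch below expands such a range and computes the standard tIoU between two segments. Both function names are hypothetical, and this is the textbook definition rather than the repo's own evaluation code, which is not shown here.

```python
def tiou_thresholds(start, step, stop):
    """Expand a [start:step:stop] range into an inclusive list of thresholds."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 2) for i in range(count)]


def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments given in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


thresholds = tiou_thresholds(0.5, 0.05, 0.95)
print(len(thresholds))  # prints 10

# A prediction counts as a match at a threshold when its tIoU reaches it.
score = temporal_iou((0.0, 10.0), (5.0, 15.0))  # overlap 5 s, union 15 s
print([t for t in thresholds if score >= t])  # prints []
```

With an overlap of 5 seconds over a union of 15 seconds the tIoU is about 0.33, which falls below every ActivityNet threshold but would pass the 0.1 to 0.3 thresholds used for THUMOS14.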

Results

[Figure: vis]
