Bukva: Russian Sign Language Alphabet Dataset

We introduce Bukva, a video dataset for the Russian dactyl (fingerspelling) recognition task. The dataset is about 27 GB and contains 3757 RGB videos, with more than 101 samples for each sign of the Russian Sign Language (RSL) alphabet, including dynamic signs. It is split into a training set and a test set by subject (user_id): the training set contains 3097 videos and the test set contains 660 videos. The total recording time is about 4 hours. About 17% of the videos are in HD and 70% are in FullHD resolution.

Downloads

| Downloads | Size (GB) | Comment |
|---|---|---|
| dataset | ~27 | Original HD+, Trimmed HD+, annotations |

The annotation file, annotations.tsv, contains one row per video. Its first rows look like this:

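|   | attachment_id | user_id | text | begin | end | height | width | train | length |
|---|---|---|---|---|---|---|---|---|---|
| 0 | df5b08f0-... | 18... | А | 36 | 76 | 1920 | 1080 | False | 150 |
| 1 | 3d2b6a08-... | 9a... | А | 31 | 63 | 1920 | 1080 | True | 78 |
| 2 | 1915f996-... | ca... | А | 25 | 81 | 1920 | 1080 | True | 98 |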

where:

  • attachment_id - video file name
  • user_id - unique anonymized user ID
  • text - gesture class, written in Russian
  • begin - start of the gesture (in the original videos)
  • end - end of the gesture (in the original videos)
  • height - video height
  • width - video width
  • train - train or test boolean flag
  • length - video length
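As a rough illustration of working with these columns, the sketch below parses the annotations with Python's standard csv module and splits the rows by the train flag. It assumes the column names shown above and that the flag is stored as the literal text True or False. The function names and the sample rows are hypothetical and are not taken from the repository.

```python
import csv
import io


def read_annotations(stream):
    """Parse tab-separated annotation rows into dicts with typed fields."""
    rows = []
    for row in csv.DictReader(stream, delimiter="\t"):
        rows.append({
            "attachment_id": row["attachment_id"],
            "user_id": row["user_id"],
            "text": row["text"],
            "begin": int(row["begin"]),
            "end": int(row["end"]),
            "height": int(row["height"]),
            "width": int(row["width"]),
            # Assumption: the flag is stored as the text "True" or "False".
            "train": row["train"].strip().lower() == "true",
            "length": int(row["length"]),
        })
    return rows


def split_train_test(rows):
    """Return (train_rows, test_rows) according to the boolean train flag."""
    train = [r for r in rows if r["train"]]
    test = [r for r in rows if not r["train"]]
    return train, test


def shared_users(train, test):
    """Return user IDs that appear in both splits (expected to be empty)."""
    return {r["user_id"] for r in train} & {r["user_id"] for r in test}


# Made-up rows in the same layout, used only to exercise the functions.
SAMPLE = (
    "attachment_id\tuser_id\ttext\tbegin\tend\theight\twidth\ttrain\tlength\n"
    "video-a\tuser-1\tА\t36\t76\t1920\t1080\tFalse\t150\n"
    "video-b\tuser-2\tА\t31\t63\t1920\t1080\tTrue\t78\n"
    "video-c\tuser-3\tБ\t25\t81\t1920\t1080\tTrue\t98\n"
)

rows = read_annotations(io.StringIO(SAMPLE))
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows), shared_users(train_rows, test_rows))
```

To read the real file, pass an open handle instead of the sample, for example `open("annotations.tsv", encoding="utf-8", newline="")`. If the file carries an unnamed leading index column, `csv.DictReader` stores it under an empty key and the lookups by column name are unaffected.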

After downloading, you can unzip the archive by running the following command:

unzip <PATH_TO_ARCHIVE> -d <PATH_TO_SAVE>

The structure of the dataset is as follows:

├── original
│   ├── 0a1b79d6-...
│   ├── 0a53c65e-...
│   ├── ...
├── trimmed
│   ├── 0a1b79d6-...
│   ├── 0a53c65e-...
│   ├── ...
├── annotations.tsv
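After unpacking, it can be useful to confirm that every annotated video is present in both folders. The sketch below is one way to do that. It assumes each entry under original and trimmed is a single file whose name without the extension equals attachment_id; the listing above elides the full names, so that is an assumption, and the helper names are hypothetical.

```python
from pathlib import Path


def index_videos(root, subset):
    """Map file stem -> path for every file directly inside root/subset."""
    folder = Path(root) / subset
    return {p.stem: p for p in folder.iterdir() if p.is_file()}


def missing_videos(root, attachment_ids, subsets=("original", "trimmed")):
    """Return, per subset, the attachment IDs that have no matching file."""
    ids = list(attachment_ids)
    missing = {}
    for subset in subsets:
        index = index_videos(root, subset)
        missing[subset] = [a for a in ids if a not in index]
    return missing
```

Calling `missing_videos("<PATH_TO_SAVE>", ids)` with the IDs read from annotations.tsv returns a dictionary whose lists are empty when nothing is missing.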

Models

We provide pre-trained models as baselines for Russian dactyl recognition.

| Model Name | Model Size (MB) | Metric | ONNX |
|---|---|---|---|
| MobileNetV2_TSM | 9.1 | 83.6 | weights |

Training

To train the models from scratch, follow the steps below.

  1. Download the dataset using the link in the Downloads section.
  2. Convert the annotations to txt format using constants.py. Each line of the resulting file has the form:
    <path_to_video> <class_id>
    <path_to_video> <class_id>
    ...
    
  3. Prepare the environment for the mmaction2 framework, which is used to train the models.
  4. Add the paths to your train and test txt files to the training_pipeline_tsm.py config.
  5. Choose a model config from the configs folder and start training.
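The conversion in step 2 can be sketched as follows. This is only an illustration of the `<path_to_video> <class_id>` line format, not the repository's own conversion. In particular, the class IDs here are assigned by sorting the distinct labels, whereas the real mapping is defined in constants.py, and the `.mp4` extension and helper names are assumptions.

```python
def build_class_ids(labels):
    """Assign an integer ID to each distinct label in sorted order.

    Assumption: the project's real label-to-ID mapping lives in constants.py
    and may differ from this ordering.
    """
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def to_list_lines(rows, class_ids, video_dir, extension=".mp4"):
    """Format rows as '<path_to_video> <class_id>' lines.

    Assumption: videos are stored as <video_dir>/<attachment_id><extension>.
    """
    return [
        f"{video_dir}/{r['attachment_id']}{extension} {class_ids[r['text']]}"
        for r in rows
    ]


def write_list(path, lines):
    """Write one entry per line to a text file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")


# Made-up rows carrying only the fields this sketch needs.
SAMPLE_ROWS = [
    {"attachment_id": "video-a", "text": "Б", "train": True},
    {"attachment_id": "video-b", "text": "А", "train": True},
    {"attachment_id": "video-c", "text": "А", "train": False},
]

class_ids = build_class_ids(r["text"] for r in SAMPLE_ROWS)
train_lines = to_list_lines([r for r in SAMPLE_ROWS if r["train"]], class_ids, "trimmed")
test_lines = to_list_lines([r for r in SAMPLE_ROWS if not r["train"]], class_ids, "trimmed")
```

Passing `train_lines` and `test_lines` to `write_list` produces the two txt files referenced in step 4.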

Demo

usage: demo.py [-h] -p CONFIG [--mp] [-v] [-l LENGTH]

optional arguments:
  -h, --help            show this help message and exit
  -p CONFIG, --config CONFIG
                        Path to config
  --mp                  Enable multiprocessing
  -v, --verbose         Enable logging
  -l LENGTH, --length LENGTH
                        Deque length for predictions


python demo.py -p <PATH_TO_CONFIG>
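The `-l` option sets the length of a deque of predictions. One common use of such a buffer is to smooth per-frame predictions by majority vote over the most recent ones. The sketch below shows that idea only; it is an assumption about the purpose of the option, not the logic of demo.py, and the class name is hypothetical.

```python
from collections import Counter, deque


class PredictionSmoother:
    """Keep the last `length` predictions and report the most frequent one."""

    def __init__(self, length):
        # A deque with maxlen drops the oldest item when a new one is appended.
        self.window = deque(maxlen=length)

    def update(self, label):
        """Add a new prediction and return the current majority label."""
        self.window.append(label)
        return Counter(self.window).most_common(1)[0][0]


smoother = PredictionSmoother(length=3)
smoothed = [smoother.update(label) for label in ["А", "А", "Б", "Б", "В"]]
```

A longer window gives steadier output at the cost of reacting more slowly when the signer moves to the next letter.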

Dataset example

[Image: dataset example]

License

This work is licensed under a variant of the Creative Commons Attribution-ShareAlike 4.0 International License.

Please see the specific license.

Authors and Credits

Links

Citation

You can cite the paper using the following BibTeX entry:

@misc{kvanchiani2024bukvarussiansignlanguage,
  title={Bukva: Russian Sign Language Alphabet},
  author={Karina Kvanchiani and Petr Surovtsev and Alexander Nagaev and Elizaveta Petrova and Alexander Kapitanov},
  year={2024},
  eprint={2410.08675},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2410.08675}
}