Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Updated Oct 26, 2024 - Python
kapture is a file format as well as a set of tools for manipulating datasets, and in particular Visual Localization and Structure from Motion data.
A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal".
Emotional Video to Audio Transformation with ANFIS-DeepRNN (Vanilla RNN and LSTM-DeepRNN) [MPE 2020]
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, and object detections in a few lines of Python code.
Recommends apparel based on text, visual features, and weighted brand and color similarity.
Stitching and fusion of on-board surround-view BEV real-world image sequences, with odometer estimation and output of a large pixel map.