[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Updated Nov 17, 2024 - Python
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
Video embeddings for retrieval with natural language queries
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
Authors' official PyTorch implementation of "ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning" [ICCV 2019]
Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 Paper)
[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
A PyTorch implementation of VIOLET
[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Authors' official TensorFlow implementation of "Near-Duplicate Video Retrieval with Deep Metric Learning" [ICCVW 2017]
[CVPR 2023 Highlight] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval
Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)
Authors' official PyTorch implementation of "DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval" [IJCV 2022]
TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI 2023 Oral]
[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment