Text-Video 5
- [Paper Review] Geodesic Multi-Modal Mixup for Robust Fine-Tuning
- [Paper Review] DrVideo: Document Retrieval Based Long Video Understanding
- [Paper Review] Expertized Caption Auto-Enhancement for Video-Text Retrieval
- [Paper Review] Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval
- [Paper Review] ATM: Action Temporality Modeling for Video Question Answering