Representation Learning 11
- [Paper Review] Geodesic Multi-Modal Mixup for Robust Fine-Tuning
- [Paper Review] Matryoshka Representation Learning
- [Paper Review] Discovering Clone Negatives via Adaptive Contrastive Learning for Image-Text Matching
- [Paper Review] Effective post-training embedding compression via temperature control in contrastive training
- [Paper Review] Weighted Point Cloud Embedding for Multi-modal Contrastive Learning Toward Optimal Similarity Metric
- [Paper Review] TULIP: Token-length Upgraded CLIP
- [Paper Review] Expertized Caption Auto-Enhancement for Video-Text Retrieval
- [Paper Review] Unified Lexical Representation for Interpretable Visual-Language Alignment
- [Paper Review] Are Diffusion Models Vision-And-Language Reasoners?
- [Paper Review] Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval
- [Paper Review] ATM: Action Temporality Modeling for Video Question Answering