Archive
- 30 / 04 [Paper Review] Geodesic Multi-Modal Mixup for Robust Fine-Tuning
- 01 / 04 [Paper Review] What to Align in Multimodal Contrastive Learning
- 01 / 04 [Paper Review] Train Once, Deploy Anywhere: Matryoshka Representation Learning for Multimodal Recommendation
- 25 / 03 [Paper Review] Matryoshka Representation Learning
- 24 / 03 [Paper Review] DrVideo: Document Retrieval Based Long Video Understanding
- 14 / 03 [Paper Review] Discovering Clone Negatives via Adaptive Contrastive Learning for Image-Text Matching
- 09 / 03 [Paper Review] Effective Post-Training Embedding Compression via Temperature Control in Contrastive Training
- 06 / 03 [Paper Review] VkD: Improving Knowledge Distillation using Orthogonal Projections
- 03 / 03 [Paper Review] Weighted Point Cloud Embedding for Multi-modal Contrastive Learning Toward Optimal Similarity Metric
- 25 / 02 [Paper Review] Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
- 17 / 02 [Paper Review] TULIP: Token-length Upgraded CLIP
- 10 / 02 [Paper Review] DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
- 09 / 02 [Paper Review] Expertized Caption Auto-Enhancement for Video-Text Retrieval
- 04 / 02 [Paper Review] Unified Lexical Representation for Interpretable Visual-Language Alignment
- 20 / 01 [Paper Review] CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
- 17 / 01 [Paper Review] Decoupled Knowledge Distillation
- 13 / 01 [Paper Review] Are Diffusion Models Vision-And-Language Reasoners?