Machine Learning
-
[Paper Review] ALBEF - Align before Fuse: Vision and Language Representation Learning with Momentum Distillation (NeurIPS 2021, Spotlight) Machine Learning/Multimodal Learning 2022. 4. 20. 21:31
A NeurIPS 2021 spotlight paper that proposes a new framework in the Vision-Language Pre-training (VLP) domain, adding a pre-alignment stage in front of the multimodal encoder. At the time it achieved SOTA on a range of VL tasks (IRTR, VQA, NLVR2, etc.), and many papers building on ALBEF were subsequently submitted to CVPR 2022 and ICML 2022. [ Paper / Code ] This review assumes familiarity with BERT, ViT, CLIP, knowledge distillation (KD), and the VLP domain. 1) Abstract & Introduction Recently, various vision-langua..
-
[CS231n] 1. Image Classification Machine Learning/CS231n 2021. 4. 28. 00:08
Keywords: Data-driven Approach, K-Nearest Neighbor, train/validation/test splits, L1/L2 distances, hyperparameter search, cross-validation. 1. Image Classification: the task of assigning an input image one label from a fixed set of categories; one of the core problems in Computer Vision. 1) Example: a model takes a single image and assigns probabilities to 4 labels, {cat, dog, hat, mug}. The cat image is 248 pixels..