Self-supervised Vision



Summary: A lecture covering the basics of self-supervised learning in computer vision.
Topics: Self-supervised learning
Slides: link (pdf)


References/links
  • L. B. Smith and M. Gasser, “The Development of Embodied Cognition: Six Lessons from Babies”, Artificial Life (2005)
  • L. B. Smith et al., “The Developing Infant Creates a Curriculum for Statistical Learning”, Trends in Cognitive Sciences (2018)
  • A. M. Turing, “Intelligent Machinery” (1948)
  • H. L. F. Helmholtz, “The Facts in Perception” (1878)
  • H. B. Barlow, “Unsupervised Learning”, Neural Computation (1989)
  • V. R. de Sa, “Learning Classification with Unlabeled Data”, NeurIPS (1993)
  • J. Schmidhuber and S. Heil, “Sequential Neural Text Compression”, IEEE Trans. on Neural Networks (1996)
  • Y. Bengio et al., “A Neural Probabilistic Language Model”, NeurIPS (2000)
  • T. Mikolov et al., “Efficient Estimation of Word Representations in Vector Space”, ICLR (2013)
  • J. Devlin et al., “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, NAACL (2019)
  • C. Doersch et al., “Unsupervised Visual Representation Learning by Context Prediction”, ICCV (2015)
  • D. Pathak et al., “Context Encoders: Feature Learning by Inpainting”, CVPR (2016)
  • M. Noroozi and P. Favaro, “Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles”, ECCV (2016)
  • R. Zhang et al., “Colorful Image Colorization”, ECCV (2016)
  • M. Mathieu et al., “Deep Multi-Scale Video Prediction Beyond Mean Square Error”, arXiv preprint arXiv:1511.05440 (2015)
  • M. Noroozi et al., “Representation Learning by Learning to Count”, ICCV (2017)
  • A. Mahendran et al., “Cross Pixel Optical Flow Similarity for Self-Supervised Learning”, ACCV (2018)
  • S. Gidaris et al., “Unsupervised Representation Learning by Predicting Image Rotations”, ICLR (2018)
  • M. Caron et al., “Deep Clustering for Unsupervised Learning of Visual Features”, ECCV (2018)
  • T. Chen et al., “A Simple Framework for Contrastive Learning of Visual Representations”, ICML (2020)
  • K. He et al., “Masked Autoencoders Are Scalable Vision Learners”, CVPR (2022)