On the Opportunities and Risks of Foundation Models (introduction)


Summary: A video describing the ideas covered in the introductory section of "On the Opportunities and Risks of Foundation Models" by R. Bommasani et al., published on arXiv in 2021.
Paper: The paper can be found on arXiv here.
Topics: foundation models, emergence, homogenisation
Slides: link (pdf)

References
  • R. K. Merton, "The Normative Structure of Science" (1942)
  • A. M. Turing, "Intelligent Machinery" (1948)
  • A. L. Samuel, "Some Studies in Machine Learning Using the Game of Checkers", IBM Journal of R&D (1959)
  • P. Anderson, "More is different: broken symmetry and the nature of the hierarchical structure of science", Science (1972)
  • S. Bozinovski et al., "The influence of pattern similarity and transfer of learning upon training of a base perceptron B2" (original in Croatian, 1976)
  • V. R. de Sa, "Learning Classification with Unlabeled Data", NeurIPS (1993)
  • D. Lowe, "Object recognition from local scale-invariant features" ICCV (1999)
  • C. Kerr, "The uses of the university", Harvard University Press (2001)
  • L. Smith and M. Gasser, "The development of embodied cognition: Six lessons from babies", Artificial life (2005)
  • A. Beberg et al., "Folding@home: Lessons from eight years of volunteer distributed computing" (2009)
  • J. Turian, "Word representations: a simple and general method for semi-supervised learning", ACL (2010)
  • A. Krizhevsky et al., "ImageNet classification with deep convolutional neural networks", NeurIPS (2012)
  • T. Mikolov et al., "Efficient Estimation of Word Representations in Vector Space", ICLR (2013)
  • J. Pennington et al., "GloVe: Global vectors for word representation", EMNLP (2014)
  • J. Schmidhuber, "Deep learning in neural networks: An overview", Neural networks (2015)
  • Y. LeCun et al., "Deep learning", Nature (2015)
  • A. Dai et al., "Semi-supervised sequence learning", NeurIPS (2015)
  • O. Russakovsky et al., "ImageNet large scale visual recognition challenge", IJCV (2015)
  • K. Radinsky, "Data monopolists like Google are threatening the economy", Harvard Business Review (2015)
  • D. Silver et al., "Mastering the game of Go with deep neural networks and tree search", Nature (2016)
  • M. Abadi et al., "TensorFlow: A system for large-scale machine learning", OSDI (2016)
  • A. Vaswani et al., "Attention is all you need", NeurIPS (2017)
  • A. Radford et al., "Improving language understanding by generative pre-training" (2018)
  • J. Howard et al., "Universal Language Model Fine-tuning for Text Classification", ACL (2018)
  • M. Peters et al., "Deep Contextualized Word Representations", NAACL (2018)
  • A. Paszke et al., "PyTorch: An imperative style, high-performance deep learning library", NeurIPS (2019)
  • A. Radford et al., "Language Models are Unsupervised Multitask Learners" (2019)
  • J. Devlin et al., "BERT: Pre-training of deep bidirectional transformers for language understanding", NAACL-HLT (2019)
  • Y. Liu et al., "RoBERTa: A Robustly Optimized BERT Pretraining Approach", arXiv (2019)
  • C. Raffel et al., "Exploring the limits of transfer learning with a unified text-to-text transformer", JMLR (2020)
  • M. Mitchell et al., "Model cards for model reporting", FAccT (2019)
  • T. Brown et al., "Language models are few-shot learners", NeurIPS (2020)
  • J. Kaplan et al., "Scaling laws for neural language models", arXiv (2020)
  • P. Yin et al., "TaBERT: Pretraining for joint understanding of textual and tabular data", arXiv (2020)
  • A. T. Liu et al., "Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders", ICASSP (2020)
  • M. Lewis et al., "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension", ACL (2020)
  • T. Chen et al., "A simple framework for contrastive learning of visual representations", ICML (2020)
  • M. Ryabinin et al., "Towards crowdsourced training of large neural networks using decentralized mixture-of-experts", NeurIPS (2020)
  • D. Rothchild et al., "C5T5: Controllable generation of organic molecules with transformers", arXiv (2021)
  • A. Rives et al., "Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences", PNAS (2021)
  • A. Radford et al., "Learning transferable visual models from natural language supervision", ICML (2021)
  • L. Chen et al., "Decision transformer: Reinforcement learning via sequence modeling", NeurIPS (2021)
  • A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale", ICLR (2021)
  • A. Ramesh et al., "Zero-shot text-to-image generation", ICML (2021)
  • R. Bommasani et al., "On the opportunities and risks of foundation models", arXiv (2021)
  • M. Chen et al., "Evaluating large language models trained on code", arXiv (2021)
  • R. Reich et al., "System error: Where big tech went wrong and how we can reboot", Hodder & Stoughton (2021)
  • C. Ré, "The Road to Software 2.0 or Data-Centric AI", https://hazyresearch.stanford.edu/data-centric-ai (2021)
  • S. L. Blodgett and M. Madaio, "Risks of AI foundation models in education", arXiv (2021)
  • J. Malik, https://crfm.stanford.edu/commentary/2021/10/18/malik.html (2021)
  • G. Marcus and E. Davis, https://crfm.stanford.edu/commentary/2021/10/18/marcus-davis.html (2021)
  • E. M. Bender et al., "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?", FAccT (2021)
  • D. Hendrycks et al., "Unsolved Problems in ML Safety", arXiv (2021)
  • G. Sastry, https://crfm.stanford.edu/commentary/2021/10/18/sastry.html (2021)
  • J. Steinhardt, https://crfm.stanford.edu/commentary/2021/10/18/steinhardt.html (2021)
  • J. Steinhardt, https://bounded-regret.ghost.io/ai-forecasting/ (2021)