Language Models are Few-Shot Learners (GPT-3)



Summary: A video description of the paper "Language Models are Few-Shot Learners" by T. Brown et al., published at NeurIPS in 2020.
Paper: The paper can be found on arXiv here.
Topics: language models, foundation models, GPT-3, scaling
Slides: link (pdf)

References
  • S. Carey et al., "Acquiring a single new word", ERIC (1978)
  • M. Marcus et al., "The Penn treebank: Annotating predicate argument structure", HLT Workshop (1994)
  • S. Hochreiter et al., "Learning to learn using gradient descent", ICANN (2001)
  • E. Loper et al., "NLTK: The natural language toolkit", arxiv (2002)
  • P. Turney et al., "Combining independent modules to solve multiple-choice synonym and analogy problems", arxiv (2003)
  • P. Turney et al., "Corpus-based learning of analogies and semantic relations", Machine Learning (2005)
  • P. Norvig, "Natural language corpus data", Beautiful data (2009)
  • S. Baccianella et al., "Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining", LREC (2010)
  • M. Roemmele et al., "Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning", AAAI symposium (2011)
  • R. Ross, "Guide for Conducting Risk Assessments", Special Publication NIST SP (2012)
  • H. Levesque et al., "The Winograd schema challenge", KR (2012)
  • T. Mikolov et al., "Efficient estimation of word representations in vector space", arxiv (2013)
  • J. Berant et al., "Semantic parsing on freebase from question-answer pairs", EMNLP (2013)
  • J. Pennington et al., "Glove: Global vectors for word representation", EMNLP (2014)
  • N. Durrani et al., "Edinburgh’s phrase-based machine translation systems for WMT-14", WMT (2014)
  • A. Dai et al., "Semi-supervised sequence learning", NeurIPS (2015)
  • D. P. Kingma et al., "Adam: A method for stochastic optimization", ICLR (2015)
  • R. Sennrich et al., "Improving neural machine translation models with monolingual data", arxiv (2015)
  • G. Hinton et al., "Distilling the knowledge in a neural network", arxiv (2015)
  • O. Vinyals et al., "Matching networks for one shot learning", NeurIPS (2016)
  • K. He et al., "Identity mappings in deep residual networks", ECCV (2016)
  • D. Paperno et al., "The LAMBADA dataset: Word prediction requiring a broad discourse context", ACL (2016)
  • J. Ba, "Layer Normalization", arxiv (2016)
  • N. Mostafazadeh et al., "A corpus and cloze evaluation for deeper understanding of commonsense stories", NAACL HLT (2016)
  • A. Vaswani et al., "Attention is all you need", NeurIPS (2017)
  • G. Lai et al., "RACE: Large-scale ReAding Comprehension Dataset From Examinations", EMNLP (2017)
  • M. Joshi et al., "TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension", ACL (2017)
  • I. Loshchilov et al., "Decoupled weight decay regularization", arxiv (2017)
  • K. Crawford, "The trouble with bias", NeurIPS (2017)
  • D. Amodei et al., "AI and Compute", https://openai.com/blog/ai-and-compute/ (2018)
  • S. Gururangan et al., "Annotation artifacts in natural language inference data", arxiv (2018)
  • A. Radford et al. "Improving language understanding by generative pre-training" (2018)
  • E. Choi et al., "QuAC: Question Answering in Context", EMNLP (2018)
  • S. McCandlish et al., "An empirical model of large-batch training", arxiv (2018)
  • P. Clark et al., "Think you have solved question answering? Try ARC, the AI2 Reasoning Challenge", arxiv (2018)
  • T. Mihaylov et al., "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering", EMNLP (2018)
  • S. Edunov et al., "Understanding Back-Translation at Scale", EMNLP (2018)
  • T. Trinh et al., "A simple method for commonsense reasoning", arxiv (2018)
  • P. Rajpurkar et al., "Know What You Don’t Know: Unanswerable Questions for SQuAD", ACL (2018)
  • S. Zhang et al., "ReCoRD: Bridging the gap between human and machine commonsense reading comprehension", arxiv (2018)
  • A. Wang et al., "GLUE: A multi-task benchmark and analysis platform for natural language understanding", ICLR (2018)
  • D. Khashabi et al., "Looking beyond the surface: A challenge set for reading comprehension over multiple sentences", NAACL-HLT (2018)
  • R. Rudinger et al., "Gender bias in coreference resolution", arxiv (2018)
  • M. Mitchell et al., "Model cards for model reporting", FAccT (2018)
  • B. McCann et al., "The natural language decathlon: Multitask learning as question answering", arxiv (2018)
  • Y. Qian et al., "Reducing gender bias in word-level language models with a gender-equalizing loss function", arxiv (2019)
  • A. Radford et al., "Language models are unsupervised multitask learners", OpenAI (2019)
  • J. Devlin et al., "BERT: Pre-training of deep bidirectional transformers for language understanding", NAACL-HLT (2019)
  • T. Kwiatkowski et al., "Natural questions: a benchmark for question answering research", ACL (2019)
  • M. Shoeybi et al., "Megatron-LM: Training multi-billion parameter language models using model parallelism", arxiv (2019)
  • S. Reddy et al., "CoQA: A conversational question answering challenge", ACL (2019)
  • R. Child et al., "Generating long sequences with sparse transformers", arxiv (2019)
  • A. Wang et al., "SuperGLUE: A stickier benchmark for general-purpose language understanding systems", NeurIPS (2019)
  • R. Zellers et al., "HellaSwag: Can a Machine Really Finish Your Sentence?" ACL (2019)
  • Y. Liu et al., "RoBERTa: A robustly optimized bert pretraining approach", arxiv (2019)
  • Z. Li, "Story ending prediction by transferable BERT", arxiv (2019)
  • Z. Lan et al., "ALBERT: A lite BERT for self-supervised learning of language representations", arxiv (2019)
  • Y. Wang et al., "Multi-agent dual learning", ICLR (2019)
  • K. Song et al., "MASS: Masked sequence to sequence pre-training for language generation", ICML (2019)
  • A. Conneau et al., "Cross-lingual language model pretraining", NeurIPS (2019)
  • Y. Ju et al., "Technical report on conversational question answering", arxiv (2019)
  • D. Dua et al., "DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs", NAACL-HLT (2019)
  • M. Pilehvar et al., "WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations", NAACL-HLT (2019)
  • C. Clark et al., "BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions", NAACL-HLT (2019)
  • M-C. De Marneffe et al., "The commitmentbank: Investigating projection in naturally occurring discourse", Sinn und Bedeutung (2019)
  • R. Zellers et al., "Defending against neural fake news", NeurIPS (2019)
  • D. Ippolito et al., "Automatic detection of generated text is easiest when humans are fooled", ACL (2019)
  • S. Gehrmann et al., "GLTR: Statistical detection and visualization of generated text", ACL (2019)
  • A. Holtzman et al., "The Curious Case of Neural Text Degeneration", ICLR (2019)
  • X. Liu et al., "Improving multi-task deep neural networks via knowledge distillation for natural language understanding", arxiv (2019)
  • I. Solaiman et al., "Release strategies and the social impacts of language models", arxiv (2019)
  • P-S. Huang et al., "Reducing Sentiment Bias in Language Models via Counterfactual Evaluation", EMNLP (2020)
  • D. Hernandez et al., "Measuring the algorithmic efficiency of neural networks", arxiv (2020)
  • R. Schwartz et al., "Green AI", Communications of the ACM (2020)
  • C. Raffel et al., "Exploring the limits of transfer learning with a unified text-to-text transformer", JMLR (2020)
  • T. Brown et al., "Language models are few-shot learners", NeurIPS (2020)
  • J. Kaplan et al., "Scaling laws for neural language models", arxiv (2020)
  • Y. Nie et al., "Adversarial NLI: A new benchmark for natural language understanding", ACL (2020)
  • Y. Bisk et al., "PIQA: Reasoning about physical commonsense in natural language", AAAI (2020)
  • Y. Bisk et al., "Experience grounds language", arxiv (2020)
  • X. Liu et al., "Adversarial training for large neural language models", arxiv (2020)
  • A. Roberts et al., "How much knowledge can you pack into the parameters of a language model?", arxiv (2020)
  • P. Lewis et al., "Retrieval-augmented generation for knowledge-intensive NLP tasks", NeurIPS (2020)
  • Y. Liu et al., "Multilingual denoising pre-training for neural machine translation", ACL (2020)
  • S-C. Lin et al., "TTTTTackling WinoGrande Schemas", arxiv (2020)
  • D. Khashabi et al., "UnifiedQA: Crossing Format Boundaries with a Single QA System", EMNLP (2020)
  • J. Zheng, "Numeric Transformer - ALBERT", AI2 leaderboard (2020)
  • K. Guu et al., "REALM: Retrieval-Augmented Language Model Pretraining", arxiv (2020)
  • K. Sakaguchi et al., "WinoGrande: An adversarial Winograd schema challenge at scale", Communications of the ACM (2021)
  • A. Radford et al., "Learning transferable visual models from natural language supervision", ICML (2021)
  • S. Kreps et al., "All the news that’s fit to fabricate: AI-generated text as a tool of media misinformation", JEPS (2022)