I. GPT
- At the same model size, say around 100 million parameters, BERT outperforms GPT. That is part of why later work preferred to build on the BERT paper: gritting my teeth, I can still get BERT to run, but GPT's experiments are simply out of reach compute-wise.
1. Abstract
Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture.
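To make the two-stage recipe in the abstract concrete, below is a minimal PyTorch sketch, not the paper's actual implementation. It emulates a decoder-style (causally masked) Transformer, pre-trains it with a next-token language-modeling loss on unlabeled tokens, then fine-tunes by attaching a small linear classifier. Every name and size here (`TinyGPT`, `d_model=64`, the 2-way head, the random tensors standing in for data) is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGPT(nn.Module):
    """Toy stand-in for the paper's Transformer decoder (sizes are made up)."""
    def __init__(self, vocab_size=1000, d_model=64, n_head=4, n_layer=2, max_len=32):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_head,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layer)
        self.lm_head = nn.Linear(d_model, vocab_size)  # next-token head, stage 1

    def forward(self, idx):
        T = idx.size(1)
        h = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        # Causal mask makes the encoder stack behave autoregressively.
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(idx.device)
        return self.blocks(h, mask=mask)

model = TinyGPT()

# Stage 1: generative pre-training -- maximize log p(u_t | u_1, ..., u_{t-1})
# on unlabeled text (random ints here stand in for a tokenized corpus).
tokens = torch.randint(0, 1000, (8, 32))
hidden = model(tokens[:, :-1])
lm_loss = F.cross_entropy(model.lm_head(hidden).reshape(-1, 1000),
                          tokens[:, 1:].reshape(-1))

# Stage 2: discriminative fine-tuning -- the same pretrained body, plus a
# small linear classifier on the last position (e.g. 2-way entailment).
clf_head = nn.Linear(64, 2)
labels = torch.randint(0, 2, (8,))
clf_loss = F.cross_entropy(clf_head(model(tokens)[:, -1]), labels)
print(lm_loss.item(), clf_loss.item())
```

In the paper itself, fine-tuning additionally keeps the language-modeling loss as an auxiliary objective, and the "task-aware input transformations" serialize structured inputs (e.g. premise/hypothesis pairs) into a single token sequence with delimiter tokens, which is why the architecture needs almost no task-specific changes.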