BERT Pre-training

Stevezhangz

Published 2022-05-30 09:35:12

License: CC 4.0 BY-SA

Tags: bert, pytorch, natural language processing

Article link: https://blog.csdn.net/captainAAAjohn/article/details/125040780

This post covers a reproduction of BERT, a natural language processing model, implemented in PyTorch. The project is available at https://github.com/stevezhangz/BERT_PRETRAIN_PYTORCH.

A fresh reproduction of BERT.

Project repository: https://github.com/stevezhangz/BERT_PRETRAIN_PYTORCH

Usage

git clone https://github.com/stevezhangz/BERT_PRETRAIN_PYTORCH.git
cd BERT_PRETRAIN_PYTORCH
mkdir dataset
# move your .txt corpus files into 'dataset', e.g. dataset/oscar.eo.txt
pip install -r requirements.txt
python create_sentence_pairs.py
python train_tokenizer.py
python main.py
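
The post does not show what create_sentence_pairs.py does internally. As a rough illustration of what building sentence pairs for BERT's next-sentence-prediction objective usually involves, here is a minimal sketch. The function name make_sentence_pairs, the 50/50 positive/negative split, and the in-memory list of sentences are all assumptions that follow the original BERT recipe; they are not taken from the repository.

```python
import random


def make_sentence_pairs(sentences, rng=None):
    """Build (sentence_a, sentence_b, is_next) triples from ordered sentences.

    Hypothetical helper, not the repository's code. For each sentence, with
    probability 0.5 the true next sentence is used (label 1); otherwise a
    different sentence from the corpus is sampled (label 0).
    """
    rng = rng or random.Random(0)
    if len(sentences) < 3:
        raise ValueError("need at least 3 sentences to sample negatives")

    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            # Positive example: the sentence that actually follows.
            pairs.append((sentences[i], sentences[i + 1], 1))
        else:
            # Negative example: any position except i and its true successor.
            candidates = [j for j in range(len(sentences)) if j not in (i, i + 1)]
            j = rng.choice(candidates)
            pairs.append((sentences[i], sentences[j], 0))
    return pairs


if __name__ == "__main__":
    corpus = ["first sentence", "second sentence", "third sentence", "fourth sentence"]
    for a, b, label in make_sentence_pairs(corpus):
        print(label, "|", a, "|", b)
```

In a real pipeline the sentences would be read from the text files under dataset/ and the resulting pairs written to disk for the training script; how the repository handles that step may differ.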