Reproducing BERT
Project repository: https://github.com/stevezhangz/BERT_PRETRAIN_PYTORCH
Usage
git clone https://github.com/stevezhangz/BERT_PRETRAIN_PYTORCH.git
cd BERT_PRETRAIN_PYTORCH
mkdir dataset
# Move your .txt corpus files into 'dataset', e.g. dataset/oscar.eo.txt
pip install -r requirements.txt
python create_sentence_pairs.py
python train_tokenizer.py
python main.py
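
For readers wondering what the sentence-pair step is for: BERT's next-sentence-prediction objective trains on pairs of sentences labeled as consecutive or not. The internals of create_sentence_pairs.py are not shown here, so the following is only a minimal sketch of the general idea, not the repository's implementation. The function name make_sentence_pairs, the 50/50 split, and the seed parameter are assumptions made for illustration.

```python
import random


def make_sentence_pairs(sentences, seed=0):
    """Build (sentence_a, sentence_b, is_next) triples from an ordered list.

    Hypothetical helper: roughly half the pairs use the true next sentence,
    the rest use a randomly chosen other sentence from the same list.
    """
    if len(sentences) < 3:
        raise ValueError("need at least 3 sentences to sample negative pairs")

    rng = random.Random(seed)  # local RNG so results are reproducible
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            # Positive example: sentence_b really follows sentence_a.
            pairs.append((sentences[i], sentences[i + 1], True))
        else:
            # Negative example: any sentence except sentence_a itself
            # and its true successor.
            candidates = [j for j in range(len(sentences)) if j not in (i, i + 1)]
            j = rng.choice(candidates)
            pairs.append((sentences[i], sentences[j], False))
    return pairs


if __name__ == "__main__":
    demo = ["First sentence.", "Second sentence.", "Third sentence.", "Fourth sentence."]
    for a, b, is_next in make_sentence_pairs(demo):
        print(is_next, "|", a, "->", b)
```

In a real pipeline the sentences would be read from the text files under dataset/, and the resulting pairs would be tokenized with the tokenizer trained in the following step.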