akanyaani / Gpt 2 Tensorflow2.0

License: MIT
OpenAI GPT-2 pre-training and sequence generation implementation in TensorFlow 2.0

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Gpt 2 Tensorflow2.0

Onnxt5
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Stars: ✭ 143 (-16.86%)
Mutual labels:  text-generation, transformer
amrlib
A python library that makes AMR parsing, generation and visualization simple.
Stars: ✭ 107 (-37.79%)
Mutual labels:  text-generation, transformer
Gpt2 Newstitle
Chinese news-title generation project using GPT2, with extremely detailed comments.
Stars: ✭ 235 (+36.63%)
Mutual labels:  text-generation, transformer
text-generation-transformer
text generation based on transformer
Stars: ✭ 36 (-79.07%)
Mutual labels:  text-generation, transformer
Gpt2 Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
Stars: ✭ 4,592 (+2569.77%)
Mutual labels:  text-generation, transformer
Dialogpt
Large-scale pretraining for dialogue
Stars: ✭ 1,177 (+584.3%)
Mutual labels:  text-generation, transformer
pytorch-transformer-chatbot
A simple chitchat chatbot built with the Transformer API introduced in PyTorch v1.2
Stars: ✭ 44 (-74.42%)
Mutual labels:  text-generation, transformer
Gpt2client
✍🏻 gpt2-client: Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, and 1.5B Transformer Models 🤖 📝
Stars: ✭ 322 (+87.21%)
Mutual labels:  text-generation, transformer
Transformer
A TensorFlow Implementation of the Transformer: Attention Is All You Need
Stars: ✭ 3,646 (+2019.77%)
Mutual labels:  implementation, transformer
Gpt2 French
GPT-2 French demo | Démo française de GPT-2
Stars: ✭ 47 (-72.67%)
Mutual labels:  text-generation, transformer
Gpt2 Chitchat
GPT2 for Chinese chitchat (implements the MMI idea from DialoGPT)
Stars: ✭ 1,230 (+615.12%)
Mutual labels:  text-generation, transformer
Abcl
Armed Bear Common Lisp <git+https://github.com/armedbear/abcl/> <--> <svn+https://abcl.org/svn> Bridge
Stars: ✭ 151 (-12.21%)
Mutual labels:  implementation
Guyu
pre-training and fine-tuning framework for text generation
Stars: ✭ 144 (-16.28%)
Mutual labels:  text-generation
Tupe
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve existing models like BERT.
Stars: ✭ 143 (-16.86%)
Mutual labels:  transformer
Hrnet Semantic Segmentation
The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919
Stars: ✭ 2,369 (+1277.33%)
Mutual labels:  transformer
Embedding As Service
One-Stop Solution to encode sentence to fixed length vectors from various embedding techniques
Stars: ✭ 151 (-12.21%)
Mutual labels:  transformer
Nlp research
NLP research: TensorFlow-based NLP deep learning project supporting four tasks: text classification, sentence matching, sequence labeling, and text generation
Stars: ✭ 141 (-18.02%)
Mutual labels:  transformer
Kogpt2 Finetuning
🔥 Korean GPT-2 (KoGPT2) fine-tuning, trained on Korean song-lyric data 🔥
Stars: ✭ 124 (-27.91%)
Mutual labels:  text-generation
Transformer In Generating Dialogue
An Implementation of 'Attention is all you need' with Chinese Corpus
Stars: ✭ 121 (-29.65%)
Mutual labels:  transformer
Effective transformer
Running BERT without Padding
Stars: ✭ 169 (-1.74%)
Mutual labels:  transformer

GPT-2 Pre-training and text generation, implemented in TensorFlow 2.0

Originally implemented in TensorFlow 1.14 by OpenAI: "openai/gpt-2". OpenAI GPT-2 paper: "Language Models are Unsupervised Multitask Learners".

**This repository contains an OpenAI GPT-2 pre-training and sequence generation implementation in TensorFlow 2.0.**

Requirements

  • python >= 3.6
  • setuptools==41.0.1
  • ftfy==5.6
  • tqdm==4.32.1
  • Click==7.0
  • sentencepiece==0.1.83
  • tensorflow-gpu==2.3.0
  • numpy==1.16.4

Setup

$ git clone https://github.com/akanyaani/gpt-2-tensorflow2.0
$ cd gpt-2-tensorflow2.0
$ pip install -r requirements.txt

You can pre-train the model using the sample data available in this repository, or download the OpenWebText data using https://github.com/eukaryote31/openwebtext.

Pre-training the model on the sample data available in the repository

$ python pre_process.py --help

Options:
  --data-dir TEXT        training data path  [default: /data/scraped]
  --vocab-size INTEGER   byte pair vocab size  [default: 24512]
  --min-seq-len INTEGER  minimum sequence length  [default: 15]
  --max-seq-len INTEGER  maximum sequence length  [default: 512]
  --help                 Show this message and exit.
  
  
>> python pre_process.py
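
Conceptually, pre_process.py builds a byte-pair (SentencePiece) vocabulary from the raw text and encodes every document into token-id sequences within the --min-seq-len / --max-seq-len limits. A minimal sketch of that idea, assuming plain *.txt files under data/scraped (not the repository's actual code; the file layout and SentencePiece flags are assumptions):

import glob
import sentencepiece as spm

# Train a BPE vocabulary on all text files under the data directory
# (vocab size matches the --vocab-size default above).
files = glob.glob("data/scraped/*.txt")
spm.SentencePieceTrainer.Train(
    "--input=" + ",".join(files) +
    " --model_prefix=bpe_model --vocab_size=24512 --model_type=bpe"
)

# Encode each document into token ids, applying the sequence-length limits.
sp = spm.SentencePieceProcessor()
sp.Load("bpe_model.model")

encoded_docs = []
for path in files:
    with open(path, encoding="utf-8") as f:
        ids = sp.EncodeAsIds(f.read())
    if len(ids) >= 15:                  # --min-seq-len
        encoded_docs.append(ids[:512])  # --max-seq-len

The real script also serializes the encoded sequences to disk for the training step; see pre_process.py for the exact format.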

Pre-training the model on OpenWebText or any other data

>> python pre_process.py --data-dir=data_directory --vocab-size=32000
$ python train_gpt2.py --help

Options:
  --num-layers INTEGER      No. of decoder layers  [default: 8]
  --embedding-size INTEGER  Embedding size  [default: 768]
  --num-heads INTEGER       Number of heads  [default: 8]
  --dff INTEGER             Filter Size  [default: 3072]
  --max-seq-len INTEGER     Seq length  [default: 515]
  --vocab-size INTEGER      Vocab size  [default: 24512]
  --optimizer TEXT          optimizer type  [default: adam]
  --batch-size INTEGER      batch size  [default: 8]
  --learning-rate FLOAT     learning rate  [default: 0.001]
  --graph-mode BOOLEAN      TF run mode  [default: False]
  --distributed BOOLEAN     distributed training  [default: False]
  --help                    Show this message and exit.
  
  
>> python train_gpt2.py \
  --num-layers=8 \
  --num-heads=8 \
  --dff=3072 \
  --embedding-size=768 \
  --batch-size=32 \
  --learning-rate=5e-5 \
  --graph-mode=True
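
The flags above map onto a standard Transformer decoder stack: --num-layers blocks, each with --num-heads attention heads, an --embedding-size-dimensional hidden state, and a --dff-wide feed-forward layer. A minimal Keras sketch of one such block (an illustration of the GPT-2 architecture, not the repository's actual classes; assumes TensorFlow 2.x):

import tensorflow as tf

def gelu(x):
    # tanh approximation of the GELU activation used by GPT-2.
    return 0.5 * x * (1.0 + tf.tanh(0.7978845608 * (x + 0.044715 * tf.pow(x, 3))))

def causal_mask(seq_len):
    # Lower-triangular mask so position i only attends to positions <= i.
    return tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)

class DecoderBlock(tf.keras.layers.Layer):
    """One GPT-2 style decoder block: masked self-attention + feed-forward."""

    def __init__(self, embedding_size=768, num_heads=8, dff=3072):
        super().__init__()
        self.num_heads = num_heads
        self.depth = embedding_size // num_heads
        self.wq = tf.keras.layers.Dense(embedding_size)
        self.wk = tf.keras.layers.Dense(embedding_size)
        self.wv = tf.keras.layers.Dense(embedding_size)
        self.wo = tf.keras.layers.Dense(embedding_size)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(dff, activation=gelu),
            tf.keras.layers.Dense(embedding_size),
        ])
        self.ln1 = tf.keras.layers.LayerNormalization(epsilon=1e-5)
        self.ln2 = tf.keras.layers.LayerNormalization(epsilon=1e-5)

    def split_heads(self, x, batch):
        # (batch, seq, emb) -> (batch, heads, seq, depth)
        x = tf.reshape(x, (batch, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, x):
        batch, seq_len = tf.shape(x)[0], tf.shape(x)[1]
        h = self.ln1(x)  # GPT-2 normalizes before each sub-layer (pre-norm)
        q = self.split_heads(self.wq(h), batch)
        k = self.split_heads(self.wk(h), batch)
        v = self.split_heads(self.wv(h), batch)
        # Scaled dot-product attention with the causal mask applied to the scores.
        scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(tf.cast(self.depth, tf.float32))
        scores += (1.0 - causal_mask(seq_len)) * -1e9
        attn = tf.matmul(tf.nn.softmax(scores, axis=-1), v)
        attn = tf.reshape(tf.transpose(attn, perm=[0, 2, 1, 3]), (batch, seq_len, -1))
        x = x + self.wo(attn)
        return x + self.ffn(self.ln2(x))

The full model stacks --num-layers of these blocks on top of token and position embeddings and projects the final hidden states back onto the vocabulary to obtain next-token logits.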

Distributed training on multiple GPUs.

>> python train_gpt2.py \
  --num-layers=8 \
  --num-heads=8 \
  --dff=3072 \
  --embedding-size=768 \
  --batch-size=32 \
  --learning-rate=5e-5 \
  --distributed=True \
  --graph-mode=True
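
With --distributed=True, multi-GPU training in TensorFlow 2.x is typically built on tf.distribute.MirroredStrategy: variables are created inside the strategy scope and each step is run on every replica. A hedged sketch of that pattern (build_gpt2_model and the toy dataset are placeholders, not the repository's actual symbols):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Model and optimizer must be created inside the scope so their
    # variables are mirrored onto every GPU.
    model = build_gpt2_model()  # placeholder for the real model builder
    optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE)

# Placeholder dataset of (input_ids, target_ids) pairs; replace with the real pipeline.
toy = tf.random.uniform((64, 128), maxval=24512, dtype=tf.int32)
train_dataset = tf.data.Dataset.from_tensor_slices((toy[:, :-1], toy[:, 1:])).batch(8)
dist_dataset = strategy.experimental_distribute_dataset(train_dataset)

@tf.function
def train_step(dist_batch):
    def step_fn(inputs, targets):
        with tf.GradientTape() as tape:
            logits = model(inputs, training=True)
            # Per-replica mean; exact global loss scaling is omitted for brevity.
            loss = tf.reduce_mean(loss_fn(targets, logits))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss
    per_replica_loss = strategy.run(step_fn, args=dist_batch)
    return strategy.reduce(tf.distribute.ReduceOp.MEAN, per_replica_loss, axis=None)

for batch in dist_dataset:
    print("loss:", float(train_step(batch)))

Details such as checkpointing and learning-rate scheduling are left out of this sketch.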

Start TensorBoard through the command line.

$ tensorboard --logdir /log
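
TensorBoard only shows what the training loop writes into that log directory. If you want to record additional scalars yourself, the standard tf.summary pattern looks like this (the /log path matches the command above; the values here are stand-ins for real training metrics):

import tensorflow as tf

# Create a summary writer pointed at the directory TensorBoard reads from.
writer = tf.summary.create_file_writer("/log")

with writer.as_default():
    for step in range(100):
        fake_loss = 1.0 / (step + 1)  # stand-in for the real training loss
        tf.summary.scalar("train/loss", fake_loss, step=step)
writer.flush()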

After pre-training your model, you can generate sequences by giving the model some context. Open the notebook below, load the pre-trained model, and pass a context string; the model will return the generated sequence.

$ sequence_generator.ipynb
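
Under the hood, generation is an autoregressive loop: encode the context, run the model on the tokens produced so far, sample the next token from the last position's distribution, and append it. A hedged sketch with top-k sampling (model and sp are placeholders for the notebook's loaded model and SentencePiece tokenizer):

import tensorflow as tf

def generate(model, sp, context, max_new_tokens=100, top_k=40):
    """Autoregressive generation with top-k sampling (illustrative only)."""
    ids = sp.EncodeAsIds(context)                    # context string -> token ids
    for _ in range(max_new_tokens):
        inputs = tf.constant([ids], dtype=tf.int32)  # shape (1, seq_len)
        logits = model(inputs, training=False)       # (1, seq_len, vocab_size)
        next_logits = logits[0, -1, :]               # distribution over the next token
        # Restrict to the k most likely tokens, then sample one of them.
        values, indices = tf.math.top_k(next_logits, k=top_k)
        choice = tf.random.categorical(values[None, :], num_samples=1)[0, 0]
        ids.append(int(indices[int(choice)]))
    return sp.DecodeIds(ids)                         # token ids -> text

sequence_generator.ipynb does the equivalent with the repository's own model and tokenizer objects.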

TO DO

1. Parallel preprocessing.
2. Shared weights across layers.
3. Factorized embedding (items 2 and 3 are sketched after this list).
4. Fine-tuning wrapper.
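
For reference, items 2 and 3 roughly correspond to the ALBERT-style tricks of reusing a single decoder block for every layer and factoring the token embedding into a small embedding matrix plus a projection. A hedged sketch of both ideas, reusing the DecoderBlock sketch above (illustrative only, not planned code from this repository):

import tensorflow as tf

vocab_size, embed_dim, hidden_dim, num_layers = 24512, 128, 768, 8

# Factorized embedding: vocab -> small embedding, then project up to the hidden size.
token_embedding = tf.keras.layers.Embedding(vocab_size, embed_dim)
embed_projection = tf.keras.layers.Dense(hidden_dim)

# Shared weights across layers: build one block and apply it num_layers times.
shared_block = DecoderBlock(embedding_size=hidden_dim)

def forward(token_ids):
    x = embed_projection(token_embedding(token_ids))
    for _ in range(num_layers):
        x = shared_block(x)  # the same variables are reused at every layer
    return x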

References:

Contribution

  • Your issues and PRs are always welcome.

Author

License

Computation Graph of the GPT-2 Model

[Image: Decoder graph]
[Image: GPT-2 graph]