All Projects → graykode → Gpt 2 Pytorch

graykode / Gpt 2 Pytorch

Licence: mit
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Gpt 2 Pytorch

Awesome Semi Supervised Learning
📜 An up-to-date & curated list of awesome semi-supervised learning papers, methods & resources.
Stars: ✭ 538 (-12.94%)
Mutual labels:  natural-language-processing
Bert score
BERT score for text generation
Stars: ✭ 568 (-8.09%)
Mutual labels:  natural-language-processing
Talisman
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Stars: ✭ 584 (-5.5%)
Mutual labels:  natural-language-processing
Hate Speech And Offensive Language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Stars: ✭ 543 (-12.14%)
Mutual labels:  natural-language-processing
Pythoncode Tutorials
The Python Code Tutorials
Stars: ✭ 544 (-11.97%)
Mutual labels:  natural-language-processing
Self Attentive Parser
High-accuracy NLP parser with models for 11 languages.
Stars: ✭ 569 (-7.93%)
Mutual labels:  natural-language-processing
Ner Lstm
Named Entity Recognition using multilayered bidirectional LSTM
Stars: ✭ 532 (-13.92%)
Mutual labels:  natural-language-processing
Stanza
Official Stanford NLP Python Library for Many Human Languages
Stars: ✭ 5,887 (+852.59%)
Mutual labels:  natural-language-processing
Mycroft Core
Mycroft Core, the Mycroft Artificial Intelligence platform.
Stars: ✭ 5,489 (+788.19%)
Mutual labels:  natural-language-processing
Pythainlp
Thai Natural Language Processing in Python.
Stars: ✭ 582 (-5.83%)
Mutual labels:  natural-language-processing
Sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
Stars: ✭ 5,540 (+796.44%)
Mutual labels:  natural-language-processing
Hanlp
中文分词 词性标注 命名实体识别 依存句法分析 成分句法分析 语义依存分析 语义角色标注 指代消解 风格转换 语义相似度 新词发现 关键词短语提取 自动摘要 文本分类聚类 拼音简繁转换 自然语言处理
Stars: ✭ 24,626 (+3884.79%)
Mutual labels:  natural-language-processing
Fast abs rl
Code for ACL 2018 paper: "Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting. Chen and Bansal"
Stars: ✭ 569 (-7.93%)
Mutual labels:  natural-language-processing
Awesome Korean Nlp
A curated list of resources for NLP (Natural Language Processing) for Korean
Stars: ✭ 541 (-12.46%)
Mutual labels:  natural-language-processing
Dnc Tensorflow
A TensorFlow implementation of DeepMind's Differential Neural Computers (DNC)
Stars: ✭ 587 (-5.02%)
Mutual labels:  implementation
Leakgan
The codes of paper "Long Text Generation via Adversarial Training with Leaked Information" on AAAI 2018. Text generation using GAN and Hierarchical Reinforcement Learning.
Stars: ✭ 533 (-13.75%)
Mutual labels:  natural-language-processing
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (-8.25%)
Mutual labels:  natural-language-processing
Speedtorch
Library for faster pinned CPU <-> GPU transfer in Pytorch
Stars: ✭ 615 (-0.49%)
Mutual labels:  natural-language-processing
Hazm
Python library for digesting Persian text.
Stars: ✭ 595 (-3.72%)
Mutual labels:  natural-language-processing
Attention Is All You Need Pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Stars: ✭ 6,070 (+882.2%)
Mutual labels:  natural-language-processing

GPT2-Pytorch with Text-Generator

Better Language Models and Their Implications

Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper. from openAI Blog

This repository is simple implementation GPT-2 about text-generator in Pytorch with compress code

Quick Start

  1. download GPT2 pre-trained model in Pytorch which huggingface/pytorch-pretrained-BERT already made! (Thanks for sharing! it's help my problem transferring tensorflow(ckpt) file to Pytorch Model!)
$ git clone https://github.com/graykode/gpt-2-Pytorch && cd gpt-2-Pytorch
# download huggingface's pytorch model 
$ curl --output gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin
# setup requirements, if using mac os, then run additional setup as descibed below
$ pip install -r requirements.txt
  1. Now, You can run like this.
  • Text from Book 1984, George Orwell
$ python main.py --text "It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions, though not quickly enough to prevent a swirl of gritty dust from entering along with him."
  1. Also You can Quick Starting in Google Colab

Option

  • --text : sentence to begin with.
  • --quiet : not print all of the extraneous stuff like the "================"
  • --nsamples : number of sample sampled in batch when multinomial function use
  • --unconditional : If true, unconditional generation.
  • --batch_size : number of batch size
  • --length : sentence length (< number of context)
  • --temperature: the thermodynamic temperature in distribution (default 0.7)
  • --top_k : Returns the top k largest elements of the given input tensor along a given dimension. (default 40)

See more detail option about temperature and top_k in here

Dependencies

  • Pytorch 0.41+
  • regex 2017.4.5

Mac OS Setup

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install torch tqdm
$ brew install libomp
$ export LC_ALL=en_US.UTF-8
$ export LANG=en_US.UTF-8
$ pip install -r requirements.txt

Author

License

  • OpenAi/GPT2 follow MIT license, huggingface/pytorch-pretrained-BERT is Apache license.
  • I follow MIT license with original GPT2 repository

Acknowledgement

Jeff Wu(@WuTheFWasThat), Thomas Wolf(@thomwolf) for allowing referring code.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].