
graykode / Xlnet Pytorch

License: Apache-2.0
Simple XLNet implementation with Pytorch Wrapper

Projects that are alternatives of or similar to Xlnet Pytorch

Biosentvec
BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences
Stars: ✭ 308 (-38.52%)
Mutual labels:  jupyter-notebook, natural-language-processing
Nlp Python Deep Learning
NLP in Python with Deep Learning
Stars: ✭ 374 (-25.35%)
Mutual labels:  jupyter-notebook, natural-language-processing
Nlp Papers With Arxiv
Statistics and accepted paper list of NLP conferences with arXiv link
Stars: ✭ 345 (-31.14%)
Mutual labels:  jupyter-notebook, natural-language-processing
Nlp Tutorial
Tutorial: Natural Language Processing in Python
Stars: ✭ 274 (-45.31%)
Mutual labels:  jupyter-notebook, natural-language-processing
Practical Pytorch
Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
Stars: ✭ 4,329 (+764.07%)
Mutual labels:  jupyter-notebook, natural-language-processing
Adaptnlp
An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.
Stars: ✭ 278 (-44.51%)
Mutual labels:  jupyter-notebook, natural-language-processing
Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (-37.13%)
Mutual labels:  jupyter-notebook, natural-language-processing
Malaya
Natural Language Toolkit for bahasa Malaysia, https://malaya.readthedocs.io/
Stars: ✭ 239 (-52.3%)
Mutual labels:  jupyter-notebook, natural-language-processing
Code search
Code For Medium Article: "How To Create Natural Language Semantic Search for Arbitrary Objects With Deep Learning"
Stars: ✭ 436 (-12.97%)
Mutual labels:  jupyter-notebook, natural-language-processing
Anlp19
Course repo for Applied Natural Language Processing (Spring 2019)
Stars: ✭ 402 (-19.76%)
Mutual labels:  jupyter-notebook, natural-language-processing
Nlpython
This repository contains the code related to Natural Language Processing using python scripting language. All the codes are related to my book entitled "Python Natural Language Processing"
Stars: ✭ 265 (-47.11%)
Mutual labels:  jupyter-notebook, natural-language-processing
Courses
Quiz & Assignment of Coursera
Stars: ✭ 454 (-9.38%)
Mutual labels:  jupyter-notebook, natural-language-processing
Bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Stars: ✭ 3,443 (+587.23%)
Mutual labels:  jupyter-notebook, natural-language-processing
Zhihu
This repo contains the source code in my personal column (https://zhuanlan.zhihu.com/zhaoyeyu), implemented using Python 3.6. Including Natural Language Processing and Computer Vision projects, such as text generation, machine translation, deep convolution GAN and other actual combat code.
Stars: ✭ 3,307 (+560.08%)
Mutual labels:  jupyter-notebook, natural-language-processing
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+540.52%)
Mutual labels:  jupyter-notebook, natural-language-processing
Question generation
Neural question generation using transformers
Stars: ✭ 356 (-28.94%)
Mutual labels:  jupyter-notebook, natural-language-processing
Deepnlp Models Pytorch
Pytorch implementations of various Deep NLP models in cs-224n(Stanford Univ)
Stars: ✭ 2,760 (+450.9%)
Mutual labels:  jupyter-notebook, natural-language-processing
Pytorch Bert Crf Ner
A BERT+CRF based named entity recognition model for Korean, built with KoBERT and CRF
Stars: ✭ 236 (-52.89%)
Mutual labels:  jupyter-notebook, natural-language-processing
Transformers Tutorials
GitHub repo with tutorials to fine-tune transformers for different NLP tasks
Stars: ✭ 384 (-23.35%)
Mutual labels:  jupyter-notebook, natural-language-processing
Practical Nlp
Official Repository for 'Practical Natural Language Processing' by O'Reilly Media
Stars: ✭ 452 (-9.78%)
Mutual labels:  jupyter-notebook, natural-language-processing

XLNet-Pytorch (arXiv:1906.08237)

Simple XLNet implementation with Pytorch Wrapper!

This repo shows how the XLNet architecture works during pre-training, using a small batch size (=1) example.

Usage

$ git clone https://github.com/graykode/xlnet-Pytorch && cd xlnet-Pytorch

# Install the pretrained BERT tokenizer used as the subword tokenizer
$ pip install pytorch_pretrained_bert

$ python main.py --data ./data.txt --tokenizer bert-base-uncased \
   --seq_len 512 --reuse_len 256 --perm_size 256 \
   --bi_data True --mask_alpha 6 --mask_beta 1 \
   --num_predict 85 --mem_len 384 --num_epoch 100
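
For reference, the tokenizer installed above can also be loaded directly in Python. A quick sketch using pytorch_pretrained_bert's public API (the example sentence is just an illustration):

```python
# Load the pretrained BERT tokenizer used here as the subword tokenizer.
from pytorch_pretrained_bert import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
tokens = tokenizer.tokenize("Hello, XLNet!")       # WordPiece subwords
ids = tokenizer.convert_tokens_to_ids(tokens)      # vocabulary ids
print(tokens, ids)
```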

You can also run the code easily in Google Colab.

  • Hyperparameters used for pretraining in the paper.

Options
  • --data (String): Path of the .txt file to train on. Multiline text is fine, and one file becomes one batch tensor. Default: data.txt

  • --tokenizer (String): Uses huggingface/pytorch-pretrained-BERT's tokenizer as the subword tokenizer (SentencePiece support is planned). Choose from bert-base-uncased, bert-large-uncased, bert-base-cased, or bert-large-cased. Default: bert-base-uncased

  • --seq_len (Integer): Sequence length. Default: 512

  • --reuse_len (Integer): Number of tokens that can be reused as memory. Could be half of seq_len. Default: 256

  • --perm_size (Integer): Length of the longest permutation. Could be set equal to reuse_len. Default: 256

  • --bi_data (Boolean): Whether to create bidirectional data. If bi_data is True, bsz (batch size) must be an even number. Default: False

  • --mask_alpha (Integer): How many tokens form a group. Default: 6

  • --mask_beta (Integer): How many tokens to mask within each group. Default: 1

  • --num_predict (Integer): Number of tokens to predict. In the paper, this is the partial prediction setting. Default: 85

  • --mem_len (Integer): Number of steps to cache in the Transformer-XL architecture. Default: 384

  • --num_epoch (Integer): Number of epochs. Default: 100
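
For orientation, here is a sketch of how these options could be declared with argparse. It simply mirrors the list above and is only an illustration, not necessarily the exact code in main.py:

```python
import argparse

def str2bool(s):
    # Lets "--bi_data True" / "--bi_data False" work from the shell;
    # argparse's bare bool() would treat any non-empty string as True.
    return s.lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser(description="XLNet-Pytorch pretraining")
parser.add_argument("--data", type=str, default="data.txt")
parser.add_argument("--tokenizer", type=str, default="bert-base-uncased")
parser.add_argument("--seq_len", type=int, default=512)
parser.add_argument("--reuse_len", type=int, default=256)
parser.add_argument("--perm_size", type=int, default=256)
parser.add_argument("--bi_data", type=str2bool, default=False)
parser.add_argument("--mask_alpha", type=int, default=6)
parser.add_argument("--mask_beta", type=int, default=1)
parser.add_argument("--num_predict", type=int, default=85)
parser.add_argument("--mem_len", type=int, default=384)
parser.add_argument("--num_epoch", type=int, default=100)
args = parser.parse_args()
```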

What is XLNet?

XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective. Additionally, XLNet employs Transformer-XL as the backbone model, exhibiting excellent performance for language tasks involving long context.

| Model | MNLI | QNLI | QQP  | RTE  | SST-2 | MRPC | CoLA | STS-B |
|-------|------|------|------|------|-------|------|------|-------|
| BERT  | 86.6 | 92.3 | 91.3 | 70.4 | 93.2  | 88.0 | 60.6 | 90.0  |
| XLNet | 89.8 | 93.9 | 91.8 | 83.8 | 95.6  | 89.2 | 63.6 | 91.8  |
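
To make the permutation objective concrete, here is a small sketch (my illustration, not the repo's code) of sampling a factorization order and choosing a partial-prediction target set:

```python
import torch

seq_len, num_predict = 8, 3       # toy sizes; the command above uses 512 and 85

perm = torch.randperm(seq_len)    # a sampled factorization order z
targets = perm[-num_predict:]     # partial prediction: only the last
                                  # num_predict tokens in z are predicted

# The permutation LM objective factorizes the likelihood along z:
#   log p(x) = sum_t log p(x_{z_t} | x_{z_<t})
# and the training loss is computed only for the target subset above.
print("order z:", perm.tolist())
print("positions to predict:", sorted(targets.tolist()))
```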

Keywords in XLNet

  1. How does XLNet benefit from both auto-regressive and auto-encoding models?

    • Auto-Regressive Model
    • Auto-Encoding Model
  2. Permutation Language Modeling with Partial Prediction

    • Permutation Language Modeling

    • Partial Prediction

  3. Two-Stream Self-Attention with Target-Aware Representation (see the sketch after this list)

    • Two-Stream Self-Attention

    • Target-Aware Representation
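
As a rough sketch of idea 3 (my illustration, under simplified assumptions: a single head, no learned projections, and one memory slot standing in for the Transformer-XL cache), the two streams differ only in their attention masks. The content stream may attend to the current token itself; the query stream may not:

```python
import torch
import torch.nn.functional as F

def attn(q, kv, mask):
    # q: (T, d), kv: (S, d), mask: (T, S) where 1 = may attend
    scores = (q @ kv.t()) / kv.size(-1) ** 0.5
    scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ kv

T, d = 6, 16
h = torch.randn(T, d)             # content stream (token + position info)
g = torch.randn(T, d)             # query stream (position info only)

perm = torch.randperm(T)          # sampled factorization order z
rank = torch.empty(T, dtype=torch.long)
rank[perm] = torch.arange(T)      # rank[i] = position of token i within z

mem = torch.zeros(1, d)           # a single memory slot all positions may
kv = torch.cat([mem, h], dim=0)   # attend to, so no query row is fully masked

ones = torch.ones(T, 1, dtype=torch.long)
content_mask = torch.cat([ones, (rank.unsqueeze(1) >= rank.unsqueeze(0)).long()], dim=1)
query_mask   = torch.cat([ones, (rank.unsqueeze(1) >  rank.unsqueeze(0)).long()], dim=1)

h_next = attn(h, kv, content_mask)  # content stream: itself and earlier tokens in z
g_next = attn(g, kv, query_mask)    # query stream: only strictly earlier tokens in z
```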

Author

  • Because the original repository is released under the Apache 2.0 license, this project carries the same license.
  • Tae Hwan Jung(Jeff Jung) @graykode, Kyung Hee Univ CE(Undergraduate).
  • Author Email : [email protected]