
xu-song / Bert As Language Model

License: apache-2.0
BERT as language model, forked from https://github.com/google-research/bert

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Bert As Language Model

Tupe
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve existing models like BERT.
Stars: ✭ 143 (-22.7%)
Mutual labels:  language-model
F Lm
Language Modeling
Stars: ✭ 156 (-15.68%)
Mutual labels:  language-model
Gpt Neo
An implementation of model-parallel GPT-2 and GPT-3-like models, with the ability to scale up to full GPT-3 sizes (and possibly more!), using the mesh-tensorflow library.
Stars: ✭ 1,252 (+576.76%)
Mutual labels:  language-model
Awesome Speech Recognition Speech Synthesis Papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Stars: ✭ 2,085 (+1027.03%)
Mutual labels:  language-model
Speecht
Open-source speech-to-text software written in TensorFlow
Stars: ✭ 152 (-17.84%)
Mutual labels:  language-model
Lotclass
[EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Stars: ✭ 160 (-13.51%)
Mutual labels:  language-model
Electra
Pre-trained Chinese ELECTRA model: pretraining a Chinese model based on adversarial learning
Stars: ✭ 132 (-28.65%)
Mutual labels:  language-model
Keras Bert
Implementation of BERT that could load official pre-trained models for feature extraction and prediction
Stars: ✭ 2,264 (+1123.78%)
Mutual labels:  language-model
Transformer Lm
Transformer language model (GPT-2) with sentencepiece tokenizer
Stars: ✭ 154 (-16.76%)
Mutual labels:  language-model
Indic Bert
BERT-based Multilingual Model for Indian Languages
Stars: ✭ 160 (-13.51%)
Mutual labels:  language-model
Awd Lstm Lm
LSTM and QRNN Language Model Toolkit for PyTorch
Stars: ✭ 1,834 (+891.35%)
Mutual labels:  language-model
Electra pytorch
Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated !)
Stars: ✭ 149 (-19.46%)
Mutual labels:  language-model
Lazynlp
Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (+972.97%)
Mutual labels:  language-model
Ld Net
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Stars: ✭ 148 (-20%)
Mutual labels:  language-model
Macbert
Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP)
Stars: ✭ 167 (-9.73%)
Mutual labels:  language-model
Clue
Chinese Language Understanding Evaluation Benchmark (CLUE): datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+1210.81%)
Mutual labels:  language-model
Keras Xlnet
Implementation of XLNet that can load pretrained checkpoints
Stars: ✭ 159 (-14.05%)
Mutual labels:  language-model
Bert Sklearn
a sklearn wrapper for Google's BERT model
Stars: ✭ 182 (-1.62%)
Mutual labels:  language-model
Optimus
Optimus: the first large-scale pre-trained VAE language model
Stars: ✭ 180 (-2.7%)
Mutual labels:  language-model
Xlnet Gen
XLNet for generating language.
Stars: ✭ 164 (-11.35%)
Mutual labels:  language-model

BERT as Language Model

For a sentence S = w_1, w_2,..., w_k , we have

p(S) = \prod_{i=1}^{k} p(w_i | context)

In a traditional (left-to-right) language model, such as an RNN, the context is context = w_1, ..., w_{i-1}, so

p(S) = \prod_{i=1}^{k} p(w_i | w_1, ..., w_{i-1})

A bidirectional language model has a larger context: context = w_1, ..., w_{i-1}, w_{i+1}, ..., w_k.

In this implementation, we simply adopt the following approximation,

p(S) \approx \prod_{i=1}^{k} p(w_i | w_1, ..., w_{i-1},w_{i+1}, ...,w_k).
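Concretely, sentence scoring masks each position i in turn, runs the masked-LM head, and reads off the probability of the original token w_i at that position; the per-token probabilities are then combined into a pseudo-perplexity. The repository implements this with the original TensorFlow BERT code (run_lm_predict.py, used below); the following snippet is only a minimal sketch of the same idea using the Hugging Face transformers library, which this project does not itself use.

import math
import torch
from transformers import BertForMaskedLM, BertTokenizer

# Minimal sketch of mask-one-token-at-a-time scoring (illustration only;
# the repository itself uses the original TensorFlow BERT code).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_ppl(sentence):
    tokens = tokenizer.tokenize(sentence)
    log_probs = []
    for i in range(len(tokens)):
        masked = tokens.copy()
        masked[i] = tokenizer.mask_token  # hide w_i, keep both left and right context
        ids = tokenizer.convert_tokens_to_ids(
            [tokenizer.cls_token] + masked + [tokenizer.sep_token])
        with torch.no_grad():
            logits = model(torch.tensor([ids])).logits
        probs = torch.softmax(logits[0, i + 1], dim=-1)  # i + 1 skips [CLS]
        log_probs.append(math.log(probs[tokenizer.convert_tokens_to_ids(tokens[i])].item()))
    return math.exp(-sum(log_probs) / len(log_probs))  # exp of negative mean log prob

print(pseudo_ppl("there is a book on the desk"))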

test-case

more cases: Chinese

export BERT_BASE_DIR=model/uncased_L-12_H-768_A-12
export INPUT_FILE=data/lm/test.en.tsv
python run_lm_predict.py \
  --input_file=$INPUT_FILE \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=128 \
  --output_dir=/tmp/lm_output/

For the following test cases

$ cat data/lm/test.en.tsv 
there is a book on the desk
there is a plane on the desk
there is a book in the desk

$ cat /tmp/lm_output/test_result.json

output:

# prob: probability
# ppl:  perplexity
[
  {
    "tokens": [
      {
        "token": "there",
        "prob": 0.9988962411880493
      },
      {
        "token": "is",
        "prob": 0.013578361831605434
      },
      {
        "token": "a",
        "prob": 0.9420605897903442
      },
      {
        "token": "book",
        "prob": 0.07452250272035599
      },
      {
        "token": "on",
        "prob": 0.9607976675033569
      },
      {
        "token": "the",
        "prob": 0.4983428418636322
      },
      {
        "token": "desk",
        "prob": 4.040586190967588e-06
      }
    ],
    "ppl": 17.69329728285426
  },
  {
    "tokens": [
      {
        "token": "there",
        "prob": 0.996775209903717
      },
      {
        "token": "is",
        "prob": 0.03194097802042961
      },
      {
        "token": "a",
        "prob": 0.8877727389335632
      },
      {
        "token": "plane",
        "prob": 3.4907534427475184e-05   # low probability
      },
      {
        "token": "on",
        "prob": 0.1902322769165039
      },
      {
        "token": "the",
        "prob": 0.5981084704399109
      },
      {
        "token": "desk",
        "prob": 3.3164762953674654e-06
      }
    ],
    "ppl": 59.646456254851806
  },
  {
    "tokens": [
      {
        "token": "there",
        "prob": 0.9969795942306519
      },
      {
        "token": "is",
        "prob": 0.03379646688699722
      },
      {
        "token": "a",
        "prob": 0.9095568060874939
      },
      {
        "token": "book",
        "prob": 0.013939591124653816
      },
      {
        "token": "in",
        "prob": 0.000823647016659379  # low probability
      },
      {
        "token": "the",
        "prob": 0.5844194293022156
      },
      {
        "token": "desk",
        "prob": 3.3361218356731115e-06
      }
    ],
    "ppl": 54.65941516205144
  }
]
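The ppl field agrees with interpreting perplexity as the exponential of the negative mean token log probability; recomputing it from the per-token probabilities of the first sentence reproduces the reported value:

import math

# per-token probabilities reported above for "there is a book on the desk"
probs = [0.9988962411880493, 0.013578361831605434, 0.9420605897903442,
         0.07452250272035599, 0.9607976675033569, 0.4983428418636322,
         4.040586190967588e-06]
print(math.exp(-sum(math.log(p) for p in probs) / len(probs)))  # ≈ 17.693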