stevezheng23 / Xlnet_extension_tf

Licence: apache-2.0
XLNet Extension in TensorFlow

Programming Languages

python

Projects that are alternatives of or similar to Xlnet extension tf

Graphbrain
Language, Knowledge, Cognition
Stars: ✭ 294 (+169.72%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Catalyst
🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the box support for training word and document embeddings, and flexible entity recognition models.
Stars: ✭ 224 (+105.5%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Reading comprehension tf
Machine Reading Comprehension in Tensorflow
Stars: ✭ 37 (-66.06%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Articutapi
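API of Articut, a Chinese word segmenter with semantic part-of-speech tagging. Word segmentation is the foundation of Chinese information processing. Articut uses no machine learning and no data model, only the grammar rules of modern vernacular Chinese, and achieves an F1-measure above 94% and recall above 96% on SIGHAN 2005.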
Stars: ✭ 252 (+131.19%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Botlibre
An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
Stars: ✭ 412 (+277.98%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Ciff
Cornell Instruction Following Framework
Stars: ✭ 23 (-78.9%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Coursera Natural Language Processing Specialization
Programming assignments from all courses in the Coursera Natural Language Processing Specialization offered by deeplearning.ai.
Stars: ✭ 39 (-64.22%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Mt Dnn
Multi-Task Deep Neural Networks for Natural Language Understanding
Stars: ✭ 72 (-33.94%)
Mutual labels:  natural-language-processing, natural-language-understanding
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-29.36%)
Mutual labels:  natural-language-processing, natural-language-understanding
Simplednn
SimpleDNN is a machine learning lightweight open-source library written in Kotlin designed to support relevant neural network architectures in natural language processing tasks
Stars: ✭ 81 (-25.69%)
Mutual labels:  artificial-intelligence, natural-language-processing
Virtual Assistant
A linux based Virtual assistant on Artificial Intelligence in C
Stars: ✭ 88 (-19.27%)
Mutual labels:  artificial-intelligence, natural-language-processing
Get started with deep learning for text with allennlp
Getting started with AllenNLP and PyTorch by training a tweet classifier
Stars: ✭ 69 (-36.7%)
Mutual labels:  artificial-intelligence, natural-language-processing
Hackerrank
This is the Repository where you can find all the solution of the Problems which you solve on competitive platforms mainly HackerRank and HackerEarth
Stars: ✭ 68 (-37.61%)
Mutual labels:  artificial-intelligence, natural-language-processing
Chatbot
Russian-language chatbot
Stars: ✭ 106 (-2.75%)
Mutual labels:  natural-language-processing, natural-language-understanding
Intent classifier
Stars: ✭ 67 (-38.53%)
Mutual labels:  natural-language-processing, natural-language-understanding
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-19.27%)
Mutual labels:  natural-language-processing, natural-language-understanding
Bidaf Keras
Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2
Stars: ✭ 60 (-44.95%)
Mutual labels:  natural-language-processing, natural-language-understanding
Ml
A high-level machine learning and deep learning library for the PHP language.
Stars: ✭ 1,270 (+1065.14%)
Mutual labels:  artificial-intelligence, natural-language-processing
Bert As Service
Mapping a variable-length sentence to a fixed-length vector using BERT model
Stars: ✭ 9,779 (+8871.56%)
Mutual labels:  natural-language-processing, natural-language-understanding
Chinese nlu by using rasa nlu
Use RASA NLU to build a Chinese Natural Language Understanding System (NLU)
Stars: ✭ 99 (-9.17%)
Mutual labels:  natural-language-processing, natural-language-understanding

XLNet Extension

XLNet is a generalized autoregressive pretraining method proposed by CMU & Google Brain that outperforms BERT on 20 NLP tasks, including question answering, natural language inference, sentiment analysis, and document ranking. XLNet draws on the strengths and weaknesses of autoregressive and autoencoding methods to overcome the limitations of both: it uses a permutation language modeling objective to learn bidirectional context and integrates ideas from Transformer-XL into the model architecture. This project aims to provide extensions built on top of the current XLNet and bring the power of XLNet to other NLP tasks such as NER and NLU.
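To make the permutation language modeling idea more concrete, here is a minimal NumPy sketch of how a sampled factorization order can be turned into an attention mask. It is only an illustration: the function name `permutation_mask` is hypothetical, and the real XLNet implementation (two-stream attention, partial prediction, memory from Transformer-XL) is considerably more involved.

```python
import numpy as np

def permutation_mask(seq_len, rng):
    """Sample a factorization order and build the matching attention mask.

    mask[i, j] is True when position i may attend to position j, i.e. when
    j is predicted earlier than i in the sampled order.
    """
    order = rng.permutation(seq_len)        # order[k] = position predicted at step k
    rank = np.empty(seq_len, dtype=int)
    rank[order] = np.arange(seq_len)        # rank[pos] = step at which pos is predicted
    mask = rank[None, :] < rank[:, None]    # compare rank[j] < rank[i] for every (i, j)
    return order, mask

rng = np.random.RandomState(0)              # RandomState keeps this usable on old NumPy
order, mask = permutation_mask(5, rng)
print("factorization order:", order)
print(mask.astype(int))
```

Because the order is resampled for each training sequence, every position eventually gets to condition on context from both its left and its right, which is how an autoregressive objective can still learn bidirectional context.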

Figure 1: Illustrations of fine-tuning XLNet on different tasks

Setting

  • Python 3.6.7
  • TensorFlow 1.13.1
  • NumPy 1.13.3
  • SentencePiece 0.1.82

DataSet

  • CoNLL2003 is a multi-task dataset containing 3 sub-tasks: POS tagging, syntactic chunking and NER. The NER sub-task covers 4 types of named entities: persons, locations, organizations, and miscellaneous entities that do not belong to the previous three groups.
  • ATIS (Airline Travel Information System) is an NLU dataset in the airline travel domain. It contains 4978 training and 893 test utterances, each classified into one of 26 intents, and each token in an utterance is labeled with one of 128 slot-filling tags in IOB format.
  • SQuAD is a reading comprehension dataset consisting of questions posed by crowd-workers on a set of Wikipedia articles. The answer to each question is a segment of text, or span, from the corresponding reading passage, or the question may be unanswerable.
  • CoQA is a large-scale dataset for building Conversational Question Answering systems. The goal of the CoQA challenge is to measure the ability of machines to understand a text passage and answer a series of interconnected questions that appear in a conversation. CoQA is pronounced "coca".
  • QuAC is a dataset for modeling, understanding, and participating in information-seeking dialog. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context.
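Both the CoNLL2003 NER labels and the ATIS slot tags are token-level tags that encode spans. The sketch below shows one way to turn a list of IOB tags into labeled spans. The helper name `iob_to_spans` and the example utterance are illustrative assumptions, not code or data taken from this project.

```python
def iob_to_spans(tags):
    """Convert IOB tags into (label, start, end) spans, with end exclusive.

    An I- tag that does not continue the current span starts a new one,
    so the function tolerates sequences without an explicit B- tag.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):   # the sentinel "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:                      # close the span that just ended
            spans.append((label, start, i))
            start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]              # open a new span
    return spans

# Illustrative ATIS-style utterance (made up for this example)
tokens = ["show", "flights", "from", "boston", "to", "new", "york"]
tags = ["O", "O", "O", "B-fromloc.city_name", "O",
        "B-toloc.city_name", "I-toloc.city_name"]
for label, start, end in iob_to_spans(tags):
    print(label, tokens[start:end])
```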

Usage

  • Preprocess data
python prepro/prepro_conll.py \
  --data_format json \
  --input_file data/ner/conll2003/raw/eng.xxx \
  --output_file data/ner/conll2003/xxx-conll2003/xxx-conll2003.json
  • Run experiment
CUDA_VISIBLE_DEVICES=0 python run_ner.py \
    --spiece_model_file=model/cased_L-24_H-1024_A-16/spiece.model \
    --model_config_path=model/cased_L-24_H-1024_A-16/xlnet_config.json \
    --init_checkpoint=model/cased_L-24_H-1024_A-16/xlnet_model.ckpt \
    --task_name=conll2003 \
    --random_seed=100 \
    --predict_tag=xxxxx \
    --data_dir=data/ner/conll2003 \
    --output_dir=output/ner/conll2003/data \
    --model_dir=output/ner/conll2003/checkpoint \
    --export_dir=output/ner/conll2003/export \
    --max_seq_length=128 \
    --train_batch_size=32 \
    --num_hosts=1 \
    --num_core_per_host=1 \
    --learning_rate=2e-5 \
    --train_steps=2500 \
    --warmup_steps=100 \
    --save_steps=500 \
    --do_train=true \
    --do_eval=true \
    --do_predict=true \
    --do_export=true
  • Visualize summary
tensorboard --logdir=output/ner/conll2003
  • Setup service
docker run -p 8500:8500 \
  -v "$(pwd)/output/ner/conll2003/export/xxxxx:/models/ner" \
  -e MODEL_NAME=ner \
  -t tensorflow/serving
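The `--learning_rate`, `--warmup_steps` and `--train_steps` flags in the run command together define a learning rate schedule. The sketch below assumes a linear warmup followed by a linear decay to zero, which is a common choice for fine-tuning; the schedule actually used by `run_ner.py` may differ, and the function name `learning_rate_at` is hypothetical.

```python
def learning_rate_at(step, base_lr=2e-5, warmup_steps=100, train_steps=2500):
    """Learning rate at a given step, assuming linear warmup then linear decay.

    Default values mirror the flags of the example command; the schedule
    shape itself is an assumption made for illustration.
    """
    if step < warmup_steps:
        # ramp up from 0 to base_lr over the warmup phase
        return base_lr * step / warmup_steps
    # decay linearly from base_lr at the end of warmup to 0 at train_steps
    remaining = max(train_steps - step, 0)
    return base_lr * remaining / (train_steps - warmup_steps)

for step in (0, 50, 100, 1300, 2500):
    print(step, learning_rate_at(step))
```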

Experiment

CoNLL2003-NER

Figure 2: Illustrations of fine-tuning XLNet on CoNLL2003-NER task

CoNLL2003 - NER    Avg. (5-run)    Best
Precision          91.36 ± 0.50    92.14
Recall             92.95 ± 0.24    93.20
F1 Score           92.15 ± 0.35    92.67

Table 1: The test set performance of XLNet-large finetuned model on CoNLL2003-NER task with setting: batch size = 16, max length = 128, learning rate = 2e-5, num steps = 4,000
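For readers unfamiliar with how NER precision, recall and F1 are usually computed, here is a minimal sketch of entity-level scoring, where a predicted entity counts as correct only if its label and boundaries match a gold entity exactly. This is an assumption about the metric, not the evaluation code used for the table; the function name `span_prf` and the example spans are made up.

```python
def span_prf(gold_spans, pred_spans):
    """Entity-level precision, recall and F1 over exact span matches.

    Each span is a hashable tuple such as (sentence_id, label, start, end).
    """
    gold, pred = set(gold_spans), set(pred_spans)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up spans for illustration only
gold = [(0, "PER", 0, 2), (0, "LOC", 5, 6), (1, "ORG", 3, 4)]
pred = [(0, "PER", 0, 2), (0, "LOC", 5, 7), (1, "ORG", 3, 4), (1, "MISC", 6, 7)]
precision, recall, f1 = span_prf(gold, pred)
print("P={:.4f} R={:.4f} F1={:.4f}".format(precision, recall, f1))
```

Note that the `LOC` prediction with a wrong boundary counts both as a false positive and as a missed gold entity, which is why exact-match scoring is stricter than token-level accuracy.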

ATIS-NLU

Figure 3: Illustrations of fine-tuning XLNet on ATIS-NLU task

ATIS - NLU           Avg. (5-run)    Best
Accuracy - Intent    97.51 ± 0.09    97.54
F1 Score - Slot      95.48 ± 0.30    95.73

Table 2: The test set performance of XLNet-large finetuned model on ATIS-NLU task with setting: batch size = 16, max length = 128, learning rate = 5e-5, num steps = 2,000
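The "Avg. (5-run)" and "Best" columns summarize several runs of the same configuration. A minimal sketch of such a summary is given below. It assumes the population standard deviation; the tables do not say whether the population or sample formula was used, and the scores in the example are invented, not results from this project.

```python
import statistics

def summarize_runs(scores):
    """Return (mean, standard deviation, best) for a list of per-run scores.

    Uses the population standard deviation (pstdev); swap in
    statistics.stdev for the sample version.
    """
    return statistics.mean(scores), statistics.pstdev(scores), max(scores)

# Invented scores for illustration only
runs = [91.0, 92.0, 93.0, 92.0, 92.0]
mean, std, best = summarize_runs(runs)
print("{:.2f} ± {:.2f} (best {:.2f})".format(mean, std, best))
```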

SQuAD v1.1

Figure 4: Illustrations of fine-tuning XLNet on SQuAD v1.1 task

SQuAD v1.1     Avg. (5-run)    Best
Exact Match    xx.xx ± x.xx    88.61
F1 Score       xx.xx ± x.xx    94.28

Table 3: The test set performance of XLNet-large finetuned model on SQuAD v1.1 task with setting: batch size = 48, max sequence length = 512, max question length = 64, learning rate = 3e-5, num steps = 8,000
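Exact Match and F1 for span-based question answering are usually computed on normalized answer strings. The sketch below follows the general approach of the SQuAD evaluation script (lowercasing, stripping punctuation and English articles, token-overlap F1), but it is a simplified re-implementation for illustration, not the script used to produce the numbers above.

```python
import collections
import re
import string

def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(gold, pred):
    """True when the two answers are identical after normalization."""
    return normalize_answer(gold) == normalize_answer(pred)

def token_f1(gold, pred):
    """Token-overlap F1 between the normalized gold and predicted answers."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    if not gold_tokens or not pred_tokens:
        # an empty answer only matches another empty answer
        return float(gold_tokens == pred_tokens)
    common = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower."))
print(token_f1("Paris", "in Paris, France"))
```

When a question has several gold answers, the usual practice is to take the maximum score over them and then average over all questions.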

SQuAD v2.0

Figure 5: Illustrations of fine-tuning XLNet on SQuAD v2.0 task

SQuAD v2.0     Avg. (5-run)    Best
Exact Match    xx.xx ± x.xx    85.72
F1 Score       xx.xx ± x.xx    88.36

Table 4: The test set performance of XLNet-large finetuned model on SQuAD v2.0 task with setting: batch size = 48, max sequence length = 512, max question length = 64, learning rate = 3e-5, num steps = 8,000

CoQA v1.0

Figure 6: Illustrations of fine-tuning XLNet on CoQA v1.0 task

CoQA v1.0      Avg. (5-run)    Best
Exact Match    xx.xx ± x.xx    81.8
F1 Score       xx.xx ± x.xx    89.4

Table 5: The test set performance of XLNet-large finetuned model on CoQA v1.0 task with setting: batch size = 48, max sequence length = 512, max question length = 128, learning rate = 3e-5, num steps = 6,000

QuAC v0.2

Figure 7: Illustrations of fine-tuning XLNet on QuAC v0.2 task

QuAC v0.2    Avg. (5-run)    Best
F1 Score     xx.xx ± x.xx    71.5
HEQQ         xx.xx ± x.xx    68.0
HEQD         xx.xx ± x.xx    11.1

Table 6: The test set performance of XLNet-large finetuned model on QuAC v0.2 task with setting: batch size = 48, max sequence length = 512, max question length = 128, learning rate = 2e-5, num steps = 8,000
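The HEQQ and HEQD rows are human-equivalence scores. The sketch below assumes the usual QuAC definitions: HEQ-Q is the percentage of questions where the model F1 matches or exceeds the human F1, and HEQ-D is the percentage of dialogs where this holds for every question. The function name `heq_scores` and the example values are made up for illustration.

```python
def heq_scores(dialogs):
    """Compute (HEQ-Q, HEQ-D) as percentages.

    `dialogs` is a non-empty list of dialogs; each dialog is a non-empty
    list of (model_f1, human_f1) pairs, one pair per question.
    """
    total_questions = sum(len(dialog) for dialog in dialogs)
    question_hits = sum(m >= h for dialog in dialogs for m, h in dialog)
    dialog_hits = sum(all(m >= h for m, h in dialog) for dialog in dialogs)
    heq_q = 100.0 * question_hits / total_questions
    heq_d = 100.0 * dialog_hits / len(dialogs)
    return heq_q, heq_d

# Invented per-question F1 pairs for illustration only
dialogs = [
    [(0.8, 0.7), (0.5, 0.6)],   # second question falls below human F1
    [(0.9, 0.9), (1.0, 0.8)],   # every question matches or beats human F1
]
print(heq_scores(dialogs))
```

Since a single weak answer disqualifies a whole dialog, HEQ-D is typically much lower than HEQ-Q, which is consistent with the gap between the two rows in the table.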
