All Projects → sarnikowski → bert_in_a_flask

sarnikowski / bert_in_a_flask

Licence: MIT License
A dockerized flask API, serving ALBERT and BERT predictions using TensorFlow 2.0.

Programming Languages

python
139335 projects - #7 most used programming language
Makefile
30231 projects
Dockerfile
14818 projects

Projects that are alternatives of or similar to bert in a flask

NLP-paper
🎨 🎨NLP 自然语言处理教程 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-28.12%)
Mutual labels:  transformer, albert, bert
Xpersona
XPersona: Evaluating Multilingual Personalized Chatbot
Stars: ✭ 54 (+68.75%)
Mutual labels:  transformer, bert
keras-bert-ner
Keras solution of Chinese NER task using BiLSTM-CRF/BiGRU-CRF/IDCNN-CRF model with Pretrained Language Model: supporting BERT/RoBERTa/ALBERT
Stars: ✭ 7 (-78.12%)
Mutual labels:  albert, bert
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (-31.25%)
Mutual labels:  albert, bert
PDN
The official PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf '21)
Stars: ✭ 44 (+37.5%)
Mutual labels:  transformer, bert
SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-28.12%)
Mutual labels:  transformer, bert
golgotha
Contextualised Embeddings and Language Modelling using BERT and Friends using R
Stars: ✭ 39 (+21.88%)
Mutual labels:  transformer, bert
classifier multi label seq2seq attention
multi-label,classifier,text classification,多标签文本分类,文本分类,BERT,ALBERT,multi-label-classification,seq2seq,attention,beam search
Stars: ✭ 26 (-18.75%)
Mutual labels:  albert, bert
are-16-heads-really-better-than-1
Code for the paper "Are Sixteen Heads Really Better than One?"
Stars: ✭ 128 (+300%)
Mutual labels:  transformer, bert
CLUE pytorch
CLUE baseline pytorch CLUE的pytorch版本基线
Stars: ✭ 72 (+125%)
Mutual labels:  albert, bert
semantic-document-relations
Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles"
Stars: ✭ 21 (-34.37%)
Mutual labels:  transformer, bert
tfbert
基于tensorflow1.x的预训练模型调用,支持单机多卡、梯度累积,XLA加速,混合精度。可灵活训练、验证、预测。
Stars: ✭ 54 (+68.75%)
Mutual labels:  albert, bert
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (-25%)
Mutual labels:  transformer, bert
text-generation-transformer
text generation based on transformer
Stars: ✭ 36 (+12.5%)
Mutual labels:  transformer, bert
classifier multi label
multi-label,classifier,text classification,多标签文本分类,文本分类,BERT,ALBERT,multi-label-classification
Stars: ✭ 127 (+296.88%)
Mutual labels:  albert, bert
transformer-models
Deep Learning Transformer models in MATLAB
Stars: ✭ 90 (+181.25%)
Mutual labels:  transformer, bert
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-31.25%)
Mutual labels:  transformer, bert
Kevinpro-NLP-demo
All NLP you Need Here. 个人实现了一些好玩的NLP demo,目前包含13个NLP应用的pytorch实现
Stars: ✭ 117 (+265.63%)
Mutual labels:  transformer, bert
bert nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Stars: ✭ 97 (+203.13%)
Mutual labels:  albert, bert
ALBERT-Pytorch
Pytorch Implementation of ALBERT(A Lite BERT for Self-supervised Learning of Language Representations)
Stars: ✭ 214 (+568.75%)
Mutual labels:  albert, bert

BERT & ALBERT in a flask

This repo demonstrates how to serve both vanilla BERT and ALBERT predictions through a simple dockerized flask API. The repo contains implementation of both google-research/bert and google-research/ALBERT for multiclass classification purposes. Model implementation is based on kpe/bert-for-tf2.

The code is written to be compatible with Tensorflow 2.0, and Docker>=19. If you want to train with a GPU, you will also need to have nvidia-container-toolkit installed. Instructions on how to install this can be found here. The project uses the Stackoverflow dataset for demonstration purposes.

NEWS

  • 04-02-2020 - Now supports google-research/ALBERT. Data and model loading has been simplified and improved.
  • 06-01-2020 - Codebase has been overhauled and ported to Tensorflow 2.0. The implementation is now based on kpe/bert-for-tf2.
  • 25-10-2019 - Bugfixes. Code is now runnable, with a data example. Codebase still under construction, considering upgrading the project to TF 2.0

USAGE

The most common docker commands used in this project, have been wrapped in Makefile targets for convenience. Training and API serving is executed in docker containers. The corresponding Dockerfiles are dev.Dockerfile for training/development and api.Dockerfile for the API. Building these can be achieved using the following targets.

make build_dev
make build_api

The development docker image uses docker volumes and boxboat/fixuid, to mount the local folder with proper permissions for easier development and debugging. The API docker image on the other hand copies the code in to the image, on build. This means you should rerun make build_api whenever you make changes to the serving code, or train a new model you want to serve.

TRAINING

Pretrained models

A number of pretrained models from both google-research/bert and google-research/ALBERT, are supported. Below are two lists of supported BERT and ALBERT models:

BERT

ALBERT

The training script automatically adapts to either type of model. Simply specify which pretrained model you want to use by changing the variable model_name in the file src/train.py. Accepted model values are the ones listed above, however they can also be found in /src/utils/loader.py.

Parameters

Below is a list of high level parameters, that can be adjusted in src/train.py. Notice the default values in src/train.py are given as an example, and should not be seen as optimal.

Name Description
batch_size Batch size used during training.
max_seq_len Upper bound on the length of input sequences.
n_epochs Total number of epochs to run.
max_learn_rate The upper bound on the learning rate.
min_learn_rate The lower bound on the learning rate.
warmup_proportion Fraction of total epochs to do linear warmup.

By default the learning rate used during training, is modelled to be linearly growing towards an upper bound until a given fraction of total epochs is reached, then it decays exponentially to a lower bound. If you do not wish to use this behaviour, remove the callback learning_rate_callback from the list of keras callbacks.

Running

To train the model run the following command:

make train TENSORBOARD_PORT=<PORT> GPU=<GPU>

Tensorboard is intialized during training in models/trained, and the port defaults to 6006. The GPU parameter which expects an integer (defaults to 1) specifies the GPU you want to use for training (only relevant if you have more than one GPU). This code is written for single GPU setups.

Model files and tensorboard logs are exported to a datetime stamped folder under models/trained during training. Furthermore a confusion matrix plot is generated and saved to this directory upon finishing training, which is calculated from the model performance on the test set.

Out of memory exceptions

Depending on your GPU hardware you might experience oom exceptions. To avoid this either reduce the batch_size or max_seq_len and/or choose a smaller pretrained base model.

INFERENCE

To serve predictions, a minimal working example of a dockerized flask API is provided. The API loads the latest trained model, by looking in the file models/latest_model_config.json, which is overwritten everytime a new model is trained. Modify this file manually if you wish to use a specific model.

The API can be booted up using the command:

make start_api api_port=<PORT>

Once the API container is running, requests can be made in the following form using the assigned API port:

curl -H "Content-Type: application/json" --request POST --data '<JSON_OBJECT>' http://localhost:<PORT>/predict

The JSON object should have the following format. Notice the input is a list, making it possible to parse a list of input bodies:

{
    "x": ["Should i use public or private when declaring a variable in a class?", "I get an ImportError everytime i try to import a module in my main.py script"]
}

The API does a minimum logging of each call made. The format of the logs is defined in src/app.py. Below is an example of a request log:

{
    "endpoint": "/predict", 
    "response": {   
        "model": "BERT 2020-01-06 22:06:23 4b4e44b2", 
        "predictions": ["java", "python"], 
        "probabilities": [0.9852411150932312, 0.9999822378158569]
    }, 
    "status_code": 200, 
    "response_time": 0.48097777366638184, 
    "user_ip": null, 
    "user_agent": "curl/7.58.0"
}

LICENSE

MIT (License file)

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].