
lonePatient / Albert_pytorch

License: Apache-2.0
A Lite BERT for Self-Supervised Learning of Language Representations

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Albert pytorch

Bertweet
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Stars: ✭ 282 (-47.68%)
Mutual labels:  language-model
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere: Big Data Made Easy!
Stars: ✭ 372 (-30.98%)
Mutual labels:  mask
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+841.93%)
Mutual labels:  language-model
Xlnet Pytorch
An implementation of Google Brain's 2019 XLNet in PyTorch
Stars: ✭ 304 (-43.6%)
Mutual labels:  language-model
Kogpt2
Korean GPT-2 pretrained cased (KoGPT2)
Stars: ✭ 368 (-31.73%)
Mutual labels:  language-model
Ctcwordbeamsearch
Connectionist Temporal Classification (CTC) decoder with dictionary and language model for TensorFlow.
Stars: ✭ 398 (-26.16%)
Mutual labels:  language-model
Fuzzing Imagemagick
An open-source ImageMagick fuzzer.
Stars: ✭ 270 (-49.91%)
Mutual labels:  mask
Maskara
A simple way to format text fields without getting affected by input filters
Stars: ✭ 515 (-4.45%)
Mutual labels:  mask
Tf chatbot seq2seq antilm
Seq2seq chatbot with attention and an anti-language model to suppress generic responses, with an option for further improvement via deep reinforcement learning.
Stars: ✭ 369 (-31.54%)
Mutual labels:  language-model
Bert Pytorch
Google AI 2018 BERT pytorch implementation
Stars: ✭ 4,642 (+761.22%)
Mutual labels:  language-model
Gpt Neox
An implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to be able to train models in the hundreds of billions of parameters or larger.
Stars: ✭ 303 (-43.78%)
Mutual labels:  language-model
Azureml Bert
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
Stars: ✭ 342 (-36.55%)
Mutual labels:  language-model
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (-24.3%)
Mutual labels:  language-model
Transfer Nlp
NLP library designed for reproducible experimentation management
Stars: ✭ 287 (-46.75%)
Mutual labels:  language-model
Jquery Mask Plugin
A jQuery Plugin to make masks on form fields and HTML elements.
Stars: ✭ 4,534 (+741.19%)
Mutual labels:  mask
Bluebert
BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).
Stars: ✭ 273 (-49.35%)
Mutual labels:  language-model
Zamia Speech
Open tools and data for cloudless automatic speech recognition
Stars: ✭ 374 (-30.61%)
Mutual labels:  language-model
Ctcdecoder
Connectionist Temporal Classification (CTC) decoding algorithms: best path, prefix search, beam search and token passing. Implemented in Python.
Stars: ✭ 529 (-1.86%)
Mutual labels:  language-model
Nlp Paper
NLP Paper
Stars: ✭ 484 (-10.2%)
Mutual labels:  language-model
Edittext Mask
Custom input masks for EditText: a solution for entering phone numbers, SSNs, and similar formats on Android.
Stars: ✭ 413 (-23.38%)
Mutual labels:  mask

English Version | Chinese Version

albert_pytorch

This repository contains a PyTorch implementation of the ALBERT model from the paper

ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations

by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut.

Dependencies

  • pytorch=1.1.0
  • cuda=9.0
  • cudnn=7.5
  • scikit-learn
  • sentencepiece
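
A minimal environment setup, assuming the standard PyPI package names (CUDA and cuDNN are system-level installs and cannot be pulled in via pip):

pip install torch==1.1.0 scikit-learn sentencepiece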

Download Pre-trained English Models

Official download links: google albert

PyTorch weights adapted to this version can be downloaded from Google Drive:

v1

v2

Fine-tuning

1. Place config.json and 30k-clean.model into the prev_trained_model/albert_base_v2 directory. For example:

├── prev_trained_model
│   └── albert_base_v2
│       ├── pytorch_model.bin
│       ├── config.json
│       └── 30k-clean.model
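
As a quick sanity check that the vocabulary file is in place, it can be loaded with sentencepiece directly (a minimal sketch; the path follows the layout above, and the sample sentence is illustrative):

import sentencepiece as spm

# Load the 30k-piece SentencePiece vocabulary that ships with ALBERT.
sp = spm.SentencePieceProcessor()
sp.Load("prev_trained_model/albert_base_v2/30k-clean.model")
print(sp.GetPieceSize())  # roughly 30,000 pieces
print(sp.EncodeAsPieces("ALBERT uses a SentencePiece vocabulary."))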

2. Convert the ALBERT TensorFlow checkpoint to PyTorch:

python convert_albert_tf_checkpoint_to_pytorch.py \
    --tf_checkpoint_path=./prev_trained_model/albert_base_tf_v2 \
    --bert_config_file=./prev_trained_model/albert_base_v2/config.json \
    --pytorch_dump_path=./prev_trained_model/albert_base_v2/pytorch_model.bin
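
To confirm the conversion produced a usable checkpoint, the dumped state dict can be inspected with plain PyTorch (a minimal sketch, assuming the --pytorch_dump_path used above):

import torch

# Load the converted weights on CPU and list a few parameter tensors.
state_dict = torch.load(
    "prev_trained_model/albert_base_v2/pytorch_model.bin", map_location="cpu"
)
print(len(state_dict), "tensors in checkpoint")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))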

The General Language Understanding Evaluation (GLUE) benchmark is a collection of nine sentence- or sentence-pair language understanding tasks for evaluating and analyzing natural language understanding systems.

Before running any of these GLUE tasks, you should download the GLUE data by running this script and unpack it to some directory $DATA_DIR.
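
If you use the widely shared download_glue_data.py helper, its documented invocation is as follows (the script name and flags come from that gist, not from this repository; adjust if your copy differs):

python download_glue_data.py --data_dir $DATA_DIR --tasks all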

3. Run sh scripts/run_classifier_sst2.sh to fine-tune the ALBERT model.

Results

Performance of ALBERT on the GLUE benchmark, using a single-model setup on the dev sets:

model              Cola (matthews_corrcoef)   Sst-2 (accuracy)   Mnli (accuracy)   Sts-b (pearson)
albert_base_v2     0.5756                     0.926              0.8418            0.9091
albert_large_v2    0.5851                     0.9507             -                 0.9151
albert_xlarge_v2   0.6023                     -                  -                 0.9221
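
The metrics named in the table map directly onto scikit-learn (with SciPy for Pearson correlation); a minimal sketch of how each score is computed, using illustrative toy labels:

from sklearn.metrics import accuracy_score, matthews_corrcoef
from scipy.stats import pearsonr

# Classification: CoLA is scored with Matthews correlation; SST-2 and MNLI with accuracy.
y_true, y_pred = [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]
print("matthews_corrcoef:", matthews_corrcoef(y_true, y_pred))
print("accuracy:", accuracy_score(y_true, y_pred))

# Regression: STS-B is scored with the Pearson correlation coefficient.
gold, pred = [0.0, 2.5, 5.0], [0.4, 2.1, 4.8]
print("pearson:", pearsonr(gold, pred)[0])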