Tner: Language model fine-tuning for NER with an easy interface and cross-domain evaluation. We released NER models fine-tuned on various domains via the Hugging Face model hub.
Suggest: Top-k Approximate String Matching.
Lmchallenge: A library & tools to evaluate predictive language models.
Nlp Library: Curated collection of papers for the NLP practitioner 📖👩🔬
Pytorch Cpp: C++ Implementation of PyTorch Tutorials for Everyone
Spago: Self-contained Machine Learning and Natural Language Processing library in Go
Lm Lstm Crf: Empower Sequence Labeling with Task-Aware Language Model
Lightnlp: A deep learning framework for natural language processing based on PyTorch and torchtext.
Dl Nlp Readings: My Reading Lists of Deep Learning and Natural Language Processing
Kobert: Korean BERT pre-trained cased (KoBERT)
Awesome Bert Nlp: A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Deberta: The implementation of DeBERTa
Albert pytorch: A Lite Bert For Self-Supervised Learning Language Representations
Ctcdecoder: Connectionist Temporal Classification (CTC) decoding algorithms: best path, prefix search, beam search and token passing. Implemented in Python.
Tokenizers: 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Neural sp: End-to-end ASR/LM implementation with PyTorch
Ctcwordbeamsearch: Connectionist Temporal Classification (CTC) decoder with dictionary and language model for TensorFlow.
Zamia Speech: Open tools and data for cloudless automatic speech recognition
Tf chatbot seq2seq antilm: Seq2seq chatbot with attention and an anti-language model to suppress generic responses, with an option for further improvement through deep reinforcement learning.
Kogpt2: Korean GPT-2 pretrained cased (KoGPT2)
Azureml Bert: End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
Trankit: Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Gpt Neox: An implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to be able to train models in the hundreds of billions of parameters or larger.
Xlnet Pytorch: An implementation of Google Brain's 2019 XLNet in PyTorch
Transfer Nlp: NLP library designed for reproducible experimentation management
Bertweet: BERTweet, a pre-trained language model for English Tweets (EMNLP-2020)
Bluebert: BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).
few-shot-lm: The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)
python-arpa: 🐍 Python library for n-gram models in ARPA format
SDLM-pytorch: Code accompanying EMNLP 2018 paper Language Modeling with Sparse Product of Sememe Experts
minicons: Utility for analyzing Transformer based representations of language.
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
tying-wv-and-wc: Implementation for "Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling"
CodeT5: Code for CodeT5, a new code-aware pre-trained encoder-decoder model.
gpt-j: A GPT-J API to use with python3 to generate text, blogs, code, and more
Word-Prediction-Ngram: Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques
FNet-pytorch: Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
language-planner: Official Code for "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents"