Robbert: A Dutch RoBERTa-based language model
Stars: ✭ 120 (+275%)
Bert Sklearn: a sklearn wrapper for Google's BERT model
Stars: ✭ 182 (+468.75%)
Speecht: An open-source speech-to-text software written in TensorFlow
Stars: ✭ 152 (+375%)
Pytorch gbw lm: PyTorch language model for the 1-Billion Word (LM1B / GBW) dataset
Stars: ✭ 101 (+215.63%)
Char Rnn Chinese: Multi-layer recurrent neural networks (LSTM, GRU, RNN) for character-level language models in Torch. Based on the code of https://github.com/karpathy/char-rnn; supports Chinese and more.
Stars: ✭ 192 (+500%)
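Several entries in this list, like char-rnn above, train character-level language models. As a minimal, framework-free sketch of the underlying idea (an illustration only, not the repo's Torch code), a bigram character model reduces to counting character pairs:

```python
from collections import Counter, defaultdict

def train_char_bigram(text):
    """Count character bigrams: the simplest character-level language model."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def next_char_probs(counts, prev):
    """Maximum-likelihood conditional distribution P(next char | previous char)."""
    total = sum(counts[prev].values())
    return {ch: n / total for ch, n in counts[prev].items()}

model = train_char_bigram("abracadabra")
print(next_char_probs(model, "a"))  # {'b': 0.5, 'c': 0.25, 'd': 0.25}
```

The repos above replace these count tables with recurrent networks so that probabilities can depend on long contexts rather than a single preceding character.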
Electra: Pre-trained Chinese ELECTRA model, pre-trained with adversarial learning
Stars: ✭ 132 (+312.5%)
Mead Baseline: Deep-learning model exploration and development for NLP
Stars: ✭ 238 (+643.75%)
Getlang: Natural language detection package in pure Go
Stars: ✭ 110 (+243.75%)
Gpt Neo: An implementation of model-parallel GPT-2 and GPT-3-style models, with the ability to scale up to full GPT-3 sizes (and possibly beyond), using the mesh-tensorflow library
Stars: ✭ 1,252 (+3812.5%)
F Lm: Language modeling
Stars: ✭ 156 (+387.5%)
Bit Rnn: Quantize weights and activations in recurrent neural networks
Stars: ✭ 86 (+168.75%)
PLBART: Official code of the paper Unified Pre-training for Program Understanding and Generation [NAACL 2021]
Stars: ✭ 151 (+371.88%)
Tupe: Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training"; improves existing models such as BERT.
Stars: ✭ 143 (+346.88%)
Nlp learning: Learning natural language processing (NLP) with Python: language models, HMM, PCFG, Word2vec, cloze-style reading comprehension, naive Bayes classifier, TF-IDF, PCA, SVD
Stars: ✭ 188 (+487.5%)
Kogpt2 Finetuning: 🔥 Korean GPT-2, KoGPT2 fine-tuning (cased); trained on Korean song-lyrics data 🔥
Stars: ✭ 124 (+287.5%)
rnn-theano: RNN (LSTM, GRU) in Theano with mini-batch training; character-level language models in Theano
Stars: ✭ 68 (+112.5%)
Lingo: Package lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (+253.13%)
Optimus: The first large-scale pre-trained VAE language model
Stars: ✭ 180 (+462.5%)
Easy Bert: A dead-simple BERT API for Python and Java (https://github.com/google-research/bert)
Stars: ✭ 106 (+231.25%)
Xlnet zh: Pre-trained Chinese XLNet model (XLNet_Large)
Stars: ✭ 207 (+546.88%)
Tongrams: A C++ library providing fast language model queries in compressed space
Stars: ✭ 88 (+175%)
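The queries a library like Tongrams serves are essentially n-gram count and probability lookups. As a toy, uncompressed Python equivalent (bearing no resemblance to Tongrams' succinct C++ data structures), such a lookup can be sketched as:

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

tokens = "the cat sat on the mat".split()
bigrams = ngram_counts(tokens, 2)
unigrams = ngram_counts(tokens, 1)
print(bigrams[("the", "cat")])  # 1

# Maximum-likelihood estimate P(mat | the) = count(the mat) / count(the)
p = bigrams[("the", "mat")] / unigrams[("the",)]
print(p)  # 0.5
```

The engineering challenge such libraries address is answering these lookups over billions of n-grams in compressed memory, which a plain hash table like the one above cannot do.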
Xlnet Gen: XLNet for language generation
Stars: ✭ 164 (+412.5%)
Keras Xlnet: Implementation of XLNet that can load pretrained checkpoints
Stars: ✭ 159 (+396.88%)
Pytorch Openai Transformer Lm: 🐥 A PyTorch implementation of OpenAI's fine-tuned transformer language model, with a script to import the weights pre-trained by OpenAI
Stars: ✭ 1,268 (+3862.5%)
Lingvo
Stars: ✭ 2,361 (+7278.13%)
Transformer Lm: Transformer language model (GPT-2) with SentencePiece tokenizer
Stars: ✭ 154 (+381.25%)
TF-NNLM-TK: A toolkit for neural language modeling using TensorFlow, including basic models such as RNNs and LSTMs as well as more advanced models
Stars: ✭ 20 (-37.5%)
Electra pytorch: Pretrain and fine-tune ELECTRA with fastai and huggingface (results of the paper replicated!)
Stars: ✭ 149 (+365.63%)
Gpt Scrolls: A collaborative collection of open-source, safe GPT-3 prompts that work well
Stars: ✭ 195 (+509.38%)
Awd Lstm Lm: LSTM and QRNN language model toolkit for PyTorch
Stars: ✭ 1,834 (+5631.25%)
Vaaku2Vec: Language modeling and text classification in Malayalam using ULMFiT
Stars: ✭ 68 (+112.5%)
Ld Net: Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Stars: ✭ 148 (+362.5%)
Clue: Chinese Language Understanding Evaluation benchmark: datasets, baselines, pre-trained models, corpus, and leaderboard
Stars: ✭ 2,425 (+7478.13%)
Zeroth: Kaldi-based Korean ASR (speech recognition) open-source project
Stars: ✭ 248 (+675%)
Chars2vec: Character-based word-embedding model built on RNNs for handling real-world texts
Stars: ✭ 130 (+306.25%)
Bert As Language Model: BERT as a language model, forked from https://github.com/google-research/bert
Stars: ✭ 185 (+478.13%)
asr24: 24-hour automatic speech recognition
Stars: ✭ 27 (-15.62%)
Haystack: 🔍 An open-source NLP framework that leverages Transformer models, enabling developers to implement production-ready neural search, question answering, semantic document search, and summarization for a wide range of applications
Stars: ✭ 3,409 (+10553.13%)
Keras Bert: Implementation of BERT that can load official pre-trained models for feature extraction and prediction
Stars: ✭ 2,264 (+6975%)
Keras Gpt 2: Load GPT-2 checkpoints and generate text
Stars: ✭ 113 (+253.13%)
Relational Rnn Pytorch: An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch
Stars: ✭ 236 (+637.5%)
Transformers: 🤗 State-of-the-art machine learning for PyTorch, TensorFlow, and JAX
Stars: ✭ 55,742 (+174093.75%)
Macbert: Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP)
Stars: ✭ 167 (+421.88%)
Openseq2seq: Toolkit for efficient experimentation with speech recognition, text-to-speech, and NLP
Stars: ✭ 1,378 (+4206.25%)
pd3f: 🏭 PDF text-extraction pipeline: self-hosted, local-first, Docker-based
Stars: ✭ 132 (+312.5%)
Pyclue: Python toolkit for the Chinese Language Understanding Evaluation (CLUE) benchmark
Stars: ✭ 91 (+184.38%)
Indic Bert: BERT-based multilingual model for Indian languages
Stars: ✭ 160 (+400%)
Pytorch Nce: Noise-contrastive estimation for softmax outputs, written in PyTorch
Stars: ✭ 204 (+537.5%)
Lazynlp: Library to scrape and clean web pages to create massive datasets
Stars: ✭ 1,985 (+6103.13%)
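As a rough, stdlib-only illustration of the "clean web pages" step that lazynlp performs (an assumption-laden sketch, not lazynlp's actual cleaning code), markup can be stripped down to text with Python's built-in html.parser:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes while skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def clean(html):
    """Crudely reduce an HTML page to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

print(clean("<p>Hello <b>world</b></p><script>x=1</script>"))  # Hello world
```

Dataset-scale pipelines add much more on top of this: deduplication, boilerplate removal, language filtering, and quality heuristics.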
calm: Context-aware language models
Stars: ✭ 29 (-9.37%)
KB-ALBERT: Korean ALBERT model specialized for the economics/finance domain, provided by KB Kookmin Bank
Stars: ✭ 215 (+571.88%)
COCO-LM: [NeurIPS 2021] Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (+240.63%)
Attention Mechanisms: Implementations of a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras
Stars: ✭ 203 (+534.38%)
Lotclass: [EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Stars: ✭ 160 (+400%)