Top 153 language-model open source projects

Zeroth
Kaldi-based Korean automatic speech recognition (ASR) open-source project
Relational Rnn Pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
Xlnet zh
Chinese pre-trained XLNet model: Pre-Trained Chinese XLNet_Large
Pytorch Nce
Noise Contrastive Estimation for softmax output, written in PyTorch
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Protein Sequence Embedding Iclr2019
Source code for "Learning protein sequence embeddings using information from structure" - ICLR 2019
Gpt Scrolls
A collaborative collection of open-source safe GPT-3 prompts that work well
Char Rnn Chinese
Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch. Based on the code of https://github.com/karpathy/char-rnn. Supports Chinese, among other additions.
Nlp learning
Learn natural language processing (NLP) with Python: language models, HMM, PCFG, Word2vec, cloze-style reading comprehension tasks, naive Bayes classifiers, TF-IDF, PCA, SVD
Bert As Language Model
BERT as a language model, forked from https://github.com/google-research/bert
Keras Bert
Implementation of BERT that can load official pre-trained models for feature extraction and prediction
Optimus
Optimus: the first large-scale pre-trained VAE language model
Macbert
Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP)
Gpt Neo
An implementation of model-parallel GPT-2 and GPT-3-like models, with the ability to scale up to full GPT-3 sizes (and possibly more!), using the mesh-tensorflow library.
Indic Bert
BERT-based Multilingual Model for Indian Languages
Xlnet Gen
XLNet for generating language.
Lazynlp
Library to scrape and clean web pages to create massive datasets.
Lotclass
[EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Keras Xlnet
Implementation of XLNet that can load pretrained checkpoints
Transformer Lm
Transformer language model (GPT-2) with sentencepiece tokenizer
Speecht
Open-source speech-to-text software written in TensorFlow
Electra pytorch
Pretrain and fine-tune ELECTRA with fastai and Hugging Face. (Results of the paper replicated!)
Awd Lstm Lm
LSTM and QRNN Language Model Toolkit for PyTorch
Awesome Speech Recognition Speech Synthesis Papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Ld Net
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Tupe
Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models like BERT.
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Electra
Pre-trained Chinese ELECTRA model: a Chinese model pretrained with adversarial learning
Chars2vec
Character-based word embeddings model based on RNN for handling real world texts
Kogpt2 Finetuning
🔥 Korean GPT-2, KoGPT2 FineTuning cased. Trained on Korean lyrics data 🔥
Dynamic Memory Networks Plus Pytorch
Implementation of Dynamic Memory Networks Plus in PyTorch
Robbert
A Dutch RoBERTa-based language model
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Lingo
Package lingo provides the data structures and algorithms required for natural language processing
Keras Gpt 2
Load GPT-2 checkpoint and generate texts
Getlang
Natural language detection package in pure Go
Easy Bert
A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)
Pytorch gbw lm
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
Pyclue
Python toolkit for the Chinese Language Understanding Evaluation (CLUE) benchmark
Tongrams
A C++ library providing fast language model queries in compressed space.
Bit Rnn
Quantize weights and activations in Recurrent Neural Networks.
Pytorch Openai Transformer Lm
🐥 A PyTorch implementation of OpenAI's fine-tuned transformer language model with a script to import the weights pre-trained by OpenAI
Bio embeddings
Get protein embeddings from protein sequences
Greek Bert
A Greek edition of the BERT pre-trained language model
Full stack transformer
PyTorch library for end-to-end training, inference and serving of transformer models
Nezha chinese pytorch
NEZHA: Neural Contextualized Representation for Chinese Language Understanding
Cross Domain ner
Cross-domain NER using cross-domain language modeling, code for ACL 2019 paper
Char rnn lm zh
Language model in Chinese, implemented based on the official PyTorch documentation
Phonlp
PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)