Nlp chinese corpus: Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+193.99%)
Clue: Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+7.11%)
backprop: Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (-89.89%)
Bert Pytorch: Google AI 2018 BERT PyTorch implementation
Stars: ✭ 4,642 (+105.04%)
wechsel: Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-98.28%)
Transformers: 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+2362.1%)
Tokenizers: 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+124.25%)
Haystack: 🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+50.57%)
Awd Lstm Lm: LSTM and QRNN Language Model Toolkit for PyTorch
Stars: ✭ 1,834 (-18.99%)
Keras Gpt 2: Load GPT-2 checkpoint and generate texts
Stars: ✭ 113 (-95.01%)
Getlang: Natural language detection package in pure Go
Stars: ✭ 110 (-95.14%)
Easy Bert: A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)
Stars: ✭ 106 (-95.32%)
Lazynlp: Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (-12.32%)
Ld Net: Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Stars: ✭ 148 (-93.46%)
Pytorch gbw lm: PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
Stars: ✭ 101 (-95.54%)
Pyclue: Python toolkit for the Chinese Language Understanding Evaluation (CLUE) benchmark
Stars: ✭ 91 (-95.98%)
Tongrams: A C++ library providing fast language model queries in compressed space.
Stars: ✭ 88 (-96.11%)
Pytorch Openai Transformer Lm: 🐥 A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
Stars: ✭ 1,268 (-43.99%)
Macbert: Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP)
Stars: ✭ 167 (-92.62%)
Keras Xlnet: Implementation of XLNet that can load pretrained checkpoints
Stars: ✭ 159 (-92.98%)
Electra: Chinese pre-trained ELECTRA model: pretraining a Chinese model based on adversarial learning
Stars: ✭ 132 (-94.17%)
Greek Bert: A Greek edition of the BERT pre-trained language model
Stars: ✭ 84 (-96.29%)
Lingo: package lingo provides the data structures and algorithms required for natural language processing
Stars: ✭ 113 (-95.01%)
Full stack transformer: PyTorch library for end-to-end transformer model training, inference and serving
Stars: ✭ 71 (-96.86%)
Xlnet Gen: XLNet for generating language.
Stars: ✭ 164 (-92.76%)
Roberta zh: RoBERTa Chinese pre-trained models: RoBERTa for Chinese
Stars: ✭ 1,953 (-13.74%)
Cross Domain ner: Cross-domain NER using cross-domain language modeling, code for ACL 2019 paper
Stars: ✭ 67 (-97.04%)
Chineseglue: Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 1,548 (-31.63%)
Openseq2seq: Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Stars: ✭ 1,378 (-39.13%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+11.22%)
Bert As Service: Mapping a variable-length sentence to a fixed-length vector using a BERT model
Stars: ✭ 9,779 (+331.93%)
Tupe: Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models like BERT.
Stars: ✭ 143 (-93.68%)
Lotclass: [EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Stars: ✭ 160 (-92.93%)
Bit Rnn: Quantize weights and activations in Recurrent Neural Networks.
Stars: ✭ 86 (-96.2%)
Mt Dnn: Multi-Task Deep Neural Networks for Natural Language Understanding
Stars: ✭ 1,871 (-17.36%)
Bio embeddings: Get protein embeddings from protein sequences
Stars: ✭ 86 (-96.2%)
Pycorrector: pycorrector is a toolkit for text error correction. It implements models including Kenlm, Seq2Seq_Attention, BERT, MacBERT, ELECTRA, ERNIE, and Transformer, and works out of the box.
Stars: ✭ 2,857 (+26.19%)
Nlp Tutorial: Natural Language Processing Tutorial for Deep Learning Researchers
Stars: ✭ 9,895 (+337.06%)
Awesome Bert: BERT NLP papers, applications and GitHub resources, including the newest XLNet; papers and GitHub projects related to BERT and XLNet
Stars: ✭ 1,732 (-23.5%)
Nezha chinese pytorch: NEZHA: Neural Contextualized Representation for Chinese Language Understanding
Stars: ✭ 65 (-97.13%)
F Lm: Language Modeling
Stars: ✭ 156 (-93.11%)
Chars2vec: Character-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-94.26%)
Gpt2: PyTorch Implementation of OpenAI GPT-2
Stars: ✭ 64 (-97.17%)
Char rnn lm zh: language model in Chinese, implemented following the official PyTorch documentation
Stars: ✭ 57 (-97.48%)
Gpt Neo: An implementation of model parallel GPT2 & GPT3-like models, with the ability to scale up to full GPT3 sizes (and possibly more!), using the mesh-tensorflow library.
Stars: ✭ 1,252 (-44.7%)
Transformer Lm: Transformer language model (GPT-2) with sentencepiece tokenizer
Stars: ✭ 154 (-93.2%)
Kogpt2 Finetuning: 🔥 Korean GPT-2, KoGPT2 FineTuning cased. Training on Korean lyrics data 🔥
Stars: ✭ 124 (-94.52%)
Phonlp: PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)
Stars: ✭ 56 (-97.53%)
Tner: Language model finetuning on NER with an easy interface, and cross-domain evaluation. We released NER models finetuned on various domains via the Hugging Face model hub.
Stars: ✭ 54 (-97.61%)
Suggest: Top-k Approximate String Matching.
Stars: ✭ 50 (-97.79%)
Speecht: Open-source speech-to-text software written in TensorFlow
Stars: ✭ 152 (-93.29%)
Fast Bert: Super easy library for BERT based NLP models
Stars: ✭ 1,678 (-25.88%)