Vaaku2Vec: Language Modeling and Text Classification in Malayalam Language using ULMFiT
Stars: ✭ 68 (-63.83%)
Nlp chinese corpus: Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+3440.43%)
Lightnlp: A deep learning framework for natural language processing based on PyTorch and torchtext.
Stars: ✭ 739 (+293.09%)
Lotclass: [EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Stars: ✭ 160 (-14.89%)
Skip Thoughts.torch: Porting of Skip-Thoughts pretrained models from Theano to PyTorch & Torch7
Stars: ✭ 146 (-22.34%)
Clue: Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+1189.89%)
Wordvectors: Pre-trained word vectors of 30+ languages
Stars: ✭ 2,043 (+986.7%)
F Lm: Language Modeling
Stars: ✭ 156 (-17.02%)
Lazynlp: Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (+955.85%)
Ld Net: Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Stars: ✭ 148 (-21.28%)
Debiaswe: Remove problematic gender bias from word embeddings.
Stars: ✭ 175 (-6.91%)
Word2vec: Go library for performing computations in word2vec binary models
Stars: ✭ 143 (-23.94%)
Keras Xlnet: Implementation of XLNet that can load pretrained checkpoints
Stars: ✭ 159 (-15.43%)
Word2vec: A further wrapper around Word2VEC_java written by ansj, which also implements common word similarity and sentence similarity computations.
Stars: ✭ 136 (-27.66%)
Webvectors: Web-ify your word2vec: framework to serve distributional semantic models online
Stars: ✭ 154 (-18.09%)
Electra: Chinese pre-trained ELECTRA model: a Chinese model pretrained with adversarial learning
Stars: ✭ 132 (-29.79%)
Macbert: Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP)
Stars: ✭ 167 (-11.17%)
Skip Gram Pytorch: A complete PyTorch implementation of skip-gram
Stars: ✭ 153 (-18.62%)
Kogpt2 Finetuning: 🔥 Korean GPT-2, KoGPT2 FineTuning cased. Training on Korean lyrics data 🔥
Stars: ✭ 124 (-34.04%)
Xlnet Gen: XLNet for generating language.
Stars: ✭ 164 (-12.77%)
Awd Lstm Lm: LSTM and QRNN Language Model Toolkit for PyTorch
Stars: ✭ 1,834 (+875.53%)
Splitter: A PyTorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).
Stars: ✭ 177 (-5.85%)
Textfeatures: 👷♂️ A simple package for extracting useful features from character objects 👷♀️
Stars: ✭ 148 (-21.28%)
Danmf: A sparsity aware implementation of "Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection" (CIKM 2018).
Stars: ✭ 161 (-14.36%)
Fasttext4j: Implementing Facebook's FastText with Java
Stars: ✭ 148 (-21.28%)
Keras Bert: Implementation of BERT that could load official pre-trained models for feature extraction and prediction
Stars: ✭ 2,264 (+1104.26%)
Entity2rec: entity2rec generates item recommendations using property-specific knowledge graph embeddings
Stars: ✭ 159 (-15.43%)
Tupe: Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve existing models like BERT.
Stars: ✭ 143 (-23.94%)
Deep Math Machine Learning.ai: A blog about machine learning and deep learning algorithms and the math behind them, with machine learning algorithms written from scratch.
Stars: ✭ 173 (-7.98%)
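Nlp research: NLP research: a TensorFlow-based NLP deep learning project supporting four tasks: text classification, sentence matching, sequence labeling, and text generation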
Stars: ✭ 141 (-25%)
Gensim: Topic Modelling for Humans
Stars: ✭ 12,763 (+6688.83%)
Ml: sourced.ml is a library and command line tools to build and apply machine learning models on top of Universal Abstract Syntax Trees
Stars: ✭ 136 (-27.66%)
Role2vec: A scalable Gensim implementation of "Learning Role-based Graph Embeddings" (IJCAI 2018).
Stars: ✭ 134 (-28.72%)
Text2vec: text2vec, Chinese text to vector. (A text vectorization tool, including word vectorization, sentence vectorization, and sentence similarity computation)
Stars: ✭ 155 (-17.55%)
Scattertext Pydata: Notebooks for the Seattle PyData 2017 talk on Scattertext
Stars: ✭ 132 (-29.79%)
Log Anomaly Detector: Log Anomaly Detection - Machine learning to detect abnormal event logs
Stars: ✭ 169 (-10.11%)
Chars2vec: Character-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-30.85%)
Transformer Lm: Transformer language model (GPT-2) with sentencepiece tokenizer
Stars: ✭ 154 (-18.09%)
Ml Projects: ML based projects such as Spam Classification, Time Series Analysis, Text Classification using Random Forest, Deep Learning, Bayesian, Xgboost in Python
Stars: ✭ 127 (-32.45%)
Optimus: the first large-scale pre-trained VAE language model
Stars: ✭ 180 (-4.26%)
Speecht: Open-source speech-to-text software written in TensorFlow
Stars: ✭ 152 (-19.15%)
Scattertext: Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+815.96%)
Gpt Neo: An implementation of model parallel GPT2 & GPT3-like models, with the ability to scale up to full GPT3 sizes (and possibly more!), using the mesh-tensorflow library.
Stars: ✭ 1,252 (+565.96%)
Graphwavemachine: A scalable implementation of "Learning Structural Node Embeddings Via Diffusion Wavelets (KDD 2018)".
Stars: ✭ 151 (-19.68%)
Robbert: A Dutch RoBERTa-based language model
Stars: ✭ 120 (-36.17%)
Haystack: 🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+1713.3%)
Embedding As Service: One-stop solution to encode sentences to fixed-length vectors using various embedding techniques
Stars: ✭ 151 (-19.68%)
Bert As Language Model: BERT as a language model, forked from https://github.com/google-research/bert
Stars: ✭ 185 (-1.6%)