fuzzymax - Code for the paper: Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors, ICLR 2019.
Stars: ✭ 43 (-55.21%)
Wordgcn - ACL 2019: Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks
Stars: ✭ 230 (+139.58%)
Texthero - Text preprocessing, representation and visualization from zero to hero.
Stars: ✭ 2,407 (+2407.29%)
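Texthero is built around pandas Series pipelines; a minimal usage sketch, assuming its `clean`, `tfidf`, and `pca` helpers (toy data and column names invented for illustration):

```python
import pandas as pd
import texthero as hero

# Toy corpus; the column names are made up for this example.
df = pd.DataFrame({"text": ["Go for the max!", "Don't settle for average."]})

df["clean"] = hero.clean(df["text"])   # lowercase, drop punctuation/stopwords, etc.
df["tfidf"] = hero.tfidf(df["clean"])  # TF-IDF representation per document
df["pca"] = hero.pca(df["tfidf"])      # 2-D projection, handy for plotting
print(df.head())
```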
simple elmo - Simple library to work with pre-trained ELMo models in TensorFlow
Stars: ✭ 49 (-48.96%)
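A minimal sketch of working with a wrapper like this, assuming the `simple_elmo` package name, an already downloaded ELMo model directory, and its `ElmoModel.load` / `get_elmo_vectors` helpers (treat the exact names as assumptions):

```python
from simple_elmo import ElmoModel  # package and class names assumed

model = ElmoModel()
model.load("path/to/elmo_model")  # directory holding the pre-trained weights and options

sentences = [["go", "for", "the", "max"], ["word", "vectors"]]
vectors = model.get_elmo_vectors(sentences)  # contextual token vectors per sentence
print(vectors.shape)
```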
dasem - Danish semantic analysis
Stars: ✭ 17 (-82.29%)
Jfasttext - Java interface for fastText
Stars: ✭ 193 (+101.04%)
compress-fasttext - Tools for shrinking fastText models (in gensim format)
Stars: ✭ 124 (+29.17%)
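A rough sketch of the intended workflow: shrink a full gensim-format fastText model, then reload the compact version. The `prune_ft_freq` and `CompressedFastTextKeyedVectors` names follow the project's README and should be treated as assumptions:

```python
import gensim
import compress_fasttext  # API names assumed from the project's README

# Shrink an existing gensim-format fastText model (pruning + product quantization).
big = gensim.models.fasttext.FastTextKeyedVectors.load("fasttext_model")
small = compress_fasttext.prune_ft_freq(big, pq=True)
small.save("fasttext_model_small")

# Later: load the compressed vectors and query them like ordinary keyed vectors.
small = compress_fasttext.models.CompressedFastTextKeyedVectors.load("fasttext_model_small")
print(small.most_similar("embedding", topn=3))
```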
sister - SImple SenTence EmbeddeR
Stars: ✭ 66 (-31.25%)
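A minimal sketch of the advertised usage, assuming the `sister` package and its `MeanEmbedding` class (which averages fastText word vectors into a sentence vector):

```python
import sister  # usage assumed from the project's README

embedder = sister.MeanEmbedding(lang="en")             # mean of fastText word vectors
vector = embedder("Go for the max, not the average.")  # one fixed-size sentence vector
print(vector.shape)
```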
Gensim - Topic Modelling for Humans
Stars: ✭ 12,763 (+13194.79%)
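Gensim also provides the word2vec training code that several projects in this list build on; a minimal example on a toy corpus (parameter names follow the gensim 4.x API):

```python
from gensim.models import Word2Vec

# Tiny toy corpus: one tokenized sentence per list.
sentences = [
    ["fuzzy", "sets", "and", "max", "pooled", "word", "vectors"],
    ["word", "embeddings", "for", "information", "retrieval"],
]
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)
print(model.wv.most_similar("word", topn=3))
```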
HiCE - Code for ACL'19 "Few-Shot Representation Learning for Out-Of-Vocabulary Words"
Stars: ✭ 56 (-41.67%)
word-benchmarks - Benchmarks for intrinsic evaluation of word embeddings.
Stars: ✭ 45 (-53.12%)
Spanish Word Embeddings - Spanish word embeddings computed with different methods and from different corpora
Stars: ✭ 236 (+145.83%)
Active-Explainable-Classification - A set of tools for leveraging pre-trained embeddings, active learning and model explainability for efficient document classification
Stars: ✭ 28 (-70.83%)
Chameleon recsys - Source code of CHAMELEON: A Deep Learning Meta-Architecture for News Recommender Systems
Stars: ✭ 202 (+110.42%)
S-WMD - Code for Supervised Word Mover's Distance (SWMD)
Stars: ✭ 90 (-6.25%)
Vec4ir - Word Embeddings for Information Retrieval
Stars: ✭ 188 (+95.83%)
datastories-semeval2017-task6 - Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-79.17%)
Sifrank zh - Chinese keyphrase extraction based on pre-trained models (Chinese-language implementation of the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")
Stars: ✭ 175 (+82.29%)
keywordsextract - Command line tool to extract keywords from any web page.
Stars: ✭ 50 (-47.92%)
pair2vec - Compositional Word-Pair Embeddings for Cross-Sentence Inference
Stars: ✭ 62 (-35.42%)
Word2VecfJava - Java implementation of Dependency-Based Word Embeddings and extensions
Stars: ✭ 14 (-85.42%)
Hash Embeddings - PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.
Stars: ✭ 126 (+31.25%)
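Not the repository's code, just a minimal PyTorch illustration of the hash-embedding idea from the paper: each token id is mapped by several fixed hash functions into a small shared table of component vectors, and the components are mixed by learned per-token importance weights.

```python
import torch
import torch.nn as nn

class HashEmbedding(nn.Module):
    """Toy hash embedding: K hashed component vectors mixed by learned weights."""

    def __init__(self, vocab_size, num_buckets, dim, num_hashes=2):
        super().__init__()
        self.components = nn.Embedding(num_buckets, dim)         # small shared bucket table
        self.importance = nn.Embedding(vocab_size, num_hashes)   # per-token mixing weights
        # Fixed random "hash functions": token id -> bucket index, one column per hash.
        self.register_buffer(
            "hashes", torch.randint(0, num_buckets, (vocab_size, num_hashes))
        )

    def forward(self, token_ids):                  # token_ids: (batch, seq)
        buckets = self.hashes[token_ids]           # (batch, seq, K)
        vecs = self.components(buckets)            # (batch, seq, K, dim)
        weights = self.importance(token_ids)       # (batch, seq, K)
        return (weights.unsqueeze(-1) * vecs).sum(dim=2)  # (batch, seq, dim)

emb = HashEmbedding(vocab_size=50_000, num_buckets=1_000, dim=64)
print(emb(torch.tensor([[1, 42, 31337]])).shape)   # torch.Size([1, 3, 64])
```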
Dna2vec - Consistent vector representations of variable-length k-mers
Stars: ✭ 117 (+21.88%)
PromptPapers - Must-read papers on prompt-based tuning for pre-trained language models.
Stars: ✭ 2,317 (+2313.54%)
Flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)
Stars: ✭ 11,065 (+11426.04%)
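A minimal sketch of embedding tokens with Flair's classic static word embeddings, following the `WordEmbeddings` / `Sentence` API from Flair's tutorials (the GloVe model file is downloaded on first use):

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings

embedding = WordEmbeddings("glove")               # classic static GloVe vectors
sentence = Sentence("word embeddings are useful")
embedding.embed(sentence)                         # attaches a vector to each token in place

for token in sentence:
    print(token.text, token.embedding.shape)
```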
newt - A web application to visualize and edit pathway models
Stars: ✭ 46 (-52.08%)
Pytorch Sentiment Analysis - Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+3242.71%)
contextualLSTM - Contextual LSTM for NLP tasks such as word prediction and word-embedding creation for deep learning
Stars: ✭ 28 (-70.83%)
Koan - A word2vec negative-sampling implementation with a correct CBOW update.
Stars: ✭ 232 (+141.67%)
OpenPrompt - An Open-Source Framework for Prompt-Learning.
Stars: ✭ 1,769 (+1742.71%)
Question Generation - Generating multiple-choice questions from text using machine learning.
Stars: ✭ 227 (+136.46%)
word2vec-on-wikipedia - A pipeline for training word embeddings with word2vec on a Wikipedia corpus.
Stars: ✭ 68 (-29.17%)
Shallowlearn - An experiment in re-implementing supervised learning models based on shallow neural-network approaches (e.g. fastText) with some additional exclusive features and a nice API. Written in Python and fully compatible with scikit-learn.
Stars: ✭ 196 (+104.17%)
ake-datasets - Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
Stars: ✭ 125 (+30.21%)
Germanwordembeddings - Toolkit to obtain and preprocess German corpora, train models using word2vec (gensim), and evaluate them with generated test sets
Stars: ✭ 189 (+96.88%)
PersianNER - Named-Entity Recognition in the Persian language
Stars: ✭ 48 (-50%)
Datastories Semeval2017 Task4 - Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Stars: ✭ 184 (+91.67%)
Debiaswe - Remove problematic gender bias from word embeddings.
Stars: ✭ 175 (+82.29%)
wefe - WEFE: The Word Embeddings Fairness Evaluation Framework, which standardizes bias measurement and mitigation in word embedding models. Feel welcome to open an issue if you have any questions, or a pull request if you want to contribute to the project!
Stars: ✭ 164 (+70.83%)
Lftm - Improving the LDA and DMM topic models (a one-topic-per-document model for short texts) with word embeddings (TACL 2015)
Stars: ✭ 168 (+75%)
Mimick - Code for Mimicking Word Embeddings using Subword RNNs (EMNLP 2017)
Stars: ✭ 152 (+58.33%)
Elmo Tutorial - A short tutorial on ELMo training (using pre-trained models, training on new data, incremental training)
Stars: ✭ 145 (+51.04%)
perke - A keyphrase extractor for Persian
Stars: ✭ 60 (-37.5%)
Scattertext - Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+1693.75%)
NLP-paper - 🎨 NLP (natural language processing) tutorials 🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-76.04%)
two-stream-cnn - A two-stream convolutional neural network for learning arbitrary similarity functions over two sets of training data
Stars: ✭ 24 (-75%)
position-rank - PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents
Stars: ✭ 89 (-7.29%)
JoSH - [KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (-42.71%)
P-tuning - A novel method to tune language models. Code and datasets for the paper "GPT Understands, Too".
Stars: ✭ 593 (+517.71%)