An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with Scikit-learn.

Stars: ✭ 196 (+308.33%)

Mutual labels: text-mining, word2vec, word-embeddings

Chameleon recsys

Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems

Stars: ✭ 202 (+320.83%)

Mutual labels: word2vec, word-embeddings, lstm

NMFADMM

A sparsity aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).

Stars: ✭ 39 (-18.75%)

Mutual labels: word2vec, lda, word-embedding

Texthero

Text preprocessing, representation and visualization from zero to hero.

Stars: ✭ 2,407 (+4914.58%)

Mutual labels: text-mining, word-embeddings

two-stream-cnn

A two-stream convolutional neural network for learning arbitrary similarity functions over two sets of training data

Stars: ✭ 24 (-50%)

Mutual labels: word2vec, word-embeddings

Text-Classification-LSTMs-PyTorch

The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, a Tweets dataset provided by Kaggle is used.

Stars: ✭ 45 (-6.25%)

Mutual labels: text-mining, text-processing

Xioc

Extract indicators of compromise from text, including "escaped" ones.

Stars: ✭ 148 (+208.33%)

Mutual labels: text-mining, text-processing

Textfeatures

👷‍♂️ A simple package for extracting useful features from character objects 👷‍♀️

Stars: ✭ 148 (+208.33%)

Mutual labels: text-mining, word2vec

support-tickets-classification

This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en

Stars: ✭ 142 (+195.83%)

Mutual labels: text-mining, text-processing

Aravec

AraVec is a pre-trained distributed word representation (word embedding) open source project which aims to provide the Arabic NLP research community with free to use and powerful word embedding models.

Stars: ✭ 239 (+397.92%)

Mutual labels: text-mining, word2vec

Sequence-Models-coursera

Sequence Models by Andrew Ng on Coursera. Programming Assignments and Quiz Solutions.

Stars: ✭ 53 (+10.42%)

Mutual labels: lstm, word-embedding

Emotion-recognition-from-tweets

A comprehensive approach to recognizing emotion (sentiment) in a given tweet. Supervised machine learning.

Stars: ✭ 17 (-64.58%)

Mutual labels: word2vec, text-processing

word2vec-on-wikipedia

A pipeline for training word embeddings using word2vec on a Wikipedia corpus.

Stars: ✭ 68 (+41.67%)

Mutual labels: word2vec, word-embeddings

Arabic-Word-Embeddings-Word2vec

Arabic Word Embeddings Word2vec

Stars: ✭ 26 (-45.83%)

Mutual labels: word2vec, word-embeddings

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Stars: ✭ 711 (+1381.25%)

Mutual labels: text-mining, web-scraping

tomoto-ruby

High performance topic modeling for Ruby

Stars: ✭ 49 (+2.08%)

Mutual labels: lda, latent-dirichlet-allocation

NLP-paper

🎨 🎨 NLP (natural language processing) tutorial 🎨🎨 https://dataxujing.github.io/NLP-paper/

Stars: ✭ 23 (-52.08%)

Mutual labels: word2vec, lda

sarcasm-detection-for-sentiment-analysis

Sarcasm Detection for Sentiment Analysis

Stars: ✭ 21 (-56.25%)

Mutual labels: word2vec, lstm

dnn-lstm-word-segment

Chinese Word Segmentation Based on Deep Learning and LSTM Neural Networks

Stars: ✭ 24 (-50%)

Mutual labels: word2vec, lstm

TRUNAJOD2.0

An easy-to-use library to extract indices from texts.

Stars: ✭ 18 (-62.5%)

Mutual labels: text-mining, text-processing

OLX Scraper

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).

Stars: ✭ 15 (-68.75%)

Mutual labels: web-scraping, scraping-websites

Cogcomp Nlpy

CogComp's light-weight Python NLP annotators

Stars: ✭ 115 (+139.58%)

Mutual labels: text-mining, text-processing

Textcluster

Short text clustering and preprocessing module

Stars: ✭ 115 (+139.58%)

Mutual labels: text-mining, text-processing

wikidata-corpus

Train Wikidata with word2vec for word embedding tasks

Stars: ✭ 109 (+127.08%)

Mutual labels: word2vec, word-embeddings

text-analysis

Weaving analytical stories from text data

Stars: ✭ 12 (-75%)

Mutual labels: text-mining, text-processing

Simple-Sentence-Similarity

Exploring the simple sentence similarity measurements using word embeddings

Stars: ✭ 99 (+106.25%)

Mutual labels: word2vec, word-embeddings

perke

A keyphrase extractor for Persian

Stars: ✭ 60 (+25%)

Mutual labels: text-mining, text-processing

Text predictor

Char-level RNN LSTM text generator📄.

Stars: ✭ 99 (+106.25%)

Mutual labels: text-mining, lstm

teanaps

An open-source Python library for natural language processing and text analysis.

Stars: ✭ 91 (+89.58%)

Mutual labels: text-mining, text-processing

estratto

Parsing the content of fixed-width files made easy

Stars: ✭ 12 (-75%)

Mutual labels: text-mining, text-processing

word2vec-pytorch

Extremely simple and fast word2vec implementation with Negative Sampling + Sub-sampling

Stars: ✭ 145 (+202.08%)

Mutual labels: word2vec, skipgram

corpusexplorer2.0

Corpus linguistics has never been so easy...

Stars: ✭ 16 (-66.67%)

Mutual labels: text-mining, text-processing

learningspoons

NLP lecture notes and source code

Stars: ✭ 29 (-39.58%)

Mutual labels: word2vec, lstm

word-benchmarks

Benchmarks for intrinsic word embeddings evaluation.

Stars: ✭ 45 (-6.25%)

Mutual labels: word2vec, word-embeddings

Lda Topic Modeling

A PureScript, browser-based implementation of LDA topic modeling.

Stars: ✭ 91 (+89.58%)

Mutual labels: text-mining, lda

SentimentAnalysis

(BOW, TF-IDF, Word2Vec, BERT) Word Embeddings + (SVM, Naive Bayes, Decision Tree, Random Forest) Base Classifiers + Pre-trained BERT on Tensorflow Hub + 1-D CNN and Bi-Directional LSTM on IMDB Movie Reviews Dataset

Stars: ✭ 40 (-16.67%)

Mutual labels: word2vec, lstm

sentiment-analysis-of-tweets-in-russian

Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.

Stars: ✭ 51 (+6.25%)

Mutual labels: word2vec, word-embeddings

models-by-example

By-hand code for models and algorithms. An update to the 'Miscellaneous-R-Code' repo.

Stars: ✭ 43 (-10.42%)

Mutual labels: expectation-maximization, gradient-descent

datastories-semeval2017-task6

Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".

Stars: ✭ 20 (-58.33%)

Mutual labels: word-embeddings, lstm

extractnet

A Dragnet that also extracts author, headline, date, and keywords from context

Stars: ✭ 52 (+8.33%)

Mutual labels: text-mining, web-scraping

walklets

A lightweight implementation of Walklets from "Don't Walk Skip! Online Learning of Multi-scale Network Embeddings" (ASONAM 2017).

Stars: ✭ 94 (+95.83%)

Mutual labels: word2vec, word-embedding

JoSH

[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

Stars: ✭ 55 (+14.58%)

Mutual labels: text-mining, word-embeddings

word2vec-tsne

Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.

Stars: ✭ 59 (+22.92%)

Mutual labels: word2vec, word-embeddings

restaurant-finder-featureReviews

Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).

Stars: ✭ 21 (-56.25%)

Mutual labels: text-mining, web-scraping

SentimentAnalysis

Sentiment Analysis: Deep Bi-LSTM+attention model

Stars: ✭ 32 (-33.33%)

Mutual labels: word-embeddings, lstm

TextDatasetCleaner

🔬 Cleaning junk out of datasets (normalization, preprocessing)

Stars: ✭ 27 (-43.75%)

Mutual labels: text-mining, text-processing

fsauor2018

Fine-grained sentiment analysis of Chinese reviews based on LSTM networks and a self-attention mechanism