
pesoto / Text-Analysis

Licence: other
Explaining textual analysis tools in Python, including preprocessing, Skip Gram (word2vec), and topic modelling.

Programming Languages

Jupyter Notebook: 11,667 projects
Python: 139,335 projects (#7 most used programming language)

Projects that are alternatives of or similar to Text-Analysis

lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec, from the paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-43.75%)
Mutual labels:  text-mining, word2vec, word-embeddings, lda
text-mining-corona-articles
Text Mining for Indonesian Online News Articles About Corona
Stars: ✭ 15 (-68.75%)
Mutual labels:  text-mining, word2vec, web-scraping
PyLDA
A Latent Dirichlet Allocation implementation in Python.
Stars: ✭ 51 (+6.25%)
Mutual labels:  lda, latent-dirichlet-allocation, gibbs-sampling
kwx
BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-31.25%)
Mutual labels:  text-mining, lda, latent-dirichlet-allocation
Chameleon recsys
Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems
Stars: ✭ 202 (+320.83%)
Mutual labels:  word2vec, word-embeddings, lstm
Text2vec
Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
Stars: ✭ 715 (+1389.58%)
Mutual labels:  text-mining, word2vec, word-embeddings
Scattertext
Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+3487.5%)
Mutual labels:  text-mining, word2vec, word-embeddings
Shallowlearn
An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with scikit-learn.
Stars: ✭ 196 (+308.33%)
Mutual labels:  text-mining, word2vec, word-embeddings
NMFADMM
A sparsity aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).
Stars: ✭ 39 (-18.75%)
Mutual labels:  word2vec, lda, word-embedding
models-by-example
By-hand code for models and algorithms. An update to the 'Miscellaneous-R-Code' repo.
Stars: ✭ 43 (-10.42%)
Mutual labels:  expectation-maximization, gradient-descent
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-60.42%)
Mutual labels:  word2vec, word-embeddings
fsauor2018
Fine-grained sentiment analysis of Chinese reviews using an LSTM network with self-attention.
Stars: ✭ 36 (-25%)
Mutual labels:  word2vec, lstm
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (+6.25%)
Mutual labels:  word2vec, word-embeddings
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+195.83%)
Mutual labels:  text-mining, text-processing
wikidata-corpus
Train Wikidata with word2vec for word embedding tasks
Stars: ✭ 109 (+127.08%)
Mutual labels:  word2vec, word-embeddings
TextDatasetCleaner
🔬 Cleaning datasets of junk text (normalization, preprocessing).
Stars: ✭ 27 (-43.75%)
Mutual labels:  text-mining, text-processing
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-33.33%)
Mutual labels:  word-embeddings, lstm
restaurant-finder-featureReviews
Build a Flask web application that helps users retrieve key restaurant information and feature-based reviews (generated by applying the Apriori market-basket algorithm and NLP to user reviews).
Stars: ✭ 21 (-56.25%)
Mutual labels:  text-mining, web-scraping
SWDM
SIGIR 2017: Embedding-based query expansion for weighted sequential dependence retrieval model
Stars: ✭ 35 (-27.08%)
Mutual labels:  word2vec, word-embeddings
word embedding
Sample code for training Word2Vec and FastText on a wiki corpus, plus their pretrained word embeddings.
Stars: ✭ 21 (-56.25%)
Mutual labels:  word2vec, word-embeddings

Text-Analysis

This is not a module for large-scale use, but rather a set of scripts that explain popular methodologies in text analysis, including web scraping, preprocessing, Skip Gram (word2vec), and topic modelling.

1. Web Scraping

How can I download text data from a website algorithmically using Python? How do I store the data in a CSV file for later use?

Web_Scraping.py: explains how to download movie quotes and store the data neatly in a table using the Pandas Python module.
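
As a rough sketch of that workflow (not the repo's script), the example below uses requests, BeautifulSoup, and pandas; the URL and the CSS selectors for the quote and author elements are hypothetical placeholders.

```python
# Minimal scrape-then-tabulate sketch (not Web_Scraping.py).
# The URL and CSS selectors below are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = "https://example.com/movie-quotes"  # placeholder page

response = requests.get(URL, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

# Assume each quote sits in <div class="quote"> with <span class="text"> and <span class="author">.
rows = []
for block in soup.select("div.quote"):
    text = block.select_one("span.text")
    author = block.select_one("span.author")
    if text and author:
        rows.append({"quote": text.get_text(strip=True),
                     "author": author.get_text(strip=True)})

# Store the scraped rows neatly in a table and persist to CSV for later use.
df = pd.DataFrame(rows)
df.to_csv("quotes.csv", index=False)
print(df.head())
```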

2. Preprocessing

How are documents and words represented in Python? How can I clean text in Python by removing unnecessary words and adjusting for infrequent words?

Text_Preprocessing.py: explains common ways of representing text data in Python (one-hot encoded vectors and TF-IDF weights) and of cleaning it (stopword removal and lowercasing).
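
For a compressed illustration of those steps, the sketch below leans on scikit-learn's vectorizers instead of the from-scratch code in Text_Preprocessing.py; the three sample sentences are made up for the example.

```python
# Preprocessing sketch using scikit-learn (not the repo's from-scratch code).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "The striker scored a late goal in the football match.",
    "The central bank raised interest rates again.",
    "Fans celebrated the football club's league title.",
]

# Lowercasing and English stopword removal happen inside the vectorizers.
# binary=True gives one-hot style document vectors (word present / absent).
onehot = CountVectorizer(lowercase=True, stop_words="english", binary=True)
X_onehot = onehot.fit_transform(docs)
print(onehot.get_feature_names_out())
print(X_onehot.toarray())

# TF-IDF down-weights words that appear in many documents and
# up-weights words that are distinctive for a particular document.
tfidf = TfidfVectorizer(lowercase=True, stop_words="english")
X_tfidf = tfidf.fit_transform(docs)
print(X_tfidf.toarray().round(2))
```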

3. EM-Algorithm

How can I discover the topics of documents, i.e. how much one article is about sports, another about business, and so on?

EM_Algorithm.py: explains how to estimate a distribution using the EM-Algorithm. This is a precursor to the topic modelling example.
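
To make the E-step/M-step loop concrete before reading the script, here is an independent toy example that fits a two-component one-dimensional Gaussian mixture with EM in NumPy; the data and starting values are arbitrary.

```python
# Toy EM for a two-component 1-D Gaussian mixture (independent of EM_Algorithm.py).
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

# Initial guesses for mixing weights, means, and variances.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def gaussian_pdf(x, mean, variance):
    return np.exp(-0.5 * (x - mean) ** 2 / variance) / np.sqrt(2 * np.pi * variance)

for _ in range(50):
    # E-step: responsibility of each component for each data point.
    likelihood = np.stack([pi[k] * gaussian_pdf(data, mu[k], var[k]) for k in range(2)], axis=1)
    resp = likelihood / likelihood.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from the soft assignments.
    nk = resp.sum(axis=0)
    pi = nk / len(data)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", pi.round(2), "means:", mu.round(2), "variances:", var.round(2))
```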

4. Gibbs Sampling

How can I discover the topics of documents, i.e. how much one article is about sports, another about business, and so on?

Gibbs_Sampling.py: explains how Gibbs sampling works in the context of topic modelling.
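
The sketch below is a toy collapsed Gibbs sampler for LDA on a three-document corpus, meant only to show the count-update loop; the corpus, hyperparameters, and number of sweeps are illustrative and unrelated to Gibbs_Sampling.py.

```python
# Toy collapsed Gibbs sampler for LDA (independent of Gibbs_Sampling.py).
import numpy as np

docs = [
    ["football", "goal", "match", "team"],
    ["bank", "market", "stocks", "profit"],
    ["team", "match", "stocks", "bank"],
]
vocab = sorted({w for d in docs for w in d})
word_id = {w: i for i, w in enumerate(vocab)}

K, V, D = 2, len(vocab), len(docs)
alpha, beta = 0.1, 0.01
rng = np.random.default_rng(1)

# Random initial topic assignment for every token, plus the count tables.
z = [[rng.integers(K) for _ in doc] for doc in docs]
doc_topic = np.zeros((D, K))
topic_word = np.zeros((K, V))
topic_total = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        doc_topic[d, k] += 1
        topic_word[k, word_id[w]] += 1
        topic_total[k] += 1

for _ in range(200):  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k, v = z[d][i], word_id[w]
            # Remove the current token's counts before resampling its topic.
            doc_topic[d, k] -= 1
            topic_word[k, v] -= 1
            topic_total[k] -= 1
            # Full conditional: p(topic | all other assignments).
            p = (doc_topic[d] + alpha) * (topic_word[:, v] + beta) / (topic_total + V * beta)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k
            doc_topic[d, k] += 1
            topic_word[k, v] += 1
            topic_total[k] += 1

# Estimated document-topic proportions, e.g. "how much is doc 0 about topic 0?"
print((doc_topic + alpha) / (doc_topic + alpha).sum(axis=1, keepdims=True))
```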

5. Skip Gram

How can I find which words in my documents are related to each other syntactically and semantically? How does a basic neural network work?

Skip_Gram.py: explains how the Skip Gram model from Mikolov et al. works (with gradient descent and no negative sampling).
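
As a reference point, here is a compact NumPy sketch of skip-gram trained with a full softmax and plain gradient descent (no negative sampling); the tiny corpus, embedding size, and learning rate are toy values, not those used in Skip_Gram.py.

```python
# Compact skip-gram with full softmax and gradient descent (toy example).
import numpy as np

corpus = "the dog chased the cat the cat chased the mouse".split()
vocab = sorted(set(corpus))
word_id = {w: i for i, w in enumerate(vocab)}
V, dim, window, lr = len(vocab), 10, 2, 0.05

# (center, context) training pairs from a sliding window.
pairs = []
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pairs.append((word_id[w], word_id[corpus[j]]))

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, dim))   # input (center word) embeddings
W_out = rng.normal(scale=0.1, size=(dim, V))  # output (context word) weights

for epoch in range(200):
    for center, context in pairs:
        h = W_in[center]                 # hidden layer = center word embedding
        scores = h @ W_out               # unnormalized scores over the vocabulary
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()             # softmax over all words (no negative sampling)
        grad = probs.copy()
        grad[context] -= 1.0             # d(loss)/d(scores) for cross-entropy
        grad_h = W_out @ grad            # gradient w.r.t. the center embedding
        grad_out = np.outer(h, grad)     # gradient w.r.t. the output weights
        # Gradient descent updates for both weight matrices.
        W_in[center] -= lr * grad_h
        W_out -= lr * grad_out

print("embedding for 'cat':", W_in[word_id["cat"]].round(2))
```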

6. Long Short Term Memory

How can I develop a language model with memory? How does backpropagation (gradient descent) through time work?

LSTM_Tutorial.py: explains the backpropagation of an LSTM model. Extends the code from Nicolas Jimenez to train a language model with memory.
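
To show the quantities that backpropagation through time differentiates, here is a minimal NumPy forward pass through a single LSTM cell with toy dimensions; it is independent of both LSTM_Tutorial.py and Nicolas Jimenez's code.

```python
# Forward pass of one LSTM cell over a short sequence (toy dimensions).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
input_dim, hidden_dim, steps = 4, 3, 5

# One weight matrix per gate, acting on the concatenated [h_prev, x_t].
Wf, Wi, Wc, Wo = (rng.normal(scale=0.1, size=(hidden_dim, hidden_dim + input_dim)) for _ in range(4))
bf = bi = bc = bo = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)  # hidden state
c = np.zeros(hidden_dim)  # cell state ("memory")

xs = rng.normal(size=(steps, input_dim))
for x in xs:
    z = np.concatenate([h, x])
    f = sigmoid(Wf @ z + bf)          # forget gate: what to erase from memory
    i = sigmoid(Wi @ z + bi)          # input gate: what to write to memory
    c_tilde = np.tanh(Wc @ z + bc)    # candidate memory content
    o = sigmoid(Wo @ z + bo)          # output gate: what to expose as h
    c = f * c + i * c_tilde           # update the cell state
    h = o * np.tanh(c)                # new hidden state
    print(h.round(3))
```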

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].