Top 152 text-mining open source projects

Aravec

AraVec is a pre-trained distributed word representation (word embedding) open source project that aims to provide the Arabic NLP research community with free-to-use, powerful word embedding models.

✭ 239

jupyter-notebook nlp word2vec text-mining gensim arabic

Gwu data mining

Materials for GWU DNSC 6279 and DNSC 6290.

✭ 217

python r jupyter-notebook machine-learning data-science image-processing data-visualization data-mining text-mining image-recognition h2o

Cnn Text Classification Keras

Text Classification by Convolutional Neural Network in Keras

✭ 213

python deep-learning tensorflow nlp keras cnn text-classification sentiment-analysis text-mining theano

Qminer

Analytic platform for real-time large-scale streams containing structured and unstructured data.

✭ 206

javascript cpp machine-learning data-mining text-mining signal-processing

Shallowlearn

An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with Scikit-learn.

✭ 196

python machine-learning neural-network scikit-learn text-classification word2vec text-mining word-embeddings supervised-learning fasttext gensim online-learning

Fake news detection

Fake News Detection in Python

✭ 194

python classification text-classification text-mining logistic-regression text-analysis

Pyss3

A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI

✭ 191

python machine-learning nlp natural-language-processing artificial-intelligence text-classification data-mining machine-learning-algorithms text-mining interpretability

Hdltex

HDLTex: Hierarchical Deep Learning for Text Classification

✭ 191

python deep-learning tensorflow deep-neural-networks convolutional-neural-networks dataset gpu text-classification recurrent-neural-networks text-mining information-retrieval

Breadability

Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now a living alternative)

✭ 186

python html text-mining text-extraction

Texthero

Text preprocessing, representation and visualization from zero to hero.

✭ 2,407

python javascript CSS machine-learning nlp text-mining word-embeddings text-clustering text-visualization text-representation text-preprocessing nlp-pipeline texthero

Nlp profiler

A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.

✭ 181

jupyter-notebook hacktoberfest nlp natural-language-processing jupyter profiler nlp-machine-learning text-mining profiling nlp-library

Multi rake

Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python

✭ 162

python nlp text-mining

Tokenizers

Fast, Consistent Tokenization of Natural Language Text

✭ 161

r nlp rstats r-package text-mining tokenizer

Lazynlp

Library to scrape and clean web pages to create massive datasets.

✭ 1,985

python nlp data-science natural-language-processing artificial-intelligence language-model text-mining open

Udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

✭ 160

r nlp natural-language-processing r-package text-mining tokenizer pos-tagging

Awesome Text Classification

Awesome-Text-Classification: projects, papers, and tutorials.

✭ 158

tensorflow awesome nlp classification text-classification text-mining text-analysis nltk

Awesome Nlp

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

✭ 12,626

deep-learning machine-learning awesome awesome-list nlp natural-language-processing text-mining

Chemdataextractor

Automatically extract chemical information from scientific documents

✭ 152

python nlp natural-language-processing text-mining chemistry information-extraction

Textfeatures

👷‍♂️ A simple package for extracting useful features from character objects 👷‍♀️

✭ 148

r machine-learning neural-network neural-networks rstats word2vec text-mining feature-extraction

Xioc

Extract indicators of compromise from text, including "escaped" ones.

✭ 148

go command-line command-line-tool data-mining regex ioc text-mining extract text-processing regexp extraction

Qdap

Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis

✭ 146

r text-mining text-analysis

Hands On Natural Language Processing With Python

This repository is for my Udemy students. It contains all the lecture code along with the files mentioned for reading. Feel free to clone it, and if you have any problems, just raise a question.

✭ 146

python natural-language-processing nlp-machine-learning text-mining

Kate

Code & data accompanying the KDD 2017 paper "KATE: K-Competitive Autoencoder for Text"

✭ 135

python deep-learning autoencoder text-mining representation-learning topic-modeling

Datasciencer

A curated list of R tutorials for Data Science, NLP and Machine Learning

✭ 1,727

r Rebol data-science text-mining datascience

Khcoder

KH Coder: for Quantitative Content Analysis or Text Mining

✭ 126

perl visualization text-mining corpus

Awesome Hungarian Nlp

A curated list of NLP resources for Hungarian

✭ 121

awesome awesome-list nlp natural-language-processing parser dataset named-entity-recognition text-mining information-retrieval natural-language-understanding nlu corpus information-extraction

Keywords2vec

✭ 121

jupyter-notebook nlp text-mining multi-language

Scattertext

Beautiful visualizations of how language differs among document types.

Cogcomp Nlpy

CogComp's light-weight Python NLP annotators

✭ 115

python nlp natural-language-processing data-mining text-mining text-processing

Textcluster

Short-text clustering preprocessing module

✭ 115

python nlp cluster text-mining text-processing

Genius

Easily access song lyrics from Genius in a tibble.

✭ 111

r text-mining music-information-retrieval

Learning Social Media Analytics With R

This repository contains code and bonus content which will be added from time to time for the book "Learning Social Media Analytics with R" by Packt

✭ 102

r github twitter analytics facebook sentiment-analysis news ggplot2 text-mining social-media topic-modeling stackoverflow

Text predictor

Char-level RNN LSTM text generator📄.

✭ 99

python deep-learning machine-learning artificial-intelligence ai lstm rnn text-mining lstm-neural-networks

Lda Topic Modeling

A PureScript, browser-based implementation of LDA topic modeling.

✭ 91

purescript machine-learning nlp data-science natural-language-processing functional-programming reactive clustering machine-learning-algorithms reactive-programming nlp-machine-learning text-mining bulma bayesian topic-modeling lda

Lexicon

A data package containing lexicons and dictionaries for text analysis

✭ 87

r hash text-mining

R Text Data

List of textual data sources to be used for text mining in R

✭ 85

nlp data-science rstats text-mining text-analysis

Orange3 Text

🍊 📄 Text Mining add-on for Orange3

✭ 83

python hacktoberfest twitter text sentiment-analysis text-mining text-analysis nltk

Python nlp tutorial

This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)

✭ 72

python jupyter-notebook nlp natural-language-processing research text-mining spacy nltk

Pyphonetics

A Python 3 phonetics library.

✭ 61

python nlp text-mining

How To Mine Newsfeed Data And Extract Interactive Insights In Python

A practical guide to topic mining and interactive visualizations

✭ 61

html nlp natural-language-processing nlp-machine-learning text-mining sklearn crontab topic-modeling gensim tf-idf kmeans plots

Applied Text Mining In Python

Repo for Applied Text Mining in Python (coursera) by University of Michigan

✭ 59

python jupyter-notebook nlp classification pandas text-classification regex text-mining text-processing

Konlpy

Python package for Korean natural language processing.

✭ 1,098

python hacktoberfest nlp text-mining korean

Pipeit

PipeIt is a text transformation, conversion, cleansing and extraction tool.

✭ 57

go text-mining text-processing

Ngram

Fast n-Gram Tokenization

✭ 55

c r text text-mining

Spark Nkp

Natural Korean Processor for Apache Spark

✭ 50

scala nlp natural-language-processing spark apache-spark text-mining

Tadw

An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).

✭ 43

python machine-learning awesome data-science data-mining unsupervised-learning word2vec text-mining matrix-factorization gensim

Friend.ly

A social media platform with a friend recommendation engine based on personality trait extraction

✭ 41

javascript deep-learning nodejs api nlp webrtc oauth2 mongoose social-network social text-to-speech text-mining ejs social-login passportjs nodemailer

Gsoc2018 3gm

💫 Automated codification of Greek Legislation with NLP

✭ 36

python python3 nlp automation natural-language-processing text-mining natural-language-understanding

Tidytext

Text mining using tidy tools ✨📄✨

✭ 975

r natural-language-processing text-mining tidyverse

Metasra Pipeline

MetaSRA: normalized sample-specific metadata for the Sequence Read Archive

✭ 33

python natural-language-processing bioinformatics data-mining text-mining annotation-processor biology

Uc Davis Cs Exams Analysis

📈 Regression and Classification with UC Davis student quiz data and exam data

✭ 33

r machine-learning nlp testing statistics regex unsupervised-learning training text-mining web-scraping logistic-regression linear-regression probability statistical-analysis

Tidy Text Mining

Manuscript of the book "Tidy Text Mining with R" by Julia Silge and David Robinson

✭ 961

r tex book text-mining tidyverse bookdown

Nlppln

NLP pipeline software using common workflow language

✭ 31

python nlp workflow pipeline text-mining

Spider

A configurable web spider with an easy-to-use web console

✭ 954

java spider text-mining

Text Mining

Text Mining in Python

✭ 18

python jupyter-notebook text-classification text-mining text-processing

Bagofconcepts

Python implementation of bag-of-concepts

✭ 18

python machine-learning clustering unsupervised-learning word2vec text-mining representation-learning

Autophrase

AutoPhrase: Automated Phrase Mining from Massive Text Corpora

✭ 835

text-mining automatic multi-language

Rake Nltk

Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.

✭ 793

python algorithm text-mining nltk

Nlp In Practice

Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.

✭ 790

jupyter-notebook machine-learning nlp natural-language-processing text-classification word2vec text-mining gensim tf-idf

Text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

✭ 715

r natural-language-processing word2vec text-mining word-embeddings topic-modeling glove vectorization
