Top 152 text-mining open source projects

textlearnR
A simple collection of well-performing NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
textreadr
Tools to uniformly read in text data including semi-structured transcripts
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
odinson
Odinson is a powerful and highly optimized open-source framework for rule-based information extraction. Odinson couples a simple, yet powerful pattern language that can operate over multiple representations of text, with a runtime system that operates in near real time.
Search
Blue Brain text mining toolbox for semantic search and structured information extraction
malay-dataset
Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html
PubMed-Best-Match
Machine-learning-based pipeline relying on LambdaMART, currently used in PubMed for relevance (Best Match) searches
estratto
Parsing the content of fixed-width files made easy
woolly
The Text Mining Elixir
sentometrics
An integrated framework in R for textual sentiment time series aggregation and prediction
crminer
⛔ ARCHIVED ⛔ Fetch 'Scholarly' Full Text from 'Crossref'
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, it uses a Tweets dataset provided by Kaggle.
palladian
Palladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.
Answerable
Recommendation system for Stack Overflow unanswered questions
readability
Fast readability scores for text data
koshort
(deprecated) 🐱 koshort is a Python package for Korean internet spoken language crawling and processing... or maybe Korean domestic cat.