Top 152 text-mining open source projects

Listed Company News Crawl And Text Analysis
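Crawls historical news text about listed companies (individual stocks) from Sina Finance, National Business Daily, JRJ, China Securities Network, and Securities Times; runs text analysis and extracts feature sets; trains classifiers such as SVM and random forest; and finally classifies newly crawled news data.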
Ldavis
R package for web-based interactive topic model visualization.
Awesome Sentiment Analysis
Repository with everything necessary for sentiment analysis and related areas
Open Semantic Search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Rplos
R client for the PLoS Journals API
Textmining
Python text mining system (Research of Text Mining System)
Nlpython
This repository contains code for Natural Language Processing using the Python scripting language. All of the code relates to my book, "Python Natural Language Processing"
tg crawler
Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export instead.
snorkeling
Extracting biomedical relationships from literature with Snorkel 🏊
TwEater
A Python Bot for Scraping Conversations from Twitter
eventextraction
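Chinese compound event extraction. Recognizes patterns in text, including conditional events, sequential events, and reversal events, and can be used to analyze the logical structure of a text.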
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
elpresidente
🇺🇸 Search and Extract Corpus Elements from 'The American Presidency Project'
DaDengAndHisPython
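WeChat official account: 大邓和他的python (Da Deng and His Python). Quick introduction to Python syntax: https://www.bilibili.com/video/av44384851 Quick introduction to Python web crawlers: https://www.bilibili.com/video/av72010301 Contact email: [email protected]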
aera-workshop
This workshop introduces participants to Learning Analytics (LA) and provides a brief overview of LA methodologies, literature, applications, and ethical issues as they relate to STEM education.
named-entity-recognition
Notebooks for teaching Named Entity Recognition at the Cultural Heritage Data School, run by Cambridge Digital Humanities
textstem
Tools for fast text stemming & lemmatization
blueprints-text
Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
advanced-text-mining
Covers natural language processing and text analysis methodologies using the TEANAPS library.
textdigester
TextDigester: a Java library for document summarization
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
sacred
📖 Sacred texts in R
Guten-gutter
Strips boilerplate from Project Gutenberg text files
restaurant-finder-featureReviews
A Flask web application that helps users retrieve key restaurant information and feature-based reviews (generated by applying a market-basket model, using the Apriori algorithm, and NLP to user reviews).
civicmine
Text mining cancer biomarkers for the CIVIC database
lda2vec
lda2vec, from the paper "Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec": https://arxiv.org/abs/1605.02019
Adjutant
Runs a PubMed query, returns the results, and lets the user explore the high-level structure of the returned documents
R.TeMiS
R.TeMiS: R Text Mining Solution
thrones2vec
Using Word2Vec to explore semantic similarities between the entities of "A Song of Ice and Fire" ("Game of Thrones").
misinfo
📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)
VERSE
Vancouver Event and Relation System for Extraction
neji
Flexible and powerful platform for biomedical information extraction from text
tf-idf-python
Term frequency–inverse document frequency (TF-IDF) for Chinese novels and documents, implemented in Python.