Shifterator - Interpretable data visualizations for understanding how texts differ at the word level
Woke - ✊ Detect non-inclusive language in your source code.
Textvec - Text vectorization tool to outperform TFIDF for classification tasks
Textclean - Tools for cleaning and normalizing text data
Applied Ml - Code and Resources for "Applied Machine Learning"
Qdap - Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis
Stanza Old - Stanford NLP group's shared Python tools.
Smltar - Manuscript of the book "Supervised Machine Learning for Text Analysis in R" by Emil Hvitfeldt and Julia Silge
Ml Dl Scripts - The repository provides useful Python scripts for ML and data analysis
R Text Data - List of textual data sources to be used for text mining in R
Ore - An R interface to the Onigmo regular expression library
Biomedicus - Code for the old version of BioMedICUS, for the new version see the biomedicus3 repository.
Doctopics - Various examples of topic modeling and other text analysis
Rezonator - Rezonator: Dynamics of human engagement
Articleparse - Heuristic text extraction from news sites in Python3
Homer - Homer, a text analyser in Python, can help make your text more clear, simple and useful for your readers.
Meta - A Modern C++ Data Sciences Toolkit
Php Text Analysis - PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language
Whatlang Rs - Natural language detection library for Rust. Try demo online: https://www.greyblake.com/whatlang/
Jekyll - Jekyll-based static site for The Programming Historian
Open Semantic Search - Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Python Course - Tutorial and introduction into programming with Python for the humanities and social sciences
Artificial Adversary - 🗣️ Tool to generate adversarial text examples and test machine learning models against them
Giveme5w1h - Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Textpipe - Textpipe: clean and extract metadata from text
kwx - BERT, LDA, and TFIDF based keyword extraction in Python
support-tickets-classification - This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
LSX - A word embeddings-based semi-supervised model for document scaling
Giveme5W - Extraction of the five journalistic W-questions (5W) from news articles
YelpDatasetSQL - Working with the Yelp Dataset in Azure SQL and SQL Server
occupationcoder - Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.
rita - Website, documentation and examples for RiTa
HurdleDMR.jl - Hurdle Distributed Multinomial Regression (HDMR) implemented in Julia
learning-stm - Learning structural topic modeling using the stm R package.
aylien textapi go - AYLIEN's officially supported Go client library for accessing Text API
TRUNAJOD2.0 - An easy-to-use library to extract indices from texts.
nlpbuddy - A text analysis application for performing common NLP tasks through a web dashboard interface and an API
rectr - 💒 Reproducible Extraction of Cross-lingual Topics using R
big-data-upf - RECSM-UPF Summer School: Social Media and Big Data Research