kwx - BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-44.07%)
Text mining resources - Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+506.78%)
Artificial Adversary - 🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+489.83%)
support-tickets-classification - This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud to automatically classify support tickets. The project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+140.68%)
Graphbrain - Language, Knowledge, Cognition
Stars: ✭ 294 (+398.31%)
Meta - A Modern C++ Data Sciences Toolkit
Stars: ✭ 600 (+916.95%)
Whatlang Rs - Natural language detection library for Rust. Try demo online: https://www.greyblake.com/whatlang/
Stars: ✭ 400 (+577.97%)
Textvec - Text vectorization tool to outperform TFIDF for classification tasks
Stars: ✭ 167 (+183.05%)
Shallowlearn - An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with Scikit-learn.
Stars: ✭ 196 (+232.2%)
Nlp In Practice - Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Stars: ✭ 790 (+1238.98%)
Orange3 Text - 🍊 📄 Text Mining add-on for Orange3
Stars: ✭ 83 (+40.68%)
Qdap - Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis
Stars: ✭ 146 (+147.46%)
Hdltex - HDLTex: Hierarchical Deep Learning for Text Classification
Stars: ✭ 191 (+223.73%)
Text-Classification-LSTMs-PyTorch - The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, a Tweets dataset provided by Kaggle is used.
Stars: ✭ 45 (-23.73%)
Rmdl - RMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (+535.59%)
Open Semantic Search - Open Source research tool to search, browse, analyze and explore large document collections with a Semantic Search Engine and an Open Source Text Mining & Text Analytics platform (integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Stars: ✭ 386 (+554.24%)
R Text Data - List of textual data sources to be used for text mining in R
Stars: ✭ 85 (+44.07%)
text-analysis - Weaving analytical stories from text data
Stars: ✭ 12 (-79.66%)
clustext - Easy, fast clustering of texts
Stars: ✭ 18 (-69.49%)
nlpbuddy - A text analysis application for performing common NLP tasks through a web dashboard interface and an API
Stars: ✭ 115 (+94.92%)
Pyss3 - A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Stars: ✭ 191 (+223.73%)
woolly - The Text Mining Elixir
Stars: ✭ 48 (-18.64%)
TRUNAJOD2.0 - An easy-to-use library to extract indices from texts.
Stars: ✭ 18 (-69.49%)
SparseLSH - A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
Stars: ✭ 127 (+115.25%)
blueprints-text - Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Stars: ✭ 103 (+74.58%)
sacred - 📖 Sacred texts in R
Stars: ✭ 19 (-67.8%)
extremeText - Library for fast text representation and extreme classification.
Stars: ✭ 141 (+138.98%)
aera-workshop - This workshop introduces participants to Learning Analytics (LA) and provides a brief overview of LA methodologies, literature, applications, and ethical issues as they relate to STEM education.
Stars: ✭ 14 (-76.27%)
occupationcoder - Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.
Stars: ✭ 30 (-49.15%)
Naive-Resume-Matching - Text similarity applied to resumes: compares resumes with job descriptions and creates a score to rank them. Similar to an ATS.
Stars: ✭ 27 (-54.24%)
Guten-gutter - Strips boilerplate from Project Gutenberg text files
Stars: ✭ 16 (-72.88%)
rita - Website, documentation and examples for RiTa
Stars: ✭ 42 (-28.81%)
Giveme5W - Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-72.88%)
monkeylearn-java - Official Java client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Java apps.
Stars: ✭ 23 (-61.02%)
MetaLifelongLanguage - Repository containing code for the paper "Meta-Learning with Sparse Experience Replay for Lifelong Language Learning".
Stars: ✭ 21 (-64.41%)
YelpDatasetSQL - Working with the Yelp Dataset in Azure SQL and SQL Server
Stars: ✭ 16 (-72.88%)
Product-Categorization-NLP - Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-49.15%)
restaurant-finder-featureReviews - Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying the market-basket model, i.e. the Apriori algorithm, and NLP to user reviews).
Stars: ✭ 21 (-64.41%)
named-entity-recognition - Notebooks for teaching Named Entity Recognition at the Cultural Heritage Data School, run by Cambridge Digital Humanities
Stars: ✭ 18 (-69.49%)
WSDM-Cup-2019 - [ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (+5.08%)
TextDatasetCleaner - 🔬 Cleaning datasets of junk (normalization, preprocessing)
Stars: ✭ 27 (-54.24%)
textdigester - TextDigester: document summarization Java library
Stars: ✭ 23 (-61.02%)
MetaCat - Minimally Supervised Categorization of Text with Metadata (SIGIR'20)
Stars: ✭ 52 (-11.86%)
text-classification-svm - The missing SVM-based text classification module implementing HanLP's interface
Stars: ✭ 46 (-22.03%)
synaptic-simple-trainer - A ready-to-go text classification trainer based on synaptic (https://github.com/cazala/synaptic)
Stars: ✭ 19 (-67.8%)