Kate
Code & data accompanying the KDD 2017 paper "KATE: K-Competitive Autoencoder for Text"
Stars: ✭ 135 (-71.03%)
Learning Social Media Analytics With R
This repository contains code and bonus content which will be added from time to time for the book "Learning Social Media Analytics with R" by Packt
Stars: ✭ 102 (-78.11%)
Bigartm
Fast topic modeling platform
Stars: ✭ 563 (+20.82%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (-80.47%)
kwx
BERT, LDA, and TFIDF based keyword extraction in Python
Stars: ✭ 33 (-92.92%)
Scattertext
Beautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+269.53%)
Text2vec
Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
Stars: ✭ 715 (+53.43%)
JoSH
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
Stars: ✭ 55 (-88.2%)
converse
Conversational text analysis using various NLP techniques
Stars: ✭ 147 (-68.45%)
text-analysis
Weaving analytical stories from text data
Stars: ✭ 12 (-97.42%)
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-80.47%)
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec, from this paper: https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-94.21%)
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (-23.18%)
ruimtehol
R package to Embed All the Things! using StarSpace
Stars: ✭ 95 (-79.61%)
Lda
LDA topic modeling for node.js
Stars: ✭ 262 (-43.78%)
aera-workshop
This workshop introduces participants to Learning Analytics (LA) and provides a brief overview of LA methodologies, literature, applications, and ethical issues as they relate to STEM education.
Stars: ✭ 14 (-97%)
TAKG
The official implementation of the ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"
Stars: ✭ 127 (-72.75%)
Contextualized Topic Models
A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual zero-shot model published at EACL 2021.
Stars: ✭ 318 (-31.76%)
sensim
Sentence Similarity Estimator (SenSim)
Stars: ✭ 15 (-96.78%)
blueprints-text
Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Stars: ✭ 103 (-77.9%)
Text-Analysis
Explaining textual analysis tools in Python, including preprocessing, Skip Gram (word2vec), and topic modelling.
Stars: ✭ 48 (-89.7%)
textdigester
TextDigester: document summarization Java library
Stars: ✭ 23 (-95.06%)
DaDengAndHisPython
[WeChat official account: 大邓和他的python (Da Deng and His Python)] Quick start to Python syntax: https://www.bilibili.com/video/av44384851. Quick start to Python web scraping: https://www.bilibili.com/video/av72010301. Contact email: [email protected]
Stars: ✭ 59 (-87.34%)
Nlpython
This repository contains the code related to Natural Language Processing using the Python scripting language. All of the code relates to my book, "Python Natural Language Processing"
Stars: ✭ 265 (-43.13%)
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (-25.32%)
named-entity-recognition
Notebooks for teaching Named Entity Recognition at the Cultural Heritage Data School, run by Cambridge Digital Humanities
Stars: ✭ 18 (-96.14%)
NMFADMM
A sparsity-aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).
Stars: ✭ 39 (-91.63%)
Guidedlda
Semi-supervised guided topic model with custom guidedLDA
Stars: ✭ 390 (-16.31%)
textstem
Tools for fast text stemming & lemmatization
Stars: ✭ 36 (-92.27%)
tg crawler
Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export.
Stars: ✭ 71 (-84.76%)
Graphbrain
Language, Knowledge, Cognition
Stars: ✭ 294 (-36.91%)
snorkeling
Extracting biomedical relationships from literature with Snorkel 🏊
Stars: ✭ 56 (-87.98%)
ipo-miner
IPO Investment via Text Mining.
Stars: ✭ 20 (-95.71%)
gofastr
Make a DocumentTermMatrix faster
Stars: ✭ 19 (-95.92%)
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
Stars: ✭ 127 (-72.75%)
Corex topic
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Stars: ✭ 439 (-5.79%)
Open Semantic Search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Stars: ✭ 386 (-17.17%)
Rplos
R client for the PLoS Journals API
Stars: ✭ 289 (-37.98%)
topicApp
A simple Shiny App for Topic Modeling in R
Stars: ✭ 40 (-91.42%)
sacred
📖 Sacred texts in R
Stars: ✭ 19 (-95.92%)
Guten-gutter
Strips boilerplate from Project Gutenberg text files
Stars: ✭ 16 (-96.57%)
pydataberlin-2017
Repo for my talk at the PyData Berlin 2017 conference
Stars: ✭ 63 (-86.48%)
Twitter-Trends
Twitter Trends is a web-based application that automatically detects and analyzes emerging topics in real time through hashtags and user mentions in tweets. Twitter, being the major microblogging service, is a reliable source for trend detection. The project involved extracting live streaming tweets, processing them to find top hashtags and user …
Stars: ✭ 82 (-82.4%)
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, DistilBERT).
Stars: ✭ 30 (-93.56%)
Textract
extract text from any document. no muss. no fuss.
Stars: ✭ 3,165 (+579.18%)
TwEater
A Python Bot for Scraping Conversations from Twitter
Stars: ✭ 16 (-96.57%)
restaurant-finder-featureReviews
Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying a market-basket model, the Apriori algorithm, and NLP to user reviews).
Stars: ✭ 21 (-95.49%)
TextDatasetCleaner
🔬 Cleaning junk out of datasets (normalization, preprocessing)
Stars: ✭ 27 (-94.21%)
Rmdl
RMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (-19.53%)
tassal
Tree-based Autofolding Software Summarization Algorithm
Stars: ✭ 38 (-91.85%)
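eventextraction
Chinese compound event extraction. Recognizes textual patterns, including conditional, sequential, and reversal events, and can be used to analyze the logical structure of text.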
Stars: ✭ 17 (-96.35%)