Text mining resources: Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+1178.57%)
Lingua: 👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (+1117.86%)
Advertools: online marketing productivity and analysis tools
Stars: ✭ 341 (+1117.86%)
Dab: Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ
Stars: ✭ 294 (+950%)
lda2vec: Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec, from the paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-3.57%)
Danmf: A sparsity-aware implementation of "Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection" (CIKM 2018).
Stars: ✭ 161 (+475%)
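Customer satisfaction analysis: An opinion-mining project based on user-generated content (UGC) from online homestay listings, covering data mining and NLP processing, including data collection, topic extraction, and sentiment analysis. The goal is to overcome inconsistencies between user ratings and reviews and to evaluate homestay satisfaction in real time, with online review collection and visual sentiment analysis. It sets up a Baidu Maps POI query entry point for automated batch lookup of POI information; builds an LDA automatic topic clustering model on an online homestay corpus, using topic center words to find the corresponding topic attribute dictionary; uses user ratings as labels and the character-level TextCNN bundled with litNlp for sentiment analysis, treating the sentiment classification probability distribution as the sentiment trend; and finally displays homestay satisfaction across regions as a POI heat map. See the link for software versions.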
Stars: ✭ 262 (+835.71%)
fairseq-tagging: A Fairseq fork for sequence tagging/labeling tasks
Stars: ✭ 26 (-7.14%)
nlp-qrmine: 🔦 Qualitative Research support tools in Python
Stars: ✭ 28 (+0%)
kserve: Serverless Inferencing on Kubernetes
Stars: ✭ 1,621 (+5689.29%)
sent2vec: How to encode sentences in a high-dimensional vector space, a.k.a. sentence embedding.
Stars: ✭ 99 (+253.57%)
Fenchel Young Losses: Probabilistic classification in PyTorch/TensorFlow/scikit-learn with Fenchel-Young losses
Stars: ✭ 152 (+442.86%)
Natural-Language-Processing: Contains various architectures and novel paper implementations for Natural Language Processing tasks like Sequence Modelling and Neural Machine Translation.
Stars: ✭ 48 (+71.43%)
xpandas: Universal 1d/2d data containers with Transformers functionality for data analysis.
Stars: ✭ 25 (-10.71%)
nlp newsletter: Natural language processing (NLP) newsletter right on GitHub
Stars: ✭ 57 (+103.57%)
Signal: Simple and beautiful open source Analytics 📊
Stars: ✭ 295 (+953.57%)
bagging pu: Simple sklearn-based Python implementation of Positive-Unlabeled (PU) classification using bagging-based ensembles
Stars: ✭ 73 (+160.71%)
easyNLP: Do NLP without coding!
Stars: ✭ 19 (-32.14%)
Data Analysis: Mainly a summary of web-scraping and data analysis projects, plus modeling, machine learning, and model evaluation.
Stars: ✭ 142 (+407.14%)
Chatbot: A Deep-Learning multi-purpose chatbot made using Python3
Stars: ✭ 36 (+28.57%)
resolutions-2019: A list of data mining and machine learning papers that I implemented in 2019.
Stars: ✭ 19 (-32.14%)
NTUA-slp-nlp: 💻 Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-32.14%)
Flask-Deep-Learning-NLP-API: Flask API to productize a document classification model. The classification model was built using Keras with a TensorFlow backend
Stars: ✭ 26 (-7.14%)
KMeans elbow: Code for determining the optimal number of clusters for the K-means algorithm using the 'elbow criterion'
Stars: ✭ 35 (+25%)
memology: Memes - why so popular?
Stars: ✭ 32 (+14.29%)
Qlik Py Tools: Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).
Stars: ✭ 135 (+382.14%)
sklearn-pmml-model: A library to parse and convert PMML models into Scikit-learn estimators.
Stars: ✭ 71 (+153.57%)
word2vec-tsne: Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+110.71%)
lucene-geo-gazetteer: Uses Apache Lucene, OpenNLP, and GeoNames to extract locations from text and geocode them.
Stars: ✭ 34 (+21.43%)
exemplary-ml-pipeline: Exemplary, annotated machine learning pipeline for any tabular data problem.
Stars: ✭ 23 (-17.86%)
lingua-go: 👄 The most accurate natural language detection library for Go, suitable for long and short text alike
Stars: ✭ 684 (+2342.86%)
Role2vec: A scalable Gensim implementation of "Learning Role-based Graph Embeddings" (IJCAI 2018).
Stars: ✭ 134 (+378.57%)
Hutoma-Conversational-AI-Platform: Hu:toma AI is an open source stack designed to help you create compelling conversational interfaces with little effort and above industry accuracy
Stars: ✭ 35 (+25%)
sciblox: Easier Data Science and Machine Learning
Stars: ✭ 48 (+71.43%)
datastories-semeval2017-task6: Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-28.57%)
Matomo: Liberating Web Analytics. Star us on GitHub? +1. Matomo is the leading open alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps, visualise it, and extract insights. Privacy is built-in. We love Pull Requests!
Stars: ✭ 15,711 (+56010.71%)
lingvo--Ner-ru: Named entity recognition (NER) in Russian texts
Stars: ✭ 38 (+35.71%)
scikitcrf NER: Python library for custom entity recognition using Sklearn CRF
Stars: ✭ 17 (-39.29%)
Quora question pairs NLP Kaggle: Quora Kaggle competition on natural language processing, using word2vec embeddings, scikit-learn, and xgboost for training
Stars: ✭ 17 (-39.29%)
get smarties: Dummy variable generation with fit/transform capabilities
Stars: ✭ 23 (-17.86%)
alter-nlu: Natural language understanding library for chatbots with intent recognition and entity extraction.
Stars: ✭ 45 (+60.71%)
nextly-template: Nextly Landing Page Template built with Next.js & TailwindCSS
Stars: ✭ 48 (+71.43%)
mmetrics: Easy computation of Marketing Metrics in R
Stars: ✭ 26 (-7.14%)
imbalanced-ensemble: Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible.
Stars: ✭ 199 (+610.71%)
Awesome Nlp Polish: A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets.
Stars: ✭ 153 (+446.43%)