Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+1690%)
Rmdl
RMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (+1775%)
Pyss3
A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Stars: ✭ 191 (+855%)
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+1640%)
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (+585%)
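TextClassification
Text classification of Sina News articles implemented with scikit-learn. The dataset contains 1 million documents in 10 categories, split 1:1 between test and training sets. The classifiers are SVM and Bayes, with Bayes as the baseline.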
Stars: ✭ 86 (+330%)
taller SparkR
SparkR workshop for the Jornadas de Usuarios de R (R Users Conference)
Stars: ✭ 12 (-40%)
10kGNAD
Ten Thousand German News Articles Dataset for Topic Classification
Stars: ✭ 63 (+215%)
jobAnalytics and search
The JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+25%)
small-text
Active Learning for Text Classification in Python
Stars: ✭ 241 (+1105%)
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-20%)
monkeylearn-php
Official PHP client for the MonkeyLearn API. Build and consume machine learning models for language processing from your PHP apps.
Stars: ✭ 47 (+135%)
automated-essay-grading
Source code for the paper "A Memory-Augmented Neural Model for Automated Grading"
Stars: ✭ 101 (+405%)
4chanMarkovText
Text Generation using Markov Chains fed by 4chan APIs
Stars: ✭ 28 (+40%)
evine
Interactive CLI Web Crawler
Stars: ✭ 140 (+600%)
kmeans
A simple implementation of the K-means (and Bisecting K-means) clustering algorithms in Python
Stars: ✭ 18 (-10%)
Naive-Resume-Matching
Text similarity applied to resumes: compares resumes with job descriptions and produces a score to rank them, similar to an ATS.
Stars: ✭ 27 (+35%)
NLP Toolkit
Library of state-of-the-art models (PyTorch) for NLP tasks
Stars: ✭ 92 (+360%)
ml-in-production
Practical use cases showing how to make your machine learning pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (+45%)
textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (+65%)
HiGRUs
Implementation of the paper "Hierarchical GRU for Utterance-level Emotion Recognition" in NAACL-2019.
Stars: ✭ 60 (+200%)
nlp classification
Implementing NLP papers relevant to classification with PyTorch and gluonnlp
Stars: ✭ 224 (+1020%)
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
Stars: ✭ 127 (+535%)
hpipe
Workflow engine for various computing systems.
Stars: ✭ 26 (+30%)
NIDS-Intrusion-Detection
A simple implementation of a network intrusion detection system, built on the KDD Cup '99 dataset. kdd_cup_10_percent is used for training and the correct set is used for testing. PCA is used for dimensionality reduction, and SVM and KNN are the supervised classification algorithms. Accuracy: 83.5% for SVM, 80% for KNN.
Stars: ✭ 45 (+125%)
dayder
Search lots of data sets for spurious correlations
Stars: ✭ 44 (+120%)
DataEngineering
This repo contains commands that data engineers use in their day-to-day work.
Stars: ✭ 47 (+135%)
TorchBlocks
A PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (+325%)
extremeText
Library for fast text representation and extreme classification.
Stars: ✭ 141 (+605%)
mpc-DL-controller
Deep neural network architecture as a predictive optimal controller for a disturbance-afflicted {HVAC + solar cell + battery} system, compared with classic Model Predictive Control
Stars: ✭ 37 (+85%)
nsmc-zeppelin-notebook
Movie review dataset Word2Vec & sentiment classification Zeppelin notebook
Stars: ✭ 26 (+30%)
pathpy
pathpy is an open-source Python package for the modeling and analysis of pathways and temporal networks using higher-order and multi-order graphical models
Stars: ✭ 124 (+520%)
kasthack.osp
Generator of raw dumps of VK users.
Stars: ✭ 15 (-25%)
yunyi2018
"Yunyi Cup": scenic spot reputation review score prediction
Stars: ✭ 29 (+45%)
MetaLifelongLanguage
Repository containing code for the paper "Meta-Learning with Sparse Experience Replay for Lifelong Language Learning".
Stars: ✭ 21 (+5%)
NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction"
Stars: ✭ 166 (+730%)
BLUELAY
Searches online paste sites for search terms that can indicate a possible data breach.
Stars: ✭ 24 (+20%)
NewsMTSC
Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k sentences and a state-of-the-art classification model.
Stars: ✭ 54 (+170%)
ganbert-pytorch
Enhancing BERT training with Semi-supervised Generative Adversarial Networks in PyTorch/HuggingFace
Stars: ✭ 60 (+200%)
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (+120%)
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, DistilBERT).
Stars: ✭ 30 (+50%)
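DataCon
🏆 DataCon Big Data Security Analysis Competition: champion source code for 2019 Track 2 (malicious code detection) and third-place source code for 2020 Track 5 (malicious code analysis)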
Stars: ✭ 69 (+245%)
genie
Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)
Stars: ✭ 21 (+5%)
WSDM-Cup-2019
[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (+210%)
Hefei ECG TOP1
"Hefei High-Tech Cup" ECG Human-Machine Intelligence Competition: ECG abnormal event prediction, TOP1 Solution
Stars: ✭ 109 (+445%)