anirudhshenoy / text-classification-small-datasets

Licence: other
Building a text classifier with extremely small datasets

Programming Languages

Jupyter Notebook
Python

Projects that are alternatives of or similar to text-classification-small-datasets

Cluedatasetsearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
Stars: ✭ 2,112 (+6111.76%)
Mutual labels:  text-classification, datasets
PharmacoGx
R package to analyze large-scale pharmacogenomic datasets.
Stars: ✭ 42 (+23.53%)
Mutual labels:  datasets
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-11.76%)
Mutual labels:  text-classification
textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
Stars: ✭ 33 (-2.94%)
Mutual labels:  text-classification
MetaLifelongLanguage
Repository containing code for the paper "Meta-Learning with Sparse Experience Replay for Lifelong Language Learning".
Stars: ✭ 21 (-38.24%)
Mutual labels:  text-classification
NewsMTSC
Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k sentences and a state-of-the-art classification model.
Stars: ✭ 54 (+58.82%)
Mutual labels:  text-classification
MetaCat
Minimally Supervised Categorization of Text with Metadata (SIGIR'20)
Stars: ✭ 52 (+52.94%)
Mutual labels:  text-classification
Data-Science-and-Machine-Learning-Resources
List of Data Science and Machine Learning Resource that I frequently use
Stars: ✭ 19 (-44.12%)
Mutual labels:  datasets
systematic-review-datasets
A collection of fully labeled systematic review datasets (title-abstract screening)
Stars: ✭ 25 (-26.47%)
Mutual labels:  datasets
allie
🤖 A machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers).
Stars: ✭ 93 (+173.53%)
Mutual labels:  datasets
extremeText
Library for fast text representation and extreme classification.
Stars: ✭ 141 (+314.71%)
Mutual labels:  text-classification
PharmacoDB
Search across publicly available datasets to find instances where a drug or cell line of interest has been profiled.
Stars: ✭ 38 (+11.76%)
Mutual labels:  datasets
extra keras datasets
📃🎉 Additional datasets for tensorflow.keras
Stars: ✭ 20 (-41.18%)
Mutual labels:  datasets
classification
Vietnamese Text Classification
Stars: ✭ 39 (+14.71%)
Mutual labels:  text-classification
HiGitClass
HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories (ICDM'19)
Stars: ✭ 58 (+70.59%)
Mutual labels:  text-classification
WSDM-Cup-2019
[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (+82.35%)
Mutual labels:  text-classification
AIODrive
Official Python/PyTorch Implementation for "All-In-One Drive: A Large-Scale Comprehensive Perception Dataset with High-Density Long-Range Point Clouds"
Stars: ✭ 32 (-5.88%)
Mutual labels:  datasets
Caver
Caver: a toolkit for multilabel text classification.
Stars: ✭ 38 (+11.76%)
Mutual labels:  text-classification
Python-for-Text-Classification
Python for Text Classification with Machine Learning in Python 3.6.
Stars: ✭ 32 (-5.88%)
Mutual labels:  text-classification
NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"
Stars: ✭ 166 (+388.24%)
Mutual labels:  text-classification

Text Classification With Extremely Small Datasets

Accompanying blog post: https://towardsdatascience.com/text-classification-with-extremely-small-datasets-333d322caee2

Credits:

  1. Abhijnan Chakraborty, Bhargavi Paranjape, Sourya Kakarla, and Niloy Ganguly. "Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media". In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, US, August 2016.
  2. Potthast et al. (2016) https://webis.de/downloads/publications/papers/stein_2016b.pdf
  3. Terrier stop word list: https://github.com/terrier-org/terrier-desktop/blob/master/share/stopword-list.txt
  4. Downworthy: https://github.com/snipe/downworthy
  5. Dale-Chall easy word list: http://www.readabilityformulas.com/articles/dale-chall-readability-word-list.php