
webis-de / small-text

Licence: MIT license
Active Learning for Text Classification in Python

Programming Languages

python

Projects that are alternatives of or similar to small-text

policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-90.87%)
Mutual labels:  text-classification, transformers, active-learning
Text and Audio classification with Bert
Text Classification in Turkish Texts with Bert
Stars: ✭ 34 (-85.89%)
Mutual labels:  text-classification, transformers
TorchBlocks
A PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (-64.73%)
Mutual labels:  text-classification, transformers
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+944.81%)
Mutual labels:  text-classification, transformers
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-87.55%)
Mutual labels:  text-classification, transformers
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (-4.98%)
Mutual labels:  text-classification, transformers
text2class
Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-93.78%)
Mutual labels:  text-classification, transformers
Simpletransformers
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Stars: ✭ 2,881 (+1095.44%)
Mutual labels:  text-classification, transformers
X-Transformer
X-Transformer: Taming Pretrained Transformers for eXtreme Multi-label Text Classification
Stars: ✭ 127 (-47.3%)
Mutual labels:  text-classification, transformers
Ask2Transformers
A Framework for Textual Entailment based Zero Shot text classification
Stars: ✭ 102 (-57.68%)
Mutual labels:  text-classification, transformers
COVID-19-Tweet-Classification-using-Roberta-and-Bert-Simple-Transformers
Rank 1 / 216
Stars: ✭ 24 (-90.04%)
Mutual labels:  text-classification, transformers
text-classification-transformers
Easy text classification for everyone : Bert based models via Huggingface transformers (KR / EN)
Stars: ✭ 32 (-86.72%)
Mutual labels:  text-classification, transformers
Pytorch-NLU
Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, and word segmentation.
Stars: ✭ 151 (-37.34%)
Mutual labels:  text-classification, transformers
yunyi
2018 "Yunyi Cup": scenic-spot reputation rating score prediction
Stars: ✭ 29 (-87.97%)
Mutual labels:  text-classification
MONAILabel
MONAI Label is an intelligent open source image labeling and learning tool.
Stars: ✭ 249 (+3.32%)
Mutual labels:  active-learning
remixer-pytorch
Implementation of the Remixer Block from the Remixer paper, in Pytorch
Stars: ✭ 37 (-84.65%)
Mutual labels:  transformers
optimum
🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools
Stars: ✭ 567 (+135.27%)
Mutual labels:  transformers
nlp classification
Implementing nlp papers relevant to classification with PyTorch, gluonnlp
Stars: ✭ 224 (-7.05%)
Mutual labels:  text-classification
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (-82.16%)
Mutual labels:  text-classification
AlpacaTag
AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging (ACL 2019 Demo)
Stars: ✭ 126 (-47.72%)
Mutual labels:  active-learning


Active Learning for Text Classification in Python.


Installation | Quick Start | Changelog | Docs

Small-Text provides state-of-the-art Active Learning for Text Classification. Several pre-implemented Query Strategies, Initialization Strategies, and Stopping Criteria are provided, which can be easily mixed and matched to build active learning experiments or applications.

What is Active Learning?
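Active Learning lets you label training data efficiently when only a small labeling budget is available. Instead of annotating a random sample, the model repeatedly selects the unlabeled examples it expects to be most informative, and a human annotator labels only those.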
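To make the idea concrete, here is a minimal, library-independent sketch of a pool-based active learning loop. It does not use Small-Text's API: the toy 1-D centroid classifier, the `oracle` function (standing in for a human annotator), and all function names are assumptions made purely for illustration.

```python
import math
import random


def fit_centroids(points, labels):
    """Return the mean of the labeled points for each class (a toy 1-D classifier)."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_proba(centroids, x):
    """Normalize exp(-distance) to each class centroid into probabilities."""
    scores = {y: math.exp(-abs(x - c)) for y, c in centroids.items()}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


def least_confidence_query(centroids, pool, unlabeled, n):
    """Pick the n unlabeled indices whose top predicted probability is lowest."""
    def confidence(i):
        return max(predict_proba(centroids, pool[i]).values())
    return sorted(unlabeled, key=confidence)[:n]


def active_learning_loop(pool, oracle, initial, rounds=5, batch_size=2):
    """Alternate between training, querying, and labeling for a fixed number of rounds."""
    labeled = {i: oracle(pool[i]) for i in initial}
    for _ in range(rounds):
        centroids = fit_centroids([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        for i in least_confidence_query(centroids, pool, unlabeled, batch_size):
            labeled[i] = oracle(pool[i])  # the "human annotator" step
    centroids = fit_centroids([pool[i] for i in labeled], list(labeled.values()))
    return centroids, labeled


rng = random.Random(0)
pool = [rng.uniform(-1.0, 1.0) for _ in range(200)]
oracle = lambda x: int(x > 0.0)  # hypothetical labeler: class 1 if positive

# Seed with the smallest and largest point, then let the loop choose the rest.
initial = [pool.index(min(pool)), pool.index(max(pool))]
centroids, labeled = active_learning_loop(pool, oracle, initial)
print(f"labeled {len(labeled)} of {len(pool)} examples")
```

Only the seed examples plus `rounds * batch_size` queried examples are ever labeled; the rest of the pool stays untouched, which is where the labeling savings come from.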

Features

  • Provides unified interfaces for Active Learning so that you can easily mix and match query strategies with classifiers provided by scikit-learn, PyTorch, or transformers.
  • Supports GPU-based PyTorch models and integrates transformers so that you can use state-of-the-art Text Classification models for Active Learning.
  • GPU is supported but not required. In case of a CPU-only use case, a lightweight installation only requires a minimal set of dependencies.
  • Multiple scientifically evaluated components are pre-implemented and ready to use (Query Strategies, Initialization Strategies, and Stopping Criteria).
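
The "mix and match" idea behind a unified interface can be sketched as follows. This is an illustration of the design only, not Small-Text's actual classes: `ToyQueryStrategy`, `ToyRandomSampling`, `ToyLeastConfidence`, and `KeywordClassifier` are hypothetical names, and the only assumption about the classifier is that it exposes a `predict_proba` method.

```python
import random
from abc import ABC, abstractmethod
from typing import List, Sequence


class ToyQueryStrategy(ABC):
    """Hypothetical interface: choose which pool items to label next."""

    @abstractmethod
    def query(self, clf, pool: Sequence[str], n: int) -> List[int]:
        ...


class ToyRandomSampling(ToyQueryStrategy):
    """Baseline that ignores the classifier and samples uniformly."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)

    def query(self, clf, pool, n):
        return self._rng.sample(range(len(pool)), n)


class ToyLeastConfidence(ToyQueryStrategy):
    """Uncertainty sampling: prefer items with the lowest top-class probability."""

    def query(self, clf, pool, n):
        confidences = [max(p) for p in clf.predict_proba(pool)]
        return sorted(range(len(pool)), key=confidences.__getitem__)[:n]


class KeywordClassifier:
    """Stand-in for any model that exposes predict_proba."""

    def __init__(self, keyword: str):
        self.keyword = keyword

    def predict_proba(self, texts):
        # More keyword hits -> more confident in the positive class.
        probs = []
        for text in texts:
            hits = text.lower().split().count(self.keyword)
            p_pos = min(0.5 + 0.25 * hits, 1.0)
            probs.append([1.0 - p_pos, p_pos])
        return probs


pool = ["great great movie", "a movie", "great plot", "no opinion here"]
clf = KeywordClassifier("great")

# Any strategy works with any classifier because both sides honor the interface.
for strategy in (ToyRandomSampling(seed=0), ToyLeastConfidence()):
    print(type(strategy).__name__, strategy.query(clf, pool, n=2))
```

Because the strategies depend only on the shared interface, swapping the classifier or the strategy does not require changing the other side.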

News

  • May Beta Release (v1.0.0b4) - May 04, 2022

  • March Beta Release (v1.0.0b3) - March 06, 2022

  • 🎉 Beta Release (v1.0.0b1) - February 22, 2022

    • New features: multi-label classification and stopping criteria are now supported.
    • Added/revised large parts of the documentation.

For a complete list of changes, see the change log.

Installation

Small-Text can be easily installed via pip:

pip install small-text

For a full installation include the transformers extra requirement:

pip install small-text[transformers]

Small-Text requires Python 3.7 or newer. GPU support additionally requires CUDA 10.1 or newer. More information regarding the installation can be found in the documentation.
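
After installing, a quick way to see which variant you ended up with is to check the interpreter version and probe for the relevant packages. The sketch below uses only the standard library; it assumes that the package's import name is `small_text` and that the full installation brings in `torch` and `transformers`, so adjust the names if your setup differs.

```python
import sys
from importlib.util import find_spec


def check_environment(optional=("torch", "transformers")):
    """Report whether the interpreter and the (assumed) packages are available."""
    report = {"python_ok": sys.version_info >= (3, 7)}
    # "small_text" is assumed to be the import name of the small-text package.
    for name in ("small_text",) + tuple(optional):
        report[name] = find_spec(name) is not None
    return report


for key, ok in check_environment().items():
    print(f"{key}: {'yes' if ok else 'no'}")
```

`find_spec` only looks the packages up and does not import them, so the check stays fast even when the heavier dependencies are installed.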

Quick Start

For a quick start, see the provided examples for binary classification, PyTorch multi-class classification, and transformer-based multi-class classification, or check out the notebooks.

Notebooks

# | Notebook
1 | Intro: Active Learning for Text Classification with Small-Text
2 | Using Stopping Criteria for Active Learning
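
As a rough illustration of what a stopping criterion does, the sketch below stops the loop once the model's predictions on a fixed set of examples have stopped changing for a few consecutive rounds. It is a generic example, not one of Small-Text's pre-implemented criteria; the class name `PredictionChangeStopping` and its `threshold` and `patience` parameters are hypothetical.

```python
from typing import Optional, Sequence


class PredictionChangeStopping:
    """Hypothetical criterion: stop once predictions stop changing."""

    def __init__(self, threshold: float = 0.05, patience: int = 2):
        self.threshold = threshold  # max fraction of changed predictions
        self.patience = patience    # consecutive stable rounds required
        self._previous: Optional[Sequence[int]] = None
        self._stable_rounds = 0

    def stop(self, predictions: Sequence[int]) -> bool:
        """Call once per active learning round with the current predictions."""
        if self._previous is not None:
            if len(predictions) != len(self._previous):
                raise ValueError("predictions must cover the same examples each round")
            changed = sum(a != b for a, b in zip(predictions, self._previous))
            if changed / max(len(predictions), 1) <= self.threshold:
                self._stable_rounds += 1
            else:
                self._stable_rounds = 0
        self._previous = list(predictions)
        return self._stable_rounds >= self.patience


criterion = PredictionChangeStopping(threshold=0.05, patience=2)
rounds = [[0, 1, 0, 1], [0, 1, 1, 1], [0, 1, 1, 1], [0, 1, 1, 1]]
decisions = [criterion.stop(preds) for preds in rounds]
print(decisions)  # [False, False, False, True]
```

The `patience` parameter guards against stopping on a single lucky round in which nothing happened to change.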

Documentation

Read the latest documentation here.

Alternatives

modAL, ALiPy, libact

Contribution

Contributions are welcome. Details can be found in CONTRIBUTING.md.

Acknowledgments

This software was created by Christopher Schröder (@chschroeder) at Leipzig University's NLP group, which is part of the Webis research network. The encompassing project was funded by the Development Bank of Saxony (SAB) under project number 100335729.

Citation

A preprint which introduces small-text is available here:
Small-Text: Active Learning for Text Classification in Python.

@misc{schroeder2021smalltext,
    title={Small-Text: Active Learning for Text Classification in Python}, 
    author={Christopher Schröder and Lydia Müller and Andreas Niekler and Martin Potthast},
    year={2021},
    eprint={2107.10314},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

License

MIT License
