Ai law: All kinds of baseline models for long text classification (text categorization)
Text Classification: Machine learning and NLP text classification using Python, scikit-learn, and NLTK
Fancy Nlp: NLP for humans. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.
Pytorch Transformers Classification: Based on the Pytorch-Transformers library by HuggingFace. Intended as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, RoBERTa, and XLM models for text classification.
Paddlenlp: NLP core library and model zoo based on PaddlePaddle 2.0
Interpret Text: A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the results with a built-in dashboard.
Band: BAND (BERT Application aNd Deployment). Simple and efficient BERT model training and deployment.
Icdar 2019 Sroie: ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
Nlp classification: Implementations of NLP papers relevant to classification, using PyTorch and gluonnlp
Shallowlearn: An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with scikit-learn.
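Marktool: A web-based, general-purpose text annotation tool. It supports large-scale entity annotation, relation annotation, event annotation, text classification, automatic annotation based on dictionary matching and regular-expression matching, and standard-name annotation for normalization. It also supports iterative text annotation and nested entity annotation. Annotation guidelines are customizable and, within tasks of the same type, can be "created once and reused many times". Hierarchical entity sets expand the range of entity types, and a newly designed, efficient annotation method improves user experience and annotation efficiency. The tool also adds a review step that checks and adjusts the consistency of annotations from multiple annotators, improving the accuracy and reliability of the annotated corpus.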
Pyss3: A Python package implementing a new machine learning model for text classification, with visualization tools for Explainable AI
Simpletransformers: Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Hdltex: HDLTex, Hierarchical Deep Learning for Text Classification
Kashgari: A production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. Includes Word2Vec, BERT, and GPT2 language embeddings.
Fastnlp: fastNLP, a modularized and extensible NLP framework. Currently still in incubation.
Textvec: Text vectorization tool intended to outperform TF-IDF for classification tasks
Textanalyzer: A text analyzer based on machine learning, statistics, and dictionaries. So far, it supports hot word extraction, text classification, part-of-speech tagging, named entity recognition, Chinese word segmentation, address extraction, synonyms, text clustering, word2vec models, edit distance, sentence similarity, word sentiment tendency, name recognition, idiom recognition, place name recognition, organization recognition, traditional Chinese recognition, and pinyin conversion.
Lotclass: [EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Vdcnn: Implementation of Very Deep Convolutional Neural Network for Text Classification
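Macadam: A natural language processing toolkit built on TensorFlow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction. Supported embeddings include RANDOM, WORD2VEC, FASTTEXT, BERT, ALBERT, ROBERTA, NEZHA, XLNET, ELECTRA, and GPT-2. Supported text classification algorithms include FineTune, FastText, TextCNN, CharCNN, BiRNN, RCNN, DCNN, CRNN, DeepMoji, SelfAttention, HAN, and Capsule. Supported sequence labeling algorithms include CRF, Bi-LSTM-CRF, CNN-LSTM, DGCNN, Bi-LSTM-LAN, Lattice-LSTM-Batch, and MRC.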
Browsecloud: A web app to create and browse text visualizations for automated customer listening.
Uda pytorch: UDA (Unsupervised Data Augmentation) implemented in PyTorch
Monkeylearn Python: Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.
Onnxt5: Summarization, translation, sentiment analysis, text generation, and more at blazing speed using a T5 version implemented in ONNX.
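Parselawdocuments: Runs a series of analyses on collected legal documents, including automatic segmentation according to a specification, case similarity computation, case clustering, and recommendation of legal provisions (experiments are currently based on marriage cases and can be extended to other domains).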
Nlp estimator tutorial: Educational material on using the TensorFlow Estimator framework for text classification
Ml Projects: ML-based projects such as spam classification, time series analysis, and text classification, using Random Forest, deep learning, Bayesian methods, and XGBoost in Python
Rcnn Text Classification: TensorFlow implementation of "Recurrent Convolutional Neural Network for Text Classification" (AAAI 2015)
Context: ConText v4, neural networks for text categorization
Bdci2017 Minglue: BDCI2017 "Let AI Be the Judge" competition, 4th place in the finals (4/415). https://www.datafountain.cn/competitions/277/details