🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.

Stars: ✭ 3,409 (-48.78%)

Mutual labels: question-answering, language-model, bert

BERT-chinese-text-classification-pytorch

This repo contains a PyTorch implementation of a pretrained BERT model for text classification.

Stars: ✭ 92 (-98.62%)

Mutual labels: text-classification, chinese, bert

text-classification-cn

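Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models.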

Stars: ✭ 81 (-98.78%)

Mutual labels: text-classification, word2vec, corpus

SQUAD2.Q-Augmented-Dataset

Augmented version of SQUAD 2.0 for Questions

Stars: ✭ 31 (-99.53%)

Mutual labels: question-answering, bert

WSDM-Cup-2019

[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.

Stars: ✭ 62 (-99.07%)

Mutual labels: text-classification, bert

Chatito

🎯🗯 Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!

Stars: ✭ 678 (-89.81%)

Mutual labels: dataset, text-classification

LightLM

High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task

Stars: ✭ 54 (-99.19%)

Mutual labels: chinese, bert

Product-Categorization-NLP

Multi-class text classification for products based on their description, using machine learning algorithms and neural networks (MLP, CNN, DistilBERT).

Stars: ✭ 30 (-99.55%)

Mutual labels: text-classification, word2vec

textgo

Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!

Stars: ✭ 33 (-99.5%)

Mutual labels: text-classification, bert

NSP-BERT

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Stars: ✭ 166 (-97.51%)

Mutual labels: text-classification, bert

text2text

Text2Text: Cross-lingual natural language processing and generation toolkit

Stars: ✭ 188 (-97.18%)

Mutual labels: question-answering, bert

cdQA-ui

⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.

Stars: ✭ 19 (-99.71%)

Mutual labels: question-answering, bert

mcQA

🔮 Answering multiple choice questions with Language Models.

Stars: ✭ 23 (-99.65%)

Mutual labels: question-answering, bert

Species-Names-Corpus

Species name corpus. Plant names, animal names.

Stars: ✭ 23 (-99.65%)

Mutual labels: corpus, dataset

CLUEmotionAnalysis2020

CLUE Emotion Analysis Dataset: a fine-grained sentiment analysis dataset

Stars: ✭ 3 (-99.95%)

Mutual labels: corpus, chinese

ganbert-pytorch

Enhancing BERT training with semi-supervised Generative Adversarial Networks in PyTorch/HuggingFace

Stars: ✭ 60 (-99.1%)

Mutual labels: text-classification, bert

Ngram2vec

Four word embedding models implemented in Python, supporting arbitrary context features.

Stars: ✭ 703 (-89.44%)

Mutual labels: chinese, word2vec

feedIO

A Feed Aggregator that Knows What You Want to Read.

Stars: ✭ 26 (-99.61%)

Mutual labels: news, text-classification

Medi-CoQA

Conversational Question Answering on Clinical Text

Stars: ✭ 22 (-99.67%)

Mutual labels: question-answering, bert

bert tokenization for java

This is a Java version of the Chinese tokenization described in BERT.

Stars: ✭ 39 (-99.41%)

Mutual labels: chinese-nlp, bert

TorchBlocks

A PyTorch-based toolkit for natural language processing

Stars: ✭ 85 (-98.72%)

Mutual labels: text-classification, bert

Pytorch-NLU

Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese text, and sequence labeling tasks such as Chinese named entity recognition, part of speech ta…

Stars: ✭ 151 (-97.73%)

Mutual labels: text-classification, bert

text2class

Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT

Stars: ✭ 15 (-99.77%)

Mutual labels: text-classification, bert

chinese-nlp-ner

A BLSTM-CRF solution for Chinese named entity recognition.

Stars: ✭ 14 (-99.79%)

Mutual labels: chinese, chinese-nlp

Giveme5W

Extraction of the five journalistic W-questions (5W) from news articles

Stars: ✭ 16 (-99.76%)

Mutual labels: news, question-answering

Text and Audio classification with Bert

Text classification of Turkish texts with BERT

Stars: ✭ 34 (-99.49%)

Mutual labels: text-classification, bert

MobileQA

Offline reading comprehension application. QA for mobile, Android & iPhone

Stars: ✭ 49 (-99.26%)

Mutual labels: chinese, bert

kwx

BERT, LDA, and TFIDF based keyword extraction in Python

Stars: ✭ 33 (-99.5%)

Mutual labels: text-classification, bert

wordfish-python

Extract relationships between standardized terms from a corpus of interest with deep learning 🐟

Stars: ✭ 19 (-99.71%)

Mutual labels: word2vec, corpus

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

Stars: ✭ 22 (-99.67%)

Mutual labels: text-classification, bert

squad-v1.1-pt

Portuguese translation of the SQuAD dataset

Stars: ✭ 13 (-99.8%)

Mutual labels: dataset, question-answering

ODSQA

ODSQA: Open-Domain Spoken Question Answering Dataset

Stars: ✭ 43 (-99.35%)

Mutual labels: question-answering, chinese

Medical-Names-Corpus

Medical corpus. Corpus of medical institution names. Drug standard codes.