text2classMulti-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-87.18%)
Tokenizers💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+4239.32%)
classyclassy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (-47.86%)
HugsVisionHugsVision is a easy to use huggingface wrapper for state-of-the-art computer vision
Stars: ✭ 154 (+31.62%)
Transformers🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+47542.74%)
anonymisationAnonymization of legal cases (Fr) based on Flair embeddings
Stars: ✭ 85 (-27.35%)
Label StudioLabel Studio is a multi-type data labeling and annotation tool with standardized output format
Stars: ✭ 7,264 (+6108.55%)
Transformers-TutorialsThis repository contains demos I made with the Transformers library by HuggingFace.
Stars: ✭ 2,828 (+2317.09%)
wechselCode for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-66.67%)
DiscEvalDiscourse Based Evaluation of Language Understanding
Stars: ✭ 18 (-84.62%)
ParsBigBirdPersian Bert For Long-Range Sequences
Stars: ✭ 58 (-50.43%)
ercEmotion recognition in conversation
Stars: ✭ 34 (-70.94%)
robo-vlnPytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Stars: ✭ 34 (-70.94%)
Spark NlpState of the Art Natural Language Processing
Stars: ✭ 2,518 (+2052.14%)
Nlp ArchitectA model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Stars: ✭ 2,768 (+2265.81%)
Mt DnnMulti-Task Deep Neural Networks for Natural Language Understanding
Stars: ✭ 1,871 (+1499.15%)
KashgariKashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Stars: ✭ 2,235 (+1810.26%)
Pytorch-NLUPytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (+29.06%)
label-studio-frontendData labeling react app that is backend agnostic and can be embedded into your applications — distributed as an NPM package
Stars: ✭ 230 (+96.58%)
text2textText2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+60.68%)
OpenDialogAn Open-Source Package for Chinese Open-domain Conversational Chatbot (中文闲聊对话系统,一键部署微信闲聊机器人)
Stars: ✭ 94 (-19.66%)
bangla-bertBangla-Bert is a pretrained bert model for Bengali language
Stars: ✭ 41 (-64.96%)
policy-data-analyzerBuilding a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-81.2%)
trapperState-of-the-art NLP through transformer models in a modular design and consistent APIs.
Stars: ✭ 28 (-76.07%)
Haystack🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+2813.68%)
Fast BertSuper easy library for BERT based NLP models
Stars: ✭ 1,678 (+1334.19%)
COCO-LM[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (-6.84%)
Fill-the-GAP[ACL-WS] 4th place solution to gendered pronoun resolution challenge on Kaggle
Stars: ✭ 13 (-88.89%)
TorchBlocksA PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (-27.35%)
gplPowerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (+84.62%)
Pytorch Sentiment AnalysisTutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+2642.74%)
Text-SummarizationAbstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (-67.52%)
Bert As ServiceMapping a variable-length sentence to a fixed-length vector using BERT model
Stars: ✭ 9,779 (+8258.12%)
WSDM-Cup-2019[ACM-WSDM] 3rd place solution at WSDM Cup 2019, Fake News Classification on Kaggle.
Stars: ✭ 62 (-47.01%)
backpropBackprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+95.73%)
bert-tensorflow-pytorch-spacy-conversionInstructions for how to convert a BERT Tensorflow model to work with HuggingFace's pytorch-transformers, and spaCy. This walk-through uses DeepPavlov's RuBERT as example.
Stars: ✭ 26 (-77.78%)
bert-squeeze🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Stars: ✭ 56 (-52.14%)
question generatorAn NLP system for generating reading comprehension questions
Stars: ✭ 188 (+60.68%)
golgothaContextualised Embeddings and Language Modelling using BERT and Friends using R
Stars: ✭ 39 (-66.67%)
Clue中文语言理解测评基准 Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+1972.65%)
oreilly-bert-nlpThis repository contains code for the O'Reilly Live Online Training for BERT
Stars: ✭ 19 (-83.76%)
hard-label-attackNatural Language Attacks in a Hard Label Black Box Setting.
Stars: ✭ 26 (-77.78%)
JointIDSFBERT-based joint intent detection and slot filling with intent-slot attention mechanism (INTERSPEECH 2021)
Stars: ✭ 55 (-52.99%)
transformer generalizationThe official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We significantly improve the systematic generalization of transformer models on a variety of datasets using simple tricks and careful considerations.
Stars: ✭ 58 (-50.43%)
NLP-paper🎨 🎨NLP 自然语言处理教程 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-80.34%)
banglabertThis repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accpeted in Findings of the Annual Conference of the North American Chap…
Stars: ✭ 186 (+58.97%)
Cross-Lingual-MRCCross-Lingual Machine Reading Comprehension (EMNLP 2019)
Stars: ✭ 66 (-43.59%)
BertSimilarityComputing similarity of two sentences with google's BERT algorithm。利用Bert计算句子相似度。语义相似度计算。文本相似度计算。
Stars: ✭ 348 (+197.44%)
wikiHow paper listA paper list of research conducted based on wikiHow
Stars: ✭ 25 (-78.63%)
dstqaCode for Li Zhou, Kevin Small. Multi-domain Dialogue State Tracking as Dynamic Knowledge Graph Enhanced Question Answering. In NeurIPS 2019 Workshop on Conversational AI
Stars: ✭ 26 (-77.78%)
CAIL法研杯CAIL2019阅读理解赛题参赛模型
Stars: ✭ 34 (-70.94%)
NAG-BERT[EACL'21] Non-Autoregressive with Pretrained Language Model
Stars: ✭ 47 (-59.83%)