presidio-researchThis package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
Stars: ✭ 62 (-27.06%)
Transformers-TutorialsThis repository contains demos I made with the Transformers library by HuggingFace.
Stars: ✭ 2,828 (+3227.06%)
wechselCode for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (-54.12%)
SkillNERA (smart) rule based NLP module to extract job skills from text
Stars: ✭ 69 (-18.82%)
Spacy Streamlit👑 spaCy building blocks and visualizers for Streamlit apps
Stars: ✭ 360 (+323.53%)
label-studio-transformersLabel data using HuggingFace's transformers and automatically get a prediction service
Stars: ✭ 117 (+37.65%)
golgothaContextualised Embeddings and Language Modelling using BERT and Friends using R
Stars: ✭ 39 (-54.12%)
robo-vlnPytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Stars: ✭ 34 (-60%)
Nlp ArchitectA model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
Stars: ✭ 2,768 (+3156.47%)
Haystack🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+3910.59%)
DrFAQDrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-65.88%)
bert-tensorflow-pytorch-spacy-conversionInstructions for how to convert a BERT Tensorflow model to work with HuggingFace's pytorch-transformers, and spaCy. This walk-through uses DeepPavlov's RuBERT as example.
Stars: ✭ 26 (-69.41%)
Text-SummarizationAbstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (-55.29%)
Ner AnnotatorNamed Entity Recognition (NER) Annotation tool for SpaCy. Generates Traning Data as a JSON which can be readily used.
Stars: ✭ 127 (+49.41%)
ParsBigBirdPersian Bert For Long-Range Sequences
Stars: ✭ 58 (-31.76%)
text2classMulti-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT
Stars: ✭ 15 (-82.35%)
bert-squeeze🛠️ Tools for Transformers compression using PyTorch Lightning ⚡
Stars: ✭ 56 (-34.12%)
HugsVisionHugsVision is a easy to use huggingface wrapper for state-of-the-art computer vision
Stars: ✭ 154 (+81.18%)
policy-data-analyzerBuilding a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-74.12%)
Pytorch Sentiment AnalysisTutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+3675.29%)
Fast BertSuper easy library for BERT based NLP models
Stars: ✭ 1,678 (+1874.12%)
nlp workshop odsc europe20Extensive tutorials for the Advanced NLP Workshop in Open Data Science Conference Europe 2020. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular tasks using NLP including NER, Classification, Recommendation \ Information Retrieval, Summarization, Classification, Language Translation, Q&A and T…
Stars: ✭ 127 (+49.41%)
KashgariKashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Stars: ✭ 2,235 (+2529.41%)
Bert Bilstm Crf NerTensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning And private Server services
Stars: ✭ 3,838 (+4415.29%)
oreilly-bert-nlpThis repository contains code for the O'Reilly Live Online Training for BERT
Stars: ✭ 19 (-77.65%)
keras-bert-nerKeras solution of Chinese NER task using BiLSTM-CRF/BiGRU-CRF/IDCNN-CRF model with Pretrained Language Model: supporting BERT/RoBERTa/ALBERT
Stars: ✭ 7 (-91.76%)
react-taggyA simple zero-dependency React component for tagging user-defined entities within a block of text.
Stars: ✭ 29 (-65.88%)
Spacy LookupNamed Entity Recognition based on dictionaries
Stars: ✭ 212 (+149.41%)
anonymization-apiHow to build and deploy an anonymization API with FastAPI
Stars: ✭ 51 (-40%)
Pytorch-NLUPytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech ta…
Stars: ✭ 151 (+77.65%)
converseConversational text Analysis using various NLP techniques
Stars: ✭ 147 (+72.94%)
TorchBlocksA PyTorch-based toolkit for natural language processing
Stars: ✭ 85 (+0%)
backpropBackprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+169.41%)
parsbert-ner🤗 ParsBERT Persian NER Tasks
Stars: ✭ 15 (-82.35%)
ercEmotion recognition in conversation
Stars: ✭ 34 (-60%)
bangla-bertBangla-Bert is a pretrained bert model for Bengali language
Stars: ✭ 41 (-51.76%)
Tokenizers💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Stars: ✭ 5,077 (+5872.94%)
OpenDialogAn Open-Source Package for Chinese Open-domain Conversational Chatbot (中文闲聊对话系统,一键部署微信闲聊机器人)
Stars: ✭ 94 (+10.59%)
Spark NlpState of the Art Natural Language Processing
Stars: ✭ 2,518 (+2862.35%)
Clue中文语言理解测评基准 Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+2752.94%)
classyclassy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (-28.24%)
extractacySpacy pipeline object for extracting values that correspond to a named entity (e.g., birth dates, account numbers, laboratory results)
Stars: ✭ 47 (-44.71%)
neuro-comma🇷🇺 Punctuation restoration production-ready model for Russian language 🇷🇺
Stars: ✭ 46 (-45.88%)
scikitcrf NERPython library for custom entity recognition using Sklearn CRF
Stars: ✭ 17 (-80%)
troveWeakly supervised medical named entity classification
Stars: ✭ 55 (-35.29%)
text2textText2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+121.18%)
gplPowerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (+154.12%)
question generatorAn NLP system for generating reading comprehension questions
Stars: ✭ 188 (+121.18%)
KitanaQAKitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (-31.76%)