policy-data-analyzerBuilding a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-97.02%)
Haystack🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+361.92%)
text2textText2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (-74.53%)
question generatorAn NLP system for generating reading comprehension questions
Stars: ✭ 188 (-74.53%)
BERT-QECode and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking".
Stars: ✭ 43 (-94.17%)
nalcosSearch Git commits in natural language
Stars: ✭ 50 (-93.22%)
FinBERT-QAFinancial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (-90.51%)
OpenDialogAn Open-Source Package for Chinese Open-domain Conversational Chatbot (中文闲聊对话系统,一键部署微信闲聊机器人)
Stars: ✭ 94 (-87.26%)
Transformer-QG-on-SQuADImplement Question Generator with SOTA pre-trained Language Models (RoBERTa, BERT, GPT, BART, T5, etc.)
Stars: ✭ 28 (-96.21%)
gplPowerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (-70.73%)
FieldedSDMFielded Sequential Dependence Model (code and runs)
Stars: ✭ 32 (-95.66%)
AnnA Anki neuronal AppendixUsing machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity
Stars: ✭ 39 (-94.72%)
cherche📑 Neural Search
Stars: ✭ 196 (-73.44%)
cdQA-ui⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-97.43%)
question-generationNeural Models for Key Phrase Detection and Question Generation
Stars: ✭ 29 (-96.07%)
neuro-comma🇷🇺 Punctuation restoration production-ready model for Russian language 🇷🇺
Stars: ✭ 46 (-93.77%)
FocusSeq2Seq[EMNLP 2019] Mixture Content Selection for Diverse Sequence Generation (Question Generation / Abstractive Summarization)
Stars: ✭ 109 (-85.23%)
pn-summaryA well-structured summarization dataset for the Persian language!
Stars: ✭ 29 (-96.07%)
palladianPalladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.
Stars: ✭ 32 (-95.66%)
DrFAQDrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-96.07%)
cmrc2019A Sentence Cloze Dataset for Chinese Machine Reading Comprehension (CMRC 2019)
Stars: ✭ 118 (-84.01%)
AiSpaceAiSpace: Better practices for deep learning model development and deployment For Tensorflow 2.0
Stars: ✭ 28 (-96.21%)
unsupervised-qaTemplate-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering
Stars: ✭ 47 (-93.63%)
ImageRetrievalContent Based Image Retrieval Techniques (e.g. knn, svm using MatLab GUI)
Stars: ✭ 51 (-93.09%)
pqlite⚡ A fast embedded library for approximate nearest neighbor search
Stars: ✭ 141 (-80.89%)
TradeTheEventImplementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading." In Findings of ACL2021
Stars: ✭ 64 (-91.33%)
TwinBertpytorch implementation of the TwinBert paper
Stars: ✭ 36 (-95.12%)
sigir19-neural-irSource code for: On the Effect of Low-Frequency Terms on Neural-IR Models, SIGIR'19
Stars: ✭ 44 (-94.04%)
BERTOverflowA Pre-trained BERT on StackOverflow Corpus
Stars: ✭ 40 (-94.58%)
npo classifierAutomated coding using machine-learning and remapping the U.S. nonprofit sector: A guide and benchmark
Stars: ✭ 18 (-97.56%)
wisdomifyA BERT-based reverse dictionary of Korean proverbs
Stars: ✭ 95 (-87.13%)
IR-exercisesSolutions of the various test exams of the Information Retrieval course
Stars: ✭ 28 (-96.21%)
query-wellformedness25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.
Stars: ✭ 80 (-89.16%)
LAMB Optimizer TFLAMB Optimizer for Large Batch Training (TensorFlow version)
Stars: ✭ 119 (-83.88%)
ConvDRCode repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"
Stars: ✭ 36 (-95.12%)
OpenUEOpenUE是一个轻量级知识图谱抽取工具 (An Open Toolkit for Universal Extraction from Text published at EMNLP2020: https://aclanthology.org/2020.emnlp-demos.1.pdf)
Stars: ✭ 274 (-62.87%)
Cool-NLPCVSome Cool NLP and CV Repositories and Solutions (收集NLP中常见任务的开源解决方案、数据集、工具、学习资料等)
Stars: ✭ 143 (-80.62%)
solrApache Solr open-source search software
Stars: ✭ 651 (-11.79%)
ComposeAEOfficial code for WACV 2021 paper - Compositional Learning of Image-Text Query for Image Retrieval
Stars: ✭ 49 (-93.36%)
Kaleido-BERT(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain.
Stars: ✭ 252 (-65.85%)
ConceptualsearchTrain a Word2Vec model or LSA model, and Implement Conceptual Search\Semantic Search in Solr\Lucene - Simon Hughes Dice.com, Dice Tech Jobs
Stars: ✭ 245 (-66.8%)
sisterSImple SenTence EmbeddeR
Stars: ✭ 66 (-91.06%)
TrinityTrinity IR Infrastructure
Stars: ✭ 227 (-69.24%)
CatalystAccelerated deep learning R&D
Stars: ✭ 2,804 (+279.95%)
AquiladbDrop in solution for Decentralized Neural Information Retrieval. Index latent vectors along with JSON metadata and do efficient k-NN search.
Stars: ✭ 222 (-69.92%)
AliceMindALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
Stars: ✭ 1,479 (+100.41%)
NLPDataAugmentationChinese NLP Data Augmentation, BERT Contextual Augmentation
Stars: ✭ 94 (-87.26%)
RanknetMy (slightly modified) Keras implementation of RankNet and PyTorch implementation of LambdaRank.
Stars: ✭ 211 (-71.41%)
PwnbackBurp Extender plugin that generates a sitemap of a website using Wayback Machine
Stars: ✭ 203 (-72.49%)
HdltexHDLTex: Hierarchical Deep Learning for Text Classification
Stars: ✭ 191 (-74.12%)