FinBERT-QA
Financial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (+62.79%)
gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (+402.33%)
SWDM
SIGIR 2017: Embedding-based query expansion for weighted sequential dependence retrieval model
Stars: ✭ 35 (-18.6%)
cdQA-ui
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-55.81%)
beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Stars: ✭ 738 (+1616.28%)
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+337.21%)
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+7827.91%)
TwinBert
PyTorch implementation of the TwinBert paper
Stars: ✭ 36 (-16.28%)
solr
Apache Solr open-source search software
Stars: ✭ 651 (+1413.95%)
OpenUE
OpenUE is a lightweight knowledge graph extraction toolkit (An Open Toolkit for Universal Extraction from Text, published at EMNLP 2020: https://aclanthology.org/2020.emnlp-demos.1.pdf)
Stars: ✭ 274 (+537.21%)
ImageRetrieval
Content-Based Image Retrieval Techniques (e.g. k-NN, SVM, using a MATLAB GUI)
Stars: ✭ 51 (+18.6%)
neuro-comma
🇷🇺 Punctuation restoration production-ready model for Russian language 🇷🇺
Stars: ✭ 46 (+6.98%)
cmrc2019
A Sentence Cloze Dataset for Chinese Machine Reading Comprehension (CMRC 2019)
Stars: ✭ 118 (+174.42%)
R-AT
Regularized Adversarial Training
Stars: ✭ 19 (-55.81%)
BERTOverflow
A Pre-trained BERT on StackOverflow Corpus
Stars: ✭ 40 (-6.98%)
bert-sentiment
Fine-grained Sentiment Classification Using BERT
Stars: ✭ 49 (+13.95%)
COVID19-IRQA
No description or website provided.
Stars: ✭ 32 (-25.58%)
AliceMind
ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
Stars: ✭ 1,479 (+3339.53%)
vietnamese-roberta
A Robustly Optimized BERT Pretraining Approach for Vietnamese
Stars: ✭ 22 (-48.84%)
TradeTheEvent
Implementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading." In Findings of ACL2021
Stars: ✭ 64 (+48.84%)
pn-summary
A well-structured summarization dataset for the Persian language!
Stars: ✭ 29 (-32.56%)
npo classifier
Automated coding using machine-learning and remapping the U.S. nonprofit sector: A guide and benchmark
Stars: ✭ 18 (-58.14%)
Fill-the-GAP
[ACL-WS] 4th place solution to gendered pronoun resolution challenge on Kaggle
Stars: ✭ 13 (-69.77%)
rust-stemmers
A Rust implementation of some popular snowball stemming algorithms
Stars: ✭ 85 (+97.67%)
IR-exercises
Solutions of the various test exams of the Information Retrieval course
Stars: ✭ 28 (-34.88%)
wisdomify
A BERT-based reverse dictionary of Korean proverbs
Stars: ✭ 95 (+120.93%)
LuceneTutorial
A simple tutorial of Lucene for LIS 501 Introduction to Text Mining students at the University of Wisconsin-Madison (Fall 2021).
Stars: ✭ 62 (+44.19%)
query-wellformedness
25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.
Stars: ✭ 80 (+86.05%)
ProQA
Progressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval
Stars: ✭ 44 (+2.33%)
Kaleido-BERT
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain.
Stars: ✭ 252 (+486.05%)
bert attn viz
Visualize BERT's self-attention layers on text classification tasks
Stars: ✭ 41 (-4.65%)
py-lingualytics
A text analytics library with support for codemixed data
Stars: ✭ 36 (-16.28%)
LAMB Optimizer TF
LAMB Optimizer for Large Batch Training (TensorFlow version)
Stars: ✭ 119 (+176.74%)
patzilla
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Stars: ✭ 71 (+65.12%)
TriB-QA
We are serious about bragging
Stars: ✭ 45 (+4.65%)
pqlite
⚡ A fast embedded library for approximate nearest neighbor search
Stars: ✭ 141 (+227.91%)
netizenship
A command-line #OSINT tool to find the online presence of a username on popular social media websites like Facebook, Instagram, Twitter, etc.
Stars: ✭ 33 (-23.26%)
DrFAQ
DrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-32.56%)
TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
Stars: ✭ 209 (+386.05%)
ConvDR
Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"
Stars: ✭ 36 (-16.28%)
AiSpace
AiSpace: Better practices for deep learning model development and deployment for TensorFlow 2.0
Stars: ✭ 28 (-34.88%)
sigir19-neural-ir
Source code for: On the Effect of Low-Frequency Terms on Neural-IR Models, SIGIR'19
Stars: ✭ 44 (+2.33%)
sister
SImple SenTence EmbeddeR
Stars: ✭ 66 (+53.49%)
Cool-NLPCV
Some Cool NLP and CV Repositories and Solutions (a collection of open-source solutions, datasets, tools, and learning materials for common NLP tasks)
Stars: ✭ 143 (+232.56%)
NLPDataAugmentation
Chinese NLP Data Augmentation, BERT Contextual Augmentation
Stars: ✭ 94 (+118.6%)
naacl2018-fever
Fact Extraction and VERification baseline published in NAACL2018
Stars: ✭ 109 (+153.49%)