JPQCIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup.
Stars: ✭ 39 (-58.06%)
AquiladbDrop in solution for Decentralized Neural Information Retrieval. Index latent vectors along with JSON metadata and do efficient k-NN search.
Stars: ✭ 222 (+138.71%)
K NrmK-NRM: End-to-End Neural Ad-hoc Ranking with Kernel Pooling
Stars: ✭ 183 (+96.77%)
ComposeAEOfficial code for WACV 2021 paper - Compositional Learning of Image-Text Query for Image Retrieval
Stars: ✭ 49 (-47.31%)
CatalystAccelerated deep learning R&D
Stars: ✭ 2,804 (+2915.05%)
Rank bm25A Collection of BM25 Algorithms in Python
Stars: ✭ 187 (+101.08%)
MixGCFMixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems, KDD2021
Stars: ✭ 73 (-21.51%)
Sf1r LiteSearch Formula-1——A distributed high performance massive data engine for enterprise/vertical search
Stars: ✭ 158 (+69.89%)
ImageRetrievalContent Based Image Retrieval Techniques (e.g. knn, svm using MatLab GUI)
Stars: ✭ 51 (-45.16%)
Tutorial Utilizing KgResources for Tutorial on "Utilizing Knowledge Graphs in Text-centric Information Retrieval"
Stars: ✭ 148 (+59.14%)
IR-exercisesSolutions of the various test exams of the Information Retrieval course
Stars: ✭ 28 (-69.89%)
COVID19-IRQANo description or website provided.
Stars: ✭ 32 (-65.59%)
TrinityTrinity IR Infrastructure
Stars: ✭ 227 (+144.09%)
kexKex is a python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public datasets.
Stars: ✭ 46 (-50.54%)
PwnbackBurp Extender plugin that generates a sitemap of a website using Wayback Machine
Stars: ✭ 203 (+118.28%)
LuceneTutorialA simple tutorial of Lucene for LIS 501 Introduction to Text Mining students at the University of Wisconsin-Madison (Fall 2021).
Stars: ✭ 62 (-33.33%)
Vec4irWord Embeddings for Information Retrieval
Stars: ✭ 188 (+102.15%)
GNN-Recommender-SystemsAn index of recommendation algorithms that are based on Graph Neural Networks.
Stars: ✭ 505 (+443.01%)
BooksBooks worth spreading
Stars: ✭ 161 (+73.12%)
perkeA keyphrase extractor for Persian
Stars: ✭ 60 (-35.48%)
BERT-QECode and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking".
Stars: ✭ 43 (-53.76%)
gplPowerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (+132.26%)
Rated Ranking EvaluatorSearch Quality Evaluation Tool for Apache Solr & Elasticsearch search-based infrastructures
Stars: ✭ 134 (+44.09%)
FinBERT-QAFinancial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (-24.73%)
rust-stemmersA rust implementation of some popular snowball stemming algorithms
Stars: ✭ 85 (-8.6%)
sigir19-neural-irSource code for: On the Effect of Low-Frequency Terms on Neural-IR Models, SIGIR'19
Stars: ✭ 44 (-52.69%)
HARCode for WWW2019 paper "A Hierarchical Attention Retrieval Model for Healthcare Question Answering"
Stars: ✭ 22 (-76.34%)
ConceptualsearchTrain a Word2Vec model or LSA model, and Implement Conceptual Search\Semantic Search in Solr\Lucene - Simon Hughes Dice.com, Dice Tech Jobs
Stars: ✭ 245 (+163.44%)
netizenshipa commandline #OSINT tool to find the online presence of a username in popular social media websites like Facebook, Instagram, Twitter, etc.
Stars: ✭ 33 (-64.52%)
ml4irMachine Learning for Information Retrieval
Stars: ✭ 75 (-19.35%)
RanknetMy (slightly modified) Keras implementation of RankNet and PyTorch implementation of LambdaRank.
Stars: ✭ 211 (+126.88%)
beirA Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Stars: ✭ 738 (+693.55%)
HdltexHDLTex: Hierarchical Deep Learning for Text Classification
Stars: ✭ 191 (+105.38%)
3d model retrieverExperimenting with a newly published deep learning paper and how it can be used for content-based 3D model retrieval. (info retrieval for CAD)
Stars: ✭ 45 (-51.61%)
OpenmatchAn Open-Source Package for Information Retrieval.
Stars: ✭ 186 (+100%)
solrApache Solr open-source search software
Stars: ✭ 651 (+600%)
NeuralqaNeuralQA: A Usable Library for Question Answering on Large Datasets with BERT
Stars: ✭ 185 (+98.92%)
tutorialsA tutorial series by Preferred.AI
Stars: ✭ 136 (+46.24%)
RankingLearning to Rank in TensorFlow
Stars: ✭ 2,362 (+2439.78%)
ConvDRCode repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"
Stars: ✭ 36 (-61.29%)
Bm25A Python implementation of the BM25 ranking function.
Stars: ✭ 159 (+70.97%)
EMNLP2020This is official Pytorch code and datasets of the paper "Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News", EMNLP 2020.
Stars: ✭ 55 (-40.86%)
GensimTopic Modelling for Humans
Stars: ✭ 12,763 (+13623.66%)
crawlzoneCrawlzone is a fast asynchronous internet crawling framework for PHP.
Stars: ✭ 70 (-24.73%)
PyseriniPython interface to the Anserini IR toolkit built on Lucene
Stars: ✭ 148 (+59.14%)
InvoicenetDeep neural network to extract intelligent information from invoice documents.
Stars: ✭ 1,886 (+1927.96%)
query-wellformedness25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.
Stars: ✭ 80 (-13.98%)
EasyocrReady-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
Stars: ✭ 13,379 (+14286.02%)
naacl2018-feverFact Extraction and VERification baseline published in NAACL2018
Stars: ✭ 109 (+17.2%)
FoundryThe Cognitive Foundry is an open-source Java library for building intelligent systems using machine learning
Stars: ✭ 124 (+33.33%)
patzillaPatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Stars: ✭ 71 (-23.66%)
BM25Transformer(Python) transform a document-term matrix to an Okapi/BM25 representation
Stars: ✭ 50 (-46.24%)
srctools for fast reading of docs
Stars: ✭ 40 (-56.99%)
ProQAProgressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval
Stars: ✭ 44 (-52.69%)