Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields a massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577

Stars: ✭ 216 (+132.26%)

Mutual labels: information-retrieval

Rated Ranking Evaluator

Search Quality Evaluation Tool for Apache Solr & Elasticsearch search-based infrastructures

Stars: ✭ 134 (+44.09%)

Mutual labels: information-retrieval

FinBERT-QA

Financial Domain Question Answering with pre-trained BERT Language Model

Stars: ✭ 70 (-24.73%)

Mutual labels: information-retrieval

rust-stemmers

A Rust implementation of some popular Snowball stemming algorithms

Stars: ✭ 85 (-8.6%)

Mutual labels: information-retrieval

sigir19-neural-ir

Source code for: On the Effect of Low-Frequency Terms on Neural-IR Models, SIGIR'19

Stars: ✭ 44 (-52.69%)

Mutual labels: information-retrieval

HAR

Code for WWW2019 paper "A Hierarchical Attention Retrieval Model for Healthcare Question Answering"

Stars: ✭ 22 (-76.34%)

Mutual labels: information-retrieval

Conceptualsearch

Train a Word2Vec or LSA model and implement conceptual search/semantic search in Solr/Lucene - Simon Hughes, Dice.com, Dice Tech Jobs

Stars: ✭ 245 (+163.44%)

Mutual labels: information-retrieval

netizenship

A command-line #OSINT tool to find the online presence of a username on popular social media websites such as Facebook, Instagram, and Twitter.

Stars: ✭ 33 (-64.52%)

Mutual labels: information-retrieval

ml4ir

Machine Learning for Information Retrieval

Stars: ✭ 75 (-19.35%)

Mutual labels: information-retrieval

Ranknet

My (slightly modified) Keras implementation of RankNet and PyTorch implementation of LambdaRank.

Stars: ✭ 211 (+126.88%)

Mutual labels: information-retrieval

beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use; evaluate your models across 15+ diverse IR datasets.

Stars: ✭ 738 (+693.55%)

Mutual labels: information-retrieval

Hdltex

HDLTex: Hierarchical Deep Learning for Text Classification

Stars: ✭ 191 (+105.38%)

Mutual labels: information-retrieval

3d model retriever

Experimenting with a newly published deep learning paper and how it can be used for content-based 3D model retrieval. (info retrieval for CAD)

Stars: ✭ 45 (-51.61%)

Mutual labels: information-retrieval

Openmatch

An Open-Source Package for Information Retrieval.

Stars: ✭ 186 (+100%)

Mutual labels: information-retrieval

solr

Apache Solr open-source search software

Stars: ✭ 651 (+600%)

Mutual labels: information-retrieval

Neuralqa

NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT

Stars: ✭ 185 (+98.92%)

Mutual labels: information-retrieval

tutorials

A tutorial series by Preferred.AI

Stars: ✭ 136 (+46.24%)

Mutual labels: information-retrieval

Ranking

Learning to Rank in TensorFlow

Stars: ✭ 2,362 (+2439.78%)

Mutual labels: information-retrieval

ConvDR

Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"

Stars: ✭ 36 (-61.29%)

Mutual labels: information-retrieval

Bm25

A Python implementation of the BM25 ranking function.

Stars: ✭ 159 (+70.97%)

Mutual labels: information-retrieval

EMNLP2020

Official PyTorch code and datasets for the paper "Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News", EMNLP 2020.

Stars: ✭ 55 (-40.86%)

Mutual labels: information-retrieval

Gensim

Topic Modelling for Humans

Stars: ✭ 12,763 (+13623.66%)

Mutual labels: information-retrieval

crawlzone

Crawlzone is a fast asynchronous internet crawling framework for PHP.

Stars: ✭ 70 (-24.73%)

Mutual labels: web-search

Pyserini

Python interface to the Anserini IR toolkit built on Lucene

Stars: ✭ 148 (+59.14%)

Mutual labels: information-retrieval

ml-nlp-services

Machine learning, deep learning, natural language processing

Stars: ✭ 23 (-75.27%)

Mutual labels: information-retrieval

Invoicenet

Deep neural network to extract intelligent information from invoice documents.

Stars: ✭ 1,886 (+1927.96%)

Mutual labels: information-retrieval

query-wellformedness

25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.

Stars: ✭ 80 (-13.98%)

Mutual labels: information-retrieval

Easyocr

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Stars: ✭ 13,379 (+14286.02%)

Mutual labels: information-retrieval

naacl2018-fever

Fact Extraction and VERification baseline published in NAACL2018

Stars: ✭ 109 (+17.2%)

Mutual labels: information-retrieval

Foundry

The Cognitive Foundry is an open-source Java library for building intelligent systems using machine learning

Stars: ✭ 124 (+33.33%)

Mutual labels: information-retrieval

patzilla

PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.

Stars: ✭ 71 (-23.66%)

Mutual labels: information-retrieval

BM25Transformer

(Python) Transform a document-term matrix to an Okapi/BM25 representation

Stars: ✭ 50 (-46.24%)

Mutual labels: information-retrieval

src

tools for fast reading of docs

Stars: ✭ 40 (-56.99%)

Mutual labels: information-retrieval

AILA-Artificial-Intelligence-for-Legal-Assistance

Python implementations of the various methods used in the FIRE 2019 conference.