Sequence Semantic Embedding
Tools and recipes to train deep learning models and build services for NLP tasks such as text classification, semantic search ranking and recall fetching, cross-lingual information retrieval, and question answering.
Stars: ✭ 435 (+1453.57%)
Books
Books worth spreading
Stars: ✭ 161 (+475%)
Osi.ig
Information Gathering Instagram.
Stars: ✭ 377 (+1246.43%)
Ds2i
A library of inverted index data structures
Stars: ✭ 104 (+271.43%)
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (+1192.86%)
Pwnback
Burp Extender plugin that generates a sitemap of a website using the Wayback Machine
Stars: ✭ 203 (+625%)
Getaltname
Extract subdomains from SSL certificates in HTTPS sites.
Stars: ✭ 320 (+1042.86%)
Flexneuart
Flexible classic and NeurAl Retrieval Toolkit
Stars: ✭ 99 (+253.57%)
Elixir Scrape
Scrape any website, article or RSS/Atom Feed with ease!
Stars: ✭ 306 (+992.86%)
Sf1r Lite
Search Formula-1: a distributed, high-performance massive data engine for enterprise/vertical search
Stars: ✭ 158 (+464.29%)
Allrank
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Stars: ✭ 269 (+860.71%)
Forte
Forte is a flexible and powerful NLP builder FOR TExt. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 89 (+217.86%)
ai-distillery
Automatically modelling and distilling knowledge within AI. In other words, summarising the AI research firehose.
Stars: ✭ 20 (-28.57%)
Trinity
Trinity IR Infrastructure
Stars: ✭ 227 (+710.71%)
SolrConfigExamples
Examples of Solr configuration entries for Solr plugins and Conceptual Search\Semantic Search from Simon Hughes Dice.com
Stars: ✭ 26 (-7.14%)
Pyndri
pyndri is a Python interface to the Indri search engine.
Stars: ✭ 85 (+203.57%)
cdQA-ui
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-32.14%)
Vectorsinsearch
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015.
Stars: ✭ 71 (+153.57%)
autocomplete
Efficient and effective query auto-completion in C++.
Stars: ✭ 28 (+0%)
Rank bm25
A Collection of BM25 Algorithms in Python
Stars: ✭ 187 (+567.86%)
cherche
📑 Neural Search
Stars: ✭ 196 (+600%)
Gaanaapi
Unofficial Gaana API
Stars: ✭ 59 (+110.71%)
wsdm-digg-2020
No description or website provided.
Stars: ✭ 15 (-46.43%)
Tutorial Utilizing Kg
Resources for Tutorial on "Utilizing Knowledge Graphs in Text-centric Information Retrieval"
Stars: ✭ 148 (+428.57%)
information retrieval system
The goal of this project is to implement a basic information retrieval system using Python, NLTK and GenSIM.
Stars: ✭ 25 (-10.71%)
tika-similarity
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Stars: ✭ 92 (+228.57%)
ComposeAE
Official code for WACV 2021 paper - Compositional Learning of Image-Text Query for Image Retrieval
Stars: ✭ 49 (+75%)
Domain discovery tool
This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better understand a domain (or topic) as it is represented on the Web.
Stars: ✭ 33 (+17.86%)
JPQ
CIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup.
Stars: ✭ 39 (+39.29%)
intergo
A package for interleaving / multileaving ranking generation in Go
Stars: ✭ 30 (+7.14%)
Pke
Python Keyphrase Extraction module
Stars: ✭ 855 (+2953.57%)
llda
Labeled LDA in Python
Stars: ✭ 19 (-32.14%)
Vec4ir
Word Embeddings for Information Retrieval
Stars: ✭ 188 (+571.43%)
ir datasets
Provides a common interface to many IR ranking datasets.
Stars: ✭ 190 (+578.57%)
Date Info
API that lets users fetch the events that happen(ed) on a specific date
Stars: ✭ 7 (-75%)
ake-datasets
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
Stars: ✭ 125 (+346.43%)
Rated Ranking Evaluator
Search Quality Evaluation Tool for Apache Solr & Elasticsearch search-based infrastructures
Stars: ✭ 134 (+378.57%)
FieldedSDM
Fielded Sequential Dependence Model (code and runs)
Stars: ✭ 32 (+14.29%)
Fxt
A large scale feature extraction tool for text-based machine learning
Stars: ✭ 25 (-10.71%)
Aquiladb
Drop in solution for Decentralized Neural Information Retrieval. Index latent vectors along with JSON metadata and do efficient k-NN search.
Stars: ✭ 222 (+692.86%)
IP-Tracker
Track any IP address with IP-Tracker. IP-Tracker is developed for Linux and Termux. You can retrieve any IP address information using IP-Tracker.
Stars: ✭ 53 (+89.29%)
cs6101
The Web IR / NLP Group (WING)'s public reading group at the National University of Singapore.
Stars: ✭ 17 (-39.29%)
Dan Jurafsky Chris Manning Nlp
My solution to the Natural Language Processing course taught by Dan Jurafsky and Chris Manning in Winter 2012.
Stars: ✭ 124 (+342.86%)
DRhard
SIGIR'21: Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track.
Stars: ✭ 93 (+232.14%)
Anserini
A Lucene toolkit for replicable information retrieval research
Stars: ✭ 573 (+1946.43%)
tutorials
A tutorial series by Preferred.AI
Stars: ✭ 136 (+385.71%)
K Nrm
K-NRM: End-to-End Neural Ad-hoc Ranking with Kernel Pooling
Stars: ✭ 183 (+553.57%)
Pisa
PISA: Performant Indexes and Search for Academia
Stars: ✭ 489 (+1646.43%)
Deep Semantic Similarity Model
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.
Stars: ✭ 509 (+1717.86%)
sigir19-neural-ir
Source code for: On the Effect of Low-Frequency Terms on Neural-IR Models, SIGIR'19
Stars: ✭ 44 (+57.14%)
Conceptualsearch
Train a Word2Vec model or LSA model, and implement Conceptual Search\Semantic Search in Solr\Lucene - Simon Hughes Dice.com, Dice Tech Jobs
Stars: ✭ 245 (+775%)
Ranknet
My (slightly modified) Keras implementation of RankNet and PyTorch implementation of LambdaRank.
Stars: ✭ 211 (+653.57%)
Ranking
Learning to Rank in TensorFlow
Stars: ✭ 2,362 (+8335.71%)
Scilla
🏴‍☠️ Information Gathering tool 🏴‍☠️ DNS / Subdomains / Ports / Directories enumeration
Stars: ✭ 116 (+314.29%)