
ucasir / Nprf

Licence: apache-2.0

NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval


Labels

neural-network information-retrieval


NPRF

NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval [pdf]

If you use the code, please cite the following paper:

@inproceedings{li2018nprf,
  title={NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval},
  author={Li, Canjia and Sun, Yingfei and He, Ben and Wang, Le and Hui, Kai and Yates, Andrew and Sun, Le and Xu, Jungang},
  booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
  year={2018}
}

Requirements

Tensorflow
Keras
gensim
numpy

Getting started

Training data preparation

To select the top-k terms from the top-n documents, first extract the document frequency of each term from the index. Then generate the similarity matrix between the query and each document using pre-trained word embeddings (e.g. word2vec). The related functions can be found in preprocess/prepare_d2d.py.
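The two preparation steps can be sketched roughly as follows. This is a minimal illustration and not the code in preprocess/prepare_d2d.py: the function names top_k_terms and similarity_matrix, the tf-idf weighting used to rank terms, the cosine similarity measure, and the plain dict standing in for a pre-trained embedding are all assumptions made for the example.

```python
import math
from collections import Counter

import numpy as np


def top_k_terms(doc_tokens, doc_freq, num_docs, k):
    """Return the k terms of one document with the highest tf-idf weight.

    Hypothetical helper: tf-idf is an assumed ranking criterion.
    """
    tf = Counter(doc_tokens)

    def weight(term):
        # Smoothed idf; terms missing from doc_freq are treated as df = 0.
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        return tf[term] * idf

    # Highest weight first; ties are broken alphabetically.
    return sorted(tf, key=lambda t: (-weight(t), t))[:k]


def similarity_matrix(query_terms, doc_terms, embeddings):
    """Cosine similarity between every query term and every document term.

    `embeddings` is a dict mapping term -> vector (an assumption; a word2vec
    model would have to be converted to this form first).
    """
    dim = len(next(iter(embeddings.values())))

    def unit(term):
        # Out-of-vocabulary terms become zero vectors, giving similarity 0.
        vec = np.asarray(embeddings.get(term, np.zeros(dim)), dtype=float)
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

    q = np.stack([unit(t) for t in query_terms])
    d = np.stack([unit(t) for t in doc_terms])
    return q @ d.T  # shape: (len(query_terms), len(doc_terms))
```

The resulting matrix has one row per query term and one column per document term, which is the usual input shape for interaction-based neural ranking models.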

Training meta data preparation

We introduce two classes to make training easier. The class Relevance incorporates the relevance information from the baseline and the qrels file. The class Result simplifies read/write operations on standard TREC result files. Other information, such as query idf, is dumped as a pickle file.
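For orientation, reading and writing a standard TREC result file might look like the sketch below. It does not reproduce the repository's Result class; write_trec_run, read_trec_run and the run tag "nprf_sketch" are hypothetical names. The only thing taken as given is the standard six-column TREC run layout: query id, the literal Q0, document id, rank, score, run tag.

```python
from collections import defaultdict


def write_trec_run(path, ranking, run_tag="nprf_sketch"):
    """Write {qid: [(docid, score), ...]} as 'qid Q0 docid rank score tag' lines."""
    with open(path, "w") as f:
        for qid in sorted(ranking):
            # Rank documents by descending score within each query.
            ranked = sorted(ranking[qid], key=lambda pair: -pair[1])
            for rank, (docid, score) in enumerate(ranked, start=1):
                f.write(f"{qid} Q0 {docid} {rank} {score:.6f} {run_tag}\n")


def read_trec_run(path):
    """Read a TREC run file back into {qid: [(docid, score), ...]}."""
    ranking = defaultdict(list)
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 6:
                continue  # skip blank or malformed lines
            qid, _, docid, _, score, _ = parts
            ranking[qid].append((docid, float(score)))
    return dict(ranking)
```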

Model training

Configure the MODEL_config.py file, then run

python MODEL.py --fold fold_number temp_file_path

You need to run 5-fold cross-validation, which can be done automatically by running the runfold.sh script. The temp file is a temporary file to which the results on the validation set are written in TREC format. A sample training log from the first fold of the TREC 1-3 dataset is provided for reference; see sample_log.
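As a rough illustration of what looping over the folds involves, the sketch below builds one command per fold in the form shown above. It is not the content of runfold.sh: the function fold_commands is hypothetical, and numbering the folds 1 to 5 is an assumption (the script may well count from 0).

```python
def fold_commands(model_script="MODEL.py", temp_file="temp_file_path",
                  num_folds=5, first_fold=1):
    """Build one training command per fold as an argument list.

    first_fold=1 is an assumption; adjust it to match the fold numbering
    that the model script actually expects.
    """
    return [
        ["python", model_script, "--fold", str(fold), temp_file]
        for fold in range(first_fold, first_fold + num_folds)
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in fold_commands():
        print(" ".join(cmd))
```

To actually launch the runs, each argument list could be passed to subprocess.run(cmd, check=True) from the standard library.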

Evaluation

After training, the evaluation results for each fold are kept in the result path specified in the MODEL_config.py file. Run cat *res >> merge_file to merge the results from all folds, then run the trec_eval script on the merged file to evaluate your model.
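The merge step can also be done from Python, as in the sketch below. The function merge_fold_results is a hypothetical helper, and the "*res" pattern simply mirrors the cat command above. Unlike cat with >>, which appends, this version overwrites the merge file, so repeated runs do not duplicate lines.

```python
import glob
import os


def merge_fold_results(result_dir, merge_file, pattern="*res"):
    """Concatenate all per-fold result files in result_dir into merge_file.

    Returns the list of files that were merged, in sorted order.
    """
    paths = sorted(glob.glob(os.path.join(result_dir, pattern)))
    merge_abs = os.path.abspath(merge_file)
    # Never read the output file back into itself.
    paths = [p for p in paths if os.path.abspath(p) != merge_abs]
    with open(merge_file, "w") as out:
        for path in paths:
            with open(path) as f:
                for line in f:
                    if line.strip():  # drop blank lines
                        out.write(line if line.endswith("\n") else line + "\n")
    return paths
```

The merged file can then be scored with trec_eval, which is normally invoked with the qrels file first and the result file second.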

Reference

Some snippets of the code follow the implementations of K-NRM and MatchZoo.
