silky / deep-scite

Licence: other
🚣 A simple recommendation engine (by way of convolutions and embeddings) written in TensorFlow

Projects that are alternatives to or similar to deep-scite

word-embeddings-from-scratch
Creating word embeddings from scratch and visualize them on TensorBoard. Using trained embeddings in Keras.
Stars: ✭ 22 (+10%)
Mutual labels:  embeddings, tensorboard
deep-char-cnn-lstm
Deep Character CNN LSTM Encoder with Classification and Similarity Models
Stars: ✭ 20 (+0%)
Mutual labels:  embeddings
Simple chat bot
Simple nlp chatbot
Stars: ✭ 23 (+15%)
Mutual labels:  embeddings
codesnippetsearch
Neural bag of words code search implementation using PyTorch and data from the CodeSearchNet project.
Stars: ✭ 67 (+235%)
Mutual labels:  embeddings
graphml-tutorials
Tutorials for Machine Learning on Graphs
Stars: ✭ 125 (+525%)
Mutual labels:  embeddings
Archived-SANSA-ML
SANSA Machine Learning Layer
Stars: ✭ 39 (+95%)
Mutual labels:  embeddings
TCE
This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).
Stars: ✭ 51 (+155%)
Mutual labels:  embeddings
VarCLR
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning
Stars: ✭ 30 (+50%)
Mutual labels:  embeddings
mloperator
Machine Learning Operator & Controller for Kubernetes
Stars: ✭ 85 (+325%)
Mutual labels:  tensorboard
flor
FLOR: Fast Low-Overhead Recovery. FLOR lets you log ML training data post-hoc, with hindsight.
Stars: ✭ 123 (+515%)
Mutual labels:  tensorboard
info-retrieval
Information Retrieval in High Dimensional Data (class deliverables)
Stars: ✭ 33 (+65%)
Mutual labels:  embeddings
RadiologyReportEmbedding
Intelligent Word Embeddings of Free-Text Radiology Reports
Stars: ✭ 22 (+10%)
Mutual labels:  embeddings
datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (+0%)
Mutual labels:  embeddings
spark-convolution-patch
Convolution and other super-patches (blur, sharpen)
Stars: ✭ 74 (+270%)
Mutual labels:  convolution
Deep-Learning-Experiments-implemented-using-Google-Colab
Colab Compatible FastAI notebooks for NLP and Computer Vision Datasets
Stars: ✭ 16 (-20%)
Mutual labels:  embeddings
labml
🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
Stars: ✭ 1,213 (+5965%)
Mutual labels:  tensorboard
tfsum
Enable TensorBoard for TensorFlow Go API
Stars: ✭ 32 (+60%)
Mutual labels:  tensorboard
navec
Compact high quality word embeddings for Russian language
Stars: ✭ 118 (+490%)
Mutual labels:  embeddings
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+4005%)
Mutual labels:  embeddings
minirocket
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification
Stars: ✭ 166 (+730%)
Mutual labels:  convolution

DeepScite - A Simple Convolutional-based Recommendation Model

Ocean Credit: https://www.flickr.com/photos/radhika_bhagwat/

Overview

DeepScite takes in papers (titles and abstracts) and recommends whether or not each one should be scited by the particular user whose data was used for training (in the case of this repo, that user is me).

As output, it also gives a "goodness" score for each word. When this number is high, the word has contributed strongly to the paper being recommended for sciting; when it is negative, the word has contributed strongly to the paper not being recommended.
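
To make the idea of per-word scores concrete, here is a minimal sketch in plain Python. It assumes each word is looked up in an embedding table, a width-3 convolution filter slides over the embeddings to produce one score per word, and the summed scores are squashed into a probability. The embedding values, the filter weights, and the names `word_scores` and `recommend_probability` are all hypothetical and are not taken from this repository's TensorFlow code.

```python
import math

# Hypothetical 2-dimensional embeddings; a trained model would learn these.
EMBEDDINGS = {
    "quantum": [0.9, 0.1],
    "error": [0.4, 0.3],
    "correction": [0.6, 0.2],
    "survey": [-0.7, 0.5],
}
UNKNOWN = [0.0, 0.0]  # used for padding and out-of-vocabulary words

# Hypothetical convolution filter spanning 3 words (one weight vector per position).
FILTER = [[0.2, 0.0], [1.0, -0.5], [0.2, 0.0]]


def word_scores(tokens):
    """Return one score per token via a width-3 convolution over the embeddings."""
    vectors = [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]
    padded = [UNKNOWN] + vectors + [UNKNOWN]  # zero padding keeps one score per word
    scores = []
    for i in range(len(tokens)):
        window = padded[i:i + 3]
        score = sum(
            w * x
            for weights, vector in zip(FILTER, window)
            for w, x in zip(weights, vector)
        )
        scores.append(score)
    return scores


def recommend_probability(scores):
    """Squash the summed word scores into a probability with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-sum(scores)))


scores = word_scores(["quantum", "error", "correction"])
probability = recommend_probability(scores)
```

In this sketch a positive score pushes the probability towards recommending the paper and a negative score pushes it away, which matches the reading of the "goodness" score described above.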

Below are some example outputs of the system:

Blue text marks the words which are "good", and red text marks those which are "bad".
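
The colouring described above could be produced along the following lines. This is only a sketch: `colour_words` is a hypothetical helper, and the choice to treat a score of zero as "good" is an assumption, not something the repository's report generator is known to do.

```python
import html


def colour_words(tokens, scores):
    """Wrap each token in a span: blue for non-negative scores, red for negative."""
    spans = []
    for token, score in zip(tokens, scores):
        colour = "blue" if score >= 0 else "red"
        # Escape the token so that characters such as "<" cannot break the markup.
        spans.append('<span style="color: {}">{}</span>'.format(colour, html.escape(token)))
    return " ".join(spans)


snippet = colour_words(["quantum", "survey"], [0.85, -0.95])
```

The resulting string can be dropped into any HTML page to give a view similar in spirit to the report.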

Installation

  1. Clone this repository:

    git clone https://github.com/silky/deep-scite.git

  2. Use conda (or virtualenv) to create an environment that has Python 3.5:

    conda create -n deep-scite python=3.5

  3. Activate the environment:

    source activate deep-scite

  4. Install the requirements:

    pip install -r requirements.txt

  5. Install the nltk language packs. We use the nltk package to tokenise strings, and it requires some data to be downloaded before it can be used. To do so, run:

    python -c 'import nltk; nltk.download("punkt")'

  6. Install this library in develop mode:

    python setup.py develop
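
As a quick check that the nltk data from the installation steps is in place, tokenisation can be tried directly. The sketch below uses `nltk.word_tokenize`; the `tokenise` wrapper and its regular-expression fallback are hypothetical conveniences added here so the snippet still runs when nltk or its data is missing, and they are not part of this repository.

```python
import re


def tokenise(text):
    """Split text into word and punctuation tokens."""
    try:
        import nltk
        # Raises LookupError if the tokeniser data has not been downloaded.
        return nltk.word_tokenize(text)
    except (ImportError, LookupError):
        # Assumed fallback: runs of word characters, or single punctuation marks.
        return re.findall(r"\w+|[^\w\s]", text)


tokens = tokenise("A simple convolutional model.")
```

If the fallback is taken, rerunning the nltk download command from the installation steps is the likely fix.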

Usage

From the root directory of this project:

  1. Activate the deep-scite environment:

    source activate deep-scite

  2. Train the model on the noon data set and emit recommendations:

    ./bin/run_model.py

     This will run through the steps defined in model.yaml.

  3. Open ./data/noon/report.html in your browser and observe the recommendations.

Misc

You can play around with the embedding by looking at it in TensorBoard. Run TensorBoard with:

    tensorboard --logdir /tmp/tf-checkpoints/deepscite-noon

Then click on the "Embedding" tab.

![](images/embedding.gif)