etlundquist / Rankfm

Licence: gpl-3.0

Factorization Machines for Recommendation and Ranking Problems with Implicit Feedback Data

Programming Languages

python

Labels

machine-learning recommender-system collaborative-filtering factorization-machines

Projects that are alternatives of or similar to Rankfm

Rsparse

Fast and accurate machine learning on sparse matrices - matrix factorizations, regression, classification, top-N recommendations.

Stars: ✭ 145 (+104.23%)

Mutual labels: recommender-system, collaborative-filtering, factorization-machines

Recommendation.jl

Building recommender systems in Julia

Stars: ✭ 42 (-40.85%)

Mutual labels: collaborative-filtering, recommender-system, factorization-machines

Daisyrec

A recommender system under development in PyTorch. Algorithms: KNN, LFM, SLIM, NeuMF, FM, DeepFM, VAE and so on. It aims at fair comparison of recommender system benchmarks.

Stars: ✭ 280 (+294.37%)

Mutual labels: recommender-system, collaborative-filtering, factorization-machines

Openlearning4deeprecsys

Some deep learning based recsys for open learning.

Stars: ✭ 383 (+439.44%)

Mutual labels: recommender-system, factorization-machines

Attentional Neural Factorization Machine

Attention, Factorization Machine, Deep Learning, Recommender System

Stars: ✭ 39 (-45.07%)

Mutual labels: recommender-system, factorization-machines

Rspapers

A Curated List of Must-read Papers on Recommender System.

Stars: ✭ 4,140 (+5730.99%)

Mutual labels: recommender-system, collaborative-filtering

Cornac

A Comparative Framework for Multimodal Recommender Systems

Stars: ✭ 308 (+333.8%)

Mutual labels: recommender-system, collaborative-filtering

Neural graph collaborative filtering

Neural Graph Collaborative Filtering, SIGIR2019

Stars: ✭ 517 (+628.17%)

Mutual labels: recommender-system, collaborative-filtering

Neural factorization machine

TensorFlow Implementation of Neural Factorization Machine

Stars: ✭ 422 (+494.37%)

Mutual labels: recommender-system, factorization-machines

Newsrecommendsystem

Personalized news recommendation system. A news recommendation system involving collaborative filtering, content-based recommendation and hot news recommendation; it can be adapted easily for use in other circumstances.

Stars: ✭ 557 (+684.51%)

Mutual labels: recommender-system, collaborative-filtering

Consimilo

A Clojure library for querying large data-sets on similarity

Stars: ✭ 54 (-23.94%)

Mutual labels: recommender-system, collaborative-filtering

Recoder

Large scale training of factorization models for Collaborative Filtering with PyTorch

Stars: ✭ 46 (-35.21%)

Mutual labels: recommender-system, collaborative-filtering

Collaborative Deep Learning For Recommender Systems

The hybrid model combining stacked denoising autoencoder with matrix factorization is applied, to predict the customer purchase behavior in the future month according to the purchase history and user information in the Santander dataset.

Stars: ✭ 60 (-15.49%)

Mutual labels: recommender-system, collaborative-filtering

Attentional factorization machine

TensorFlow Implementation of Attentional Factorization Machine

Stars: ✭ 362 (+409.86%)

Mutual labels: recommender-system, factorization-machines

Recommendation Systems Paperlist

Papers about recommendation systems that I am interested in

Stars: ✭ 308 (+333.8%)

Mutual labels: recommender-system, collaborative-filtering

Pytorch Fm

Factorization Machine models in PyTorch

Stars: ✭ 455 (+540.85%)

Mutual labels: collaborative-filtering, factorization-machines

Fastfm

fastFM: A Library for Factorization Machines

Stars: ✭ 908 (+1178.87%)

Mutual labels: recommender-system, factorization-machines

Recsys19 hybridsvd

Accompanying code for reproducing experiments from the HybridSVD paper. Preprint is available at https://arxiv.org/abs/1802.06398.

Stars: ✭ 23 (-67.61%)

Mutual labels: recommender-system, collaborative-filtering

Deepmatch

A deep matching model library for recommendations & advertising. It's easy to train models and to export representation vectors which can be used for ANN search.

Stars: ✭ 1,051 (+1380.28%)

Mutual labels: collaborative-filtering, factorization-machines

Summary Of Recommender System Papers

A categorized summary of the recommender system papers I have read, continuously updated…

Stars: ✭ 288 (+305.63%)

Mutual labels: recommender-system, collaborative-filtering

RankFM

RankFM is a python implementation of the general Factorization Machines model class adapted for collaborative filtering recommendation/ranking problems with implicit feedback user/item interaction data. It uses Bayesian Personalized Ranking (BPR) and a variant of Weighted Approximate-Rank Pairwise (WARP) loss to learn model weights via Stochastic Gradient Descent (SGD). It can (optionally) incorporate sample weights and user/item auxiliary features to augment the main interaction data.

The core (training, prediction, recommendation) methods are written in Cython, making it possible to scale to millions of user/item interactions. Designed for ease-of-use, RankFM accepts both pd.DataFrame and np.ndarray inputs - you do not have to convert your data to scipy.sparse matrices or re-map user/item identifiers prior to use. RankFM internally maps all user/item identifiers to zero-based integer indexes, but always converts its output back to the original user/item identifiers from your data, which can be arbitrary (non-zero-based, non-consecutive) integers or even strings.
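The identifier mapping described above can be pictured with a small plain-Python sketch. This is not RankFM's internal code: `build_index` and the sample identifiers are hypothetical, and the only point is to show how arbitrary IDs can be mapped to zero-based indexes and back.

```python
def build_index(raw_ids):
    """Map arbitrary hashable identifiers to zero-based integer indexes."""
    id_to_index = {}
    for raw_id in raw_ids:
        # assign the next free index the first time an identifier is seen
        id_to_index.setdefault(raw_id, len(id_to_index))
    # reverse lookup used to convert model output back to the original IDs
    index_to_id = {index: raw_id for raw_id, index in id_to_index.items()}
    return id_to_index, index_to_id


# hypothetical interactions: string user IDs, non-consecutive integer item IDs
interactions = [("u-17", 233), ("u-03", 377), ("u-17", 610)]

user_to_index, index_to_user = build_index(user for user, _ in interactions)
item_to_index, index_to_item = build_index(item for _, item in interactions)

# encode for training, then decode back to the caller's identifiers
encoded = [(user_to_index[u], item_to_index[i]) for u, i in interactions]
decoded = [(index_to_user[u], index_to_item[i]) for u, i in encoded]
```

The round trip leaves the caller's identifiers untouched, which is why the output can always be reported in terms of the original user/item values.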

In addition to the familiar fit(), predict(), recommend() methods, RankFM includes additional utilities similar_users() and similar_items() to find the most similar users/items to a given user/item based on latent factor space embeddings. A number of popular recommendation/ranking evaluation metric functions have been included in the separate evaluation module to streamline model tuning and validation.

see the Quickstart section below to get started with the basic functionality
see the /examples folder for more in-depth jupyter notebook walkthroughs with several popular open-source data sets
see the Online Documentation for more comprehensive documentation on the main model class and separate evaluation module
see the Medium Article for contextual motivation and a detailed mathematical description of the algorithm

Dependencies

Python 3.6+
numpy >= 1.15
pandas >= 0.24

Installation

Prerequisites

To install RankFM's C extensions you will need the GNU Compiler Collection (GCC). Check to see whether you already have it installed:

gcc --version

If you don't have it already you can easily install it using Homebrew on OSX or your default linux package manager:

# OSX
brew install gcc

# linux (yum shown here; use apt, dnf, or your distribution's own package manager if it differs)
sudo yum install gcc

# ensure [gcc] has been installed correctly and is on the system PATH
gcc --version

Package Installation

You can install the latest published version from PyPI using pip:

pip install rankfm

Or alternatively install the current development build directly from GitHub:

pip install git+https://github.com/etlundquist/rankfm.git#egg=rankfm

It's highly recommended that you use an Anaconda base environment to ensure that all core numpy C extensions and linear algebra libraries have been installed and configured correctly. Anaconda: it just works.

Quickstart

Let's work through a simple example of fitting a model, generating recommendations, evaluating performance, and assessing some item-item similarities. The data we'll be using here may already be somewhat familiar: you know it, you love it, it's the MovieLens 1M!

Let's first look at the required shape of the interaction data:

user_id	item_id
3	233
5	377
8	610

It has just two columns: a user_id and an item_id (you can name these fields whatever you want or use a numpy array instead). Notice that there is no rating column - this library is for implicit feedback data (e.g. watches, page views, purchases, clicks) as opposed to explicit feedback data (e.g. 1-5 ratings, thumbs up/down). Implicit feedback is far more common in real-world recommendation contexts and doesn't suffer from the missing-not-at-random problem of pure explicit feedback approaches.
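The training and validation sets used in the rest of this walkthrough have to come from somewhere. One possible way to produce them is a simple random hold-out split, sketched here in plain Python. The function name `split_interactions`, the 25% validation fraction, and the sample pairs are assumptions for illustration, not part of the library.

```python
import random


def split_interactions(interactions, valid_fraction=0.25, seed=0):
    """Shuffle (user_id, item_id) pairs and hold out a fraction for validation."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_valid = int(len(shuffled) * valid_fraction)
    # first n_valid shuffled pairs become validation, the rest training
    return shuffled[n_valid:], shuffled[:n_valid]


# hypothetical implicit-feedback pairs: no rating column, just (user_id, item_id)
interactions = [(3, 233), (5, 377), (8, 610), (3, 34),
                (5, 608), (8, 589), (3, 1), (5, 590)]

train_pairs, valid_pairs = split_interactions(interactions)
```

Since the model accepts pd.DataFrame or np.ndarray inputs, the resulting lists of pairs would need to be wrapped in one of those types before being passed to fit().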

Now let's import the library, initialize our model, and fit on the training data:

from rankfm.rankfm import RankFM
model = RankFM(factors=20, loss='warp', max_samples=20, alpha=0.01, sigma=0.1, learning_rate=0.1, learning_schedule='invscaling')
model.fit(interactions_train, epochs=20, verbose=True)
# NOTE: this takes about 30 seconds for 750,000 interactions on my 2.3 GHz i5 8GB RAM MacBook

If you set verbose=True the model will print the current epoch number as well as the epoch's log-likelihood during training. This can be useful to gauge both computational speed and training gains by epoch. If the log-likelihood is not increasing, try raising the learning_rate or lowering the (alpha, beta) regularization strength terms. If the log-likelihood starts to bounce up and down, try lowering the learning_rate or using learning_schedule='invscaling' to decrease the learning rate over time. If you run into overflow errors, decrease the feature and/or sample-weight magnitudes and try raising beta, especially if you have a small number of dense user-features and/or item-features. Selecting BPR loss will lead to faster training times, but WARP loss typically yields superior model performance.
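To make the tuning advice a little more concrete, here is a rough sketch of the two quantities involved: a BPR-style pairwise log-likelihood and an inverse-scaling learning rate. It uses the standard textbook forms rather than RankFM's Cython code. The function names and the decay exponent of 0.25 are assumptions for illustration.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_log_likelihood(score_pairs):
    """Sum of log-sigmoid(positive score - negative score) over sampled pairs."""
    return sum(log_sigmoid(pos - neg) for pos, neg in score_pairs)


def inverse_scaling_rate(initial_rate, epoch, exponent=0.25):
    """Learning rate decayed as initial_rate / (epoch + 1) ** exponent."""
    return initial_rate / (epoch + 1) ** exponent


# observed items scored above unobserved ones -> log-likelihood closer to zero
well_ranked = bpr_log_likelihood([(2.0, -1.0), (1.5, 0.0)])
# observed items scored below unobserved ones -> much more negative
badly_ranked = bpr_log_likelihood([(-1.0, 2.0), (0.0, 1.5)])

# the step size shrinks every epoch, which can damp late-training oscillation
rates = [inverse_scaling_rate(0.1, epoch) for epoch in range(3)]
```

The log-likelihood is always negative and moves toward zero as observed items are ranked above unobserved ones, so an increasing value across epochs is the signal to look for.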

Now let's generate some user-item model scores from the validation data:

valid_scores = model.predict(interactions_valid, cold_start='nan')

This will produce an array of real-valued model scores generated using the Factorization Machines model equation. You can interpret each score as a measure of the predicted utility of item (i) for user (u). The cold_start='nan' option can be used to set scores to np.nan for user/item pairs not found in the training data, or cold_start='drop' can be specified to drop those pairs so the results contain no missing values.
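For readers unfamiliar with the Factorization Machines model equation, the generic second-order form is sketched below in plain Python. This is the textbook equation, not necessarily the exact parameterization RankFM uses internally. The function name `fm_score` and all weights are made up for illustration.

```python
def fm_score(x, w0, w, v):
    """Second-order Factorization Machines score for one feature vector.

    x: feature values, w0: global bias, w: per-feature linear weights,
    v: per-feature latent factor vectors (len(x) rows, k columns).
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    n_factors = len(v[0])
    pairwise = 0.0
    for f in range(n_factors):
        # identity: sum_{i<j} v_if*v_jf*x_i*x_j
        #           = 0.5 * ((sum_i v_if*x_i)^2 - sum_i (v_if*x_i)^2)
        sum_vx = sum(v_i[f] * x_i for v_i, x_i in zip(v, x))
        sum_vx_sq = sum((v_i[f] * x_i) ** 2 for v_i, x_i in zip(v, x))
        pairwise += 0.5 * (sum_vx ** 2 - sum_vx_sq)
    return w0 + linear + pairwise


# hypothetical one-hot encoding: features 0-1 are users, features 2-3 are items
x = [1, 0, 1, 0]                      # user 0 paired with item 0
w = [0.1, 0.2, 0.3, 0.4]              # made-up linear weights
v = [[1.0, 2.0], [0.0, 0.0],          # made-up 2-factor embeddings
     [0.5, -1.0], [0.0, 0.0]]

score = fm_score(x, w0=0.0, w=w, v=v)
```

With one-hot user and item indicators, the pairwise term reduces to the dot product of the user and item factor vectors, so here the score is 0.1 + 0.3 + (1.0 * 0.5 + 2.0 * -1.0) = -1.1.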

Now let's generate our top-N recommended movies for each user:

valid_recs = model.recommend(valid_users, n_items=10, filter_previous=True, cold_start='drop')

The input should be a pd.Series, np.ndarray or list of user_id values. You can use filter_previous=True to prevent generating recommendations that include any items observed by the user in the training data, which could be useful depending on your application context. The result will be a pd.DataFrame where user_id values will be the index and the rows will be each user's top recommended items in descending order (best item is in column 0):

user_id	0	1	2	3	4	5	6	7	8	9
3	2396	1265	357	34	2858	3175	1	2028	17	356
5	608	1617	1610	3418	590	474	858	377	924	1036
8	589	1036	2571	2028	2000	1220	1197	110	780	1954
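The effect of filter_previous can be illustrated with a small plain-Python sketch of top-N selection for a single user. It is not the library's implementation. The helper `top_n_items` and the scores are hypothetical.

```python
import heapq


def top_n_items(item_scores, seen_items, n_items=10, filter_previous=True):
    """Return the n highest-scoring item ids, best first."""
    candidates = (
        (item, score) for item, score in item_scores.items()
        # optionally skip items the user already interacted with in training
        if not (filter_previous and item in seen_items)
    )
    best = heapq.nlargest(n_items, candidates, key=lambda pair: pair[1])
    return [item for item, _ in best]


# made-up model scores for one user, keyed by item_id
item_scores = {356: 2.1, 2396: 3.4, 1265: 2.9, 34: 1.7, 357: 2.5}
seen_items = {2396}  # observed for this user in the training data

recs_filtered = top_n_items(item_scores, seen_items, n_items=3)
recs_unfiltered = top_n_items(item_scores, seen_items, n_items=3,
                              filter_previous=False)
```

With filtering on, the already-seen item 2396 drops out and the next best item takes the last slot. With filtering off, it stays in first position.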

Now let's see how the model is performing with respect to the included validation metrics, evaluated on the hold-out data:

from rankfm.evaluation import hit_rate, reciprocal_rank, discounted_cumulative_gain, precision, recall

valid_hit_rate = hit_rate(model, interactions_valid, k=10)
valid_reciprocal_rank = reciprocal_rank(model, interactions_valid, k=10)
valid_dcg = discounted_cumulative_gain(model, interactions_valid, k=10)
valid_precision = precision(model, interactions_valid, k=10)
valid_recall = recall(model, interactions_valid, k=10)

hit_rate: 0.796
reciprocal_rank: 0.339
dcg: 0.734
precision: 0.159
recall: 0.077

That's a Bingo!
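For intuition about what these five numbers measure, here is a plain-Python sketch using the common textbook definitions of each metric for a single user, averaged over users afterwards. The exact definitions in rankfm.evaluation may differ in detail. The names `ranking_metrics` and `mean_metrics` and the sample data are hypothetical.

```python
import math


def ranking_metrics(recommended, relevant, k=10):
    """Top-k ranking metrics for one user.

    recommended: item ids ordered best first, relevant: set of held-out items.
    """
    hits = [1 if item in relevant else 0 for item in recommended[:k]]
    n_hits = sum(hits)
    # 1-based rank of the first relevant item, or None if there is no hit
    first_hit = next((rank for rank, hit in enumerate(hits, start=1) if hit), None)
    return {
        "hit_rate": 1.0 if n_hits else 0.0,
        "reciprocal_rank": 1.0 / first_hit if first_hit else 0.0,
        "dcg": sum(hit / math.log2(rank + 1)
                   for rank, hit in enumerate(hits, start=1)),
        "precision": n_hits / k,
        "recall": n_hits / len(relevant) if relevant else 0.0,
    }


def mean_metrics(recs_by_user, relevant_by_user, k=10):
    """Average per-user metrics (assumes at least one user appears in both)."""
    per_user = [ranking_metrics(recs_by_user[user], relevant_by_user[user], k)
                for user in relevant_by_user if user in recs_by_user]
    return {name: sum(m[name] for m in per_user) / len(per_user)
            for name in per_user[0]}


# made-up recommendations and held-out items for two users
recs_by_user = {3: [10, 20, 30, 40], 5: [1, 2, 3, 4]}
relevant_by_user = {3: {20, 40, 50}, 5: {9}}

user_3 = ranking_metrics(recs_by_user[3], relevant_by_user[3], k=4)
overall = mean_metrics(recs_by_user, relevant_by_user, k=4)
```

User 3 has hits at ranks 2 and 4, giving a reciprocal rank of 0.5, precision of 2/4 and recall of 2/3. User 5 has no hits, so every averaged metric is halved.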

Now let's find the most similar other movies for a few movies based on their embedding representations in latent factor space:

# Terminator 2: Judgment Day (1991)
model.similar_items(589, n_items=10)

2571                       Matrix, The (1999)
1527                Fifth Element, The (1997)
2916                      Total Recall (1990)
3527                          Predator (1987)
780             Independence Day (ID4) (1996)
1909    X-Files: Fight the Future, The (1998)
733                          Rock, The (1996)
1376     Star Trek IV: The Voyage Home (1986)
480                      Jurassic Park (1993)
1200                            Aliens (1986)

I hope you like explosions...

# Being John Malkovich (1999)
model.similar_items(2997, n_items=10)

2599           Election (1999)
3174    Man on the Moon (1999)
2858    American Beauty (1999)
3317        Wonder Boys (2000)
223              Clerks (1994)
3897      Almost Famous (2000)
2395           Rushmore (1998)
2502       Office Space (1999)
2908     Boys Don't Cry (1999)
3481      High Fidelity (2000)

Let's get weird...
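As a rough picture of how similarity in latent factor space can work, the sketch below ranks items by cosine similarity between embedding vectors. Cosine similarity is an assumption here, since the library may use a different measure such as a raw dot product. The helper names and the two-factor toy embeddings are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(target_id, embeddings, n_items=10):
    """Return the ids of the n items closest to target_id, most similar first."""
    target = embeddings[target_id]
    scored = [(other_id, cosine_similarity(target, vector))
              for other_id, vector in embeddings.items()
              if other_id != target_id]  # never return the query item itself
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other_id for other_id, _ in scored[:n_items]]


# invented 2-factor embeddings keyed by item_id
item_embeddings = {
    589: [0.9, 0.1],
    2571: [0.8, 0.2],
    2997: [0.1, 0.9],
    2599: [0.2, 0.8],
}

similar_to_589 = most_similar(589, item_embeddings, n_items=2)
similar_to_2997 = most_similar(2997, item_embeddings, n_items=1)
```

Items whose vectors point in nearly the same direction come out on top, which is the sense in which the lists above group action blockbusters together and offbeat comedies together.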


Stars: ✭ 71
