
A recommender system library under development in PyTorch. Algorithms: KNN, LFM, SLIM, NeuMF, FM, DeepFM, VAE, and so on. It aims at fair comparison for recommender system benchmarks.

Stars: ✭ 280 (+418.52%)

Mutual labels: recommender-system, collaborative-filtering

Recommender-Systems-with-Collaborative-Filtering-and-Deep-Learning-Techniques

Implements user-based and item-based recommendation systems along with state-of-the-art deep learning techniques

Stars: ✭ 41 (-24.07%)

Mutual labels: collaborative-filtering, recommender-system

Neural graph collaborative filtering

Neural Graph Collaborative Filtering, SIGIR2019

Stars: ✭ 517 (+857.41%)

Mutual labels: recommender-system, collaborative-filtering

recsys spark

Spark SQL implementations of ItemCF, UserCF, and Swing; recommender systems, recommendation algorithms, collaborative filtering

Stars: ✭ 76 (+40.74%)

Mutual labels: collaborative-filtering, recommender-system

Recsys2019 deeplearning evaluation

This is the repository of our article published in RecSys 2019 "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and of several follow-up studies.

Stars: ✭ 780 (+1344.44%)

Mutual labels: recommender-system, collaborative-filtering

Awesome-Machine-Learning-Papers

📖Notes and remarks on Machine Learning related papers

Stars: ✭ 35 (-35.19%)

Mutual labels: collaborative-filtering, recommender-system

Rspapers

A Curated List of Must-read Papers on Recommender System.

Stars: ✭ 4,140 (+7566.67%)

Mutual labels: recommender-system, collaborative-filtering

Recommendation Systems Paperlist

Papers about recommendation systems that I am interested in

Stars: ✭ 308 (+470.37%)

Mutual labels: recommender-system, collaborative-filtering

BARS

Towards open benchmarking for recommender systems https://openbenchmark.github.io/BARS

Stars: ✭ 157 (+190.74%)

Mutual labels: collaborative-filtering, recommender-system

Recoder

Large scale training of factorization models for Collaborative Filtering with PyTorch

Stars: ✭ 46 (-14.81%)

Mutual labels: recommender-system, collaborative-filtering

Recommender-System

This code implements and compares collaborative filtering algorithms: prediction algorithms such as neighborhood methods, matrix factorization-based methods (SVD, PMF, SVD++, NMF), and many others.

Stars: ✭ 30 (-44.44%)

Mutual labels: collaborative-filtering, recommender-system

recommender

NReco Recommender is a .NET port of Apache Mahout CF java engine (standalone, non-Hadoop version)

Stars: ✭ 35 (-35.19%)

Mutual labels: collaborative-filtering, recommender-system

BPR MPR

BPR, Bayesian Personalized Ranking (BPR), extremely convenient BPR & Multiple Pairwise Ranking

Stars: ✭ 77 (+42.59%)

Mutual labels: collaborative-filtering, recommender-system

Summary Of Recommender System Papers

A categorized summary of recommender system papers I have read, continuously updated…

Stars: ✭ 288 (+433.33%)

Mutual labels: recommender-system, collaborative-filtering

RecSys PyTorch

PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

Stars: ✭ 125 (+131.48%)

Mutual labels: collaborative-filtering, recommender-system

TIFUKNN

kNN-based next-basket recommendation

Stars: ✭ 38 (-29.63%)

Mutual labels: collaborative-filtering, recommender-system

Cornac

A Comparative Framework for Multimodal Recommender Systems

Stars: ✭ 308 (+470.37%)

Mutual labels: recommender-system, collaborative-filtering

Newsrecommendsystem

A personalized news recommendation system involving collaborative filtering, content-based recommendation, and hot news recommendation. It can be adapted easily for use in other settings.

Stars: ✭ 557 (+931.48%)

Mutual labels: recommender-system, collaborative-filtering


consimilo

A Clojure library for querying large data-sets on similarity

consimilo is a library that uses locality-sensitive hashing (implemented as lsh-forest) and minhashing to support top-k similar item queries. Finding similar items across large data sets is a common problem in many real-world applications (e.g. finding articles from the same source, plagiarism detection, collaborative filtering, context filtering, document similarity). A brute-force search for the top-k similar items compares every pair of items (n choose 2 comparisons), which becomes unwieldy at relatively small corpus sizes. LSH reduces the search space by "hashing" items in such a way that collisions occur as a result of similarity. Once the items are hashed and indexed, the lsh-forest supports a top-k most similar items query in roughly O(log n). This large increase in query speed comes with an accuracy trade-off. More information can be found in chapter 3 of Mining of Massive Datasets.
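The idea behind minhashing can be sketched in plain Clojure. This is a minimal illustration, not consimilo's implementation: the names `hash-fns`, `minhash`, and `estimate-jaccard` are hypothetical, and salting Clojure's built-in `hash` stands in for the permutation functions a real minhash would use. It shows that the fraction of matching signature positions approximates the exact Jaccard similarity of two feature sets.

```clojure
(ns minhash-sketch
  (:require [clojure.set :as set]))

;; Exact Jaccard similarity between two sets: |A ∩ B| / |A ∪ B|.
(defn jaccard [a b]
  (let [union-size (count (set/union a b))]
    (if (zero? union-size)
      1.0
      (/ (double (count (set/intersection a b))) union-size))))

;; A family of n hash functions, each salting Clojure's `hash` with a
;; different integer. Illustrative stand-in for real permutation functions.
(defn hash-fns [n]
  (mapv (fn [salt] (fn [x] (hash [salt x]))) (range n)))

;; Minhash signature: for each hash function, the minimum hash value over
;; all features of the item. Assumes `features` is non-empty.
(defn minhash [fns features]
  (mapv (fn [h] (apply min (map h features))) fns))

;; Estimated Jaccard similarity: fraction of signature positions that agree.
(defn estimate-jaccard [sig-a sig-b]
  (/ (double (count (filter true? (map = sig-a sig-b))))
     (count sig-a)))

;; Compare the exact value with the estimate for two overlapping sets.
(let [fns (hash-fns 128)
      a   #{"minhash" "lsh" "forest" "query"}
      b   #{"minhash" "lsh" "forest" "index"}]
  {:exact     (jaccard a b)
   :estimated (estimate-jaccard (minhash fns a) (minhash fns b))})
```

The estimate gets closer to the exact value as the number of hash functions grows, which is the accuracy trade-off mentioned above.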

Getting Started

Add consimilo as a dependency in your project.clj:

[consimilo "0.1.1"]

The main methods you are likely to need are all located in core.clj. Import it with something like:

(ns my-ns (:require [consimilo.core :as consimilo]))

Building a forest

First you need to load the candidates vector into a forest. This vector can represent any arbitrary information (e.g. tokens in a document, ngrams, metadata about users, content interactions, context surrounding interactions). The candidates vector must be a collection of maps, each representing an item. Each map has an :id key, which is used to reference the minhash vector in the forest, and a :features key, which is a vector containing the individual features:

[{:id id1 :features [feature1 feature2 ... featuren]} ... ]
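As a rough sketch of getting data into that shape, the snippet below turns a map of document ids to text into a candidates vector. The `docs` data, the whitespace `tokenize` function, and the `->candidates` helper are all hypothetical and not part of consimilo; any feature extractor that yields a vector per item would do.

```clojure
(ns candidates-sketch
  (:require [clojure.string :as str]))

;; Hypothetical raw input: document id -> text.
(def docs
  {"doc-1" "the quick brown fox"
   "doc-2" "the lazy brown dog"})

;; Naive whitespace tokenizer used as the feature extractor.
(defn tokenize [text]
  (str/split (str/lower-case (str/trim text)) #"\s+"))

;; Shape each document into {:id ... :features [...]}.
(defn ->candidates [id->text]
  (mapv (fn [[id text]] {:id id :features (tokenize text)}) id->text))

(def candidates-vector (->candidates docs))
```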

Adding feature vectors to a forest

Once your candidates vector is in the correct form, you can add the items to the forest:

(def my-forest (consimilo/add-all-to-forest candidates-vector))           ;;creates new forest, my-forest

You can continue to add to this forest by passing it as the first argument to add-all-to-forest. The forest data structure is stored in an atom, so the existing forest is modified in place.

Note: upon every call to add-all-to-forest an expensive sort function is called to enable O(log n) queries. It is better to add all items to the forest at once or in the case of a live system, add new items to the forest in batches offline and replace the production forest.

(consimilo/add-all-to-forest my-forest new-candidates-vector)             ;;updates my-forest in place
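One way to follow the batching advice in a live system is to rebuild offline and then swap the reference that queries read from. The sketch below is an assumption about how an application might organize this, not something consimilo provides: `production-forest` and `rebuild-and-swap!` are hypothetical names, and `build-fn` stands for any function from a complete candidates vector to a new forest (for example, the one-argument form of add-all-to-forest shown above).

```clojure
;; Holds whichever forest queries are currently served from.
(defonce production-forest (atom nil))

;; Build a fresh forest from the full candidates vector, then replace the
;; production reference in one step. The expensive build (and its sort)
;; happens once per batch, away from the query path.
(defn rebuild-and-swap! [build-fn all-candidates]
  (let [fresh (build-fn all-candidates)]
    (reset! production-forest fresh)
    fresh))
```

Readers that dereference `production-forest` see either the old forest or the new one, never a partially updated one.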

Adding strings and files to a forest (helper functions)

consimilo provides helper functions for constructing feature vectors from strings and files. By default, a new forest is created and stopwords are removed. To add to an existing forest or to keep stopwords, pass the optional parameters :forest and :remove-stopwords?, which default to (new-forest) and true, respectively.

Add a collection of strings to a new forest and remove stopwords:

(def my-forest (consimilo/add-strings-to-forest
                 [{:id id1 :features "my sample string 1"}
                  {:id id2 :features "my sample string 2"}]))

Add a collection of strings to an existing forest and do not remove stopwords:

(consimilo/add-strings-to-forest [{:id id1 :features "my sample string 1"}
                                  {:id id2 :features "my sample string 2"}]
                                 :forest my-forest
                                 :remove-stopwords? false)         ;;updates my-forest in place

Add a collection of files to a new forest and remove stopwords:

(def my-forest (consimilo/add-files-to-forest
                 [FileObj-1 FileObj-2 FileObj-3 FileObj-n]))              ;;creates new forest, my-forest

Note: when calling add-files-to-forest :id is auto-generated from the file name and :features are generated from the tokenized, extracted text. The same optional parameters available for add-strings-to-forest are also available for add-files-to-forest.

Querying a forest

Once you have built your forest, you can query for the k most similar items to feature-vector v by running:

(def results (consimilo/query-forest my-forest k v))

(println (:top-k results)) ;;returns a list of keys ordered by similarity
(println (:query-hash results)) ;;returns the minhash of the query. Utilized to calculate similarity.

Querying a forest with strings and files (helper functions)

consimilo provides helper functions for querying the forest with strings and files. The helper functions query-string and query-file take an optional parameter :remove-stopwords?, which defaults to true (stopwords are removed). Queries against strings and files should be made using the same tokenization scheme used to input items in the forest (stopwords present or removed).
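A small plain-Clojure sketch shows why the tokenization scheme should match. The stopword list and the `tokens` and `jaccard` helpers here are hypothetical and much simpler than whatever consimilo uses internally; the point is only that the same text yields different feature sets, and so a lower similarity, when stopword handling differs between indexing and querying.

```clojure
(ns stopword-sketch
  (:require [clojure.set :as set]
            [clojure.string :as str]))

;; Tiny illustrative stopword list (an assumption, not consimilo's list).
(def stopwords #{"the" "a" "of" "is"})

;; Tokenize on whitespace, optionally dropping stopwords.
(defn tokens [text remove-stopwords?]
  (let [ts (str/split (str/lower-case text) #"\s+")]
    (set (if remove-stopwords? (remove stopwords ts) ts))))

;; Exact Jaccard similarity between two non-empty sets.
(defn jaccard [a b]
  (/ (double (count (set/intersection a b)))
     (count (set/union a b))))

(def text "the history of the roman empire")

;; Indexed with stopwords removed.
(def indexed (tokens text true))

;; Same text, same scheme: 1.0
(def matched (jaccard indexed (tokens text true)))

;; Same text, stopwords kept at query time: 0.6
(def mismatched (jaccard indexed (tokens text false)))
```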

Querying a forest with a string:

(def results (consimilo/query-string my-forest k "my query string"))

(println (:top-k results)) ;;returns a list of keys ordered by similarity
(println (:query-hash results)) ;;returns the minhash of the query. Utilized to calculate similarity.

Querying a forest with a file:

(def results (consimilo/query-file my-forest k Fileobj))

(println (:top-k results)) ;;returns a list of keys ordered by similarity
(println (:query-hash results)) ;;returns the minhash of the query. Utilized to calculate similarity.

Calculating similarity

consimilo provides functions for calculating approximate distance / similarity between the query and top-k results. The function similarity-k accepts an optional :sim-fn parameter to specify which distance / similarity function should be used: :sim-fn :jaccard for Jaccard similarity, :sim-fn :hamming for Hamming distance, and :sim-fn :cosine for cosine distance. similarity-k returns a hash-map whose keys are the top-k ids and whose vals are the similarity / distance. As with the other query functions, queries against strings and files should be made using the same tokenization scheme used to input the items in the forest (stopwords present or removed).

Querying a forest with strings, files, or feature-vectors and calculating similarity

consimilo dispatches to the correct query function based on query type (string, file, or collection of features). Three similarity functions are available: :cosine, :jaccard, and :hamming.

(def similar-items (consimilo/similarity-k 
                     my-forest
                     k
                     query
                     :sim-fn :cosine))

(println similar-items) ;;{id1 (cosine-distance(query id1)) ... idk (cosine-distance (query idk))}
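For intuition about what the three options measure, here is a plain-Clojure sketch that applies each measure to two equal-length numeric vectors, such as minhash signatures. These are textbook definitions under assumed names (`hamming-distance`, `agreement`, `cosine-distance`); how consimilo computes its values internally is not shown in this README and may differ.

```clojure
;; Hamming distance: number of positions where the two vectors differ.
(defn hamming-distance [u v]
  (count (filter false? (map = u v))))

;; Jaccard estimate from signatures: fraction of positions that agree.
(defn agreement [u v]
  (/ (double (count (filter true? (map = u v))))
     (count u)))

;; Cosine distance: 1 - (u . v) / (|u| |v|). Values are coerced to double
;; so that large hash values cannot overflow integer arithmetic.
(defn cosine-distance [u v]
  (let [dot  (fn [x y] (reduce + (map #(* (double %1) (double %2)) x y)))
        norm (fn [x] (Math/sqrt (dot x x)))]
    (- 1.0 (/ (dot u v) (* (norm u) (norm v))))))

(let [u [1 2 3 4]
      v [1 2 0 4]]
  {:hamming   (hamming-distance u v)
   :agreement (agreement u v)
   :cosine    (cosine-distance u v)})
```

Note that Jaccard is a similarity (higher means closer), while Hamming and cosine are distances (lower means closer).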

Saving and loading forests

consimilo uses Nippy to provide simple, robust, serialization / deserialization of your forests.

Serialize and save a forest to a file:

(consimilo/freeze-forest my-forest "/tmp/my-saved-forest")

Load a forest from a file:

(def my-forest (consimilo/thaw-forest "/tmp/my-saved-forest"))

Configuration

consimilo uses config to manage configuration. consimilo has three configurable options:

Number of trees in the forest (default 8): :trees
Number of permutation functions used to build the minhash (default 128): :perms
Random number seed used to generate minhash functions (default 1): :seed

The defaults should work well in most cases; however, they may be overridden by placing a config.edn file in the resources directory of your project. See config.edn.
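As an illustration, an override file that simply restates the defaults might look like the fragment below. The layout is an assumption (a single top-level map keyed by the three option names listed above); check the project's own config.edn for the exact structure before relying on it.

```clojure
;; resources/config.edn (assumed layout: one top-level map)
{:trees 8     ; number of trees in the forest
 :perms 128   ; number of permutation functions used to build the minhash
 :seed  1}    ; random number seed used to generate minhash functions
```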

Contributions / Issues

Please use the project's GitHub issues page for questions, ideas, etc. Pull requests are welcome.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Stars: ✭ 54
