yym-ustc / FactorizationMachine

Licence: other
An implementation of a factorization machine that supports classification.

Programming languages: C++, Makefile

Table of Contents
=================

- What is LIBFM
- Installation
- Data Format
- Command Line Usage
- Examples
- OpenMP and SSE
- Building Windows Binaries
- FAQ


What is LIBFM
==============

This implementation of a factorization machine is based on LibFFM (by Quqin Ruan). We changed the data structure that holds the model parameters so that it works well with several optimization methods (Adagrad, FTRL, etc.).
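
For orientation, the standard second-order factorization machine scores an instance x as w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j. The sketch below evaluates that formula in time linear in the number of non-zero features, using the usual sum-of-squares identity. It illustrates the general model only and is not taken from this project's source; the names `Node', `FmModel', and `fm_score' are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not taken from this project's source.
struct Node {
    int feature;  // feature index
    float value;  // feature value
};

struct FmModel {
    int k = 0;             // number of latent factors
    float w0 = 0.0f;       // global bias
    std::vector<float> w;  // linear weights, one per feature
    std::vector<float> v;  // latent factors, row-major: v[feature * k + f]
};

// Standard second-order FM score:
//   w0 + sum_i w_i x_i + 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
inline float fm_score(const FmModel& m, const std::vector<Node>& x) {
    float score = m.w0;
    for (const Node& n : x) {
        score += m.w[n.feature] * n.value;  // linear term
    }
    for (int f = 0; f < m.k; ++f) {
        float sum = 0.0f;
        float sum_sq = 0.0f;
        for (const Node& n : x) {
            const float t = m.v[static_cast<std::size_t>(n.feature) * m.k + f] * n.value;
            sum += t;
            sum_sq += t * t;
        }
        score += 0.5f * (sum * sum - sum_sq);  // pairwise interactions for factor f
    }
    return score;
}
```

For classification, such a score would typically be passed through a logistic function to obtain a probability.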

Installation
============

Requirements: a C++11-compatible compiler. We also use OpenMP to provide multi-threading. If OpenMP is not
available on your platform, please refer to section `OpenMP and SSE.'

- Unix-like systems:
  Type `make' in the command line.

- Windows:
  See `Building Windows Binaries' to compile.



Data Format
===========

The data format of LIBFM is:

<label> <feature1>:<value1> <feature2>:<value2> ...

Each `feature' must be a non-negative integer.
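
As an illustration of the format above, here is a minimal sketch of a parser for one line. It is not the project's own reader; `Node', `Instance', and `parse_line' are hypothetical names, and rejecting malformed tokens outright is an assumption about the desired behavior.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
struct Node {
    int feature;
    float value;
};

struct Instance {
    float label;
    std::vector<Node> nodes;
};

// Parses "<label> <feature1>:<value1> <feature2>:<value2> ...".
// Returns false if the label is missing, a token lacks a ':' separator,
// a feature is not a non-negative integer, or a value is not a number.
inline bool parse_line(const std::string& line, Instance& out) {
    std::istringstream in(line);
    if (!(in >> out.label)) return false;
    out.nodes.clear();

    std::string token;
    while (in >> token) {
        const std::size_t colon = token.find(':');
        if (colon == std::string::npos || colon == 0 || colon + 1 == token.size()) {
            return false;
        }
        try {
            std::size_t used = 0;
            const int feature = std::stoi(token.substr(0, colon), &used);
            if (used != colon || feature < 0) return false;

            const std::string value_text = token.substr(colon + 1);
            const float value = std::stof(value_text, &used);
            if (used != value_text.size()) return false;

            out.nodes.push_back({feature, value});
        } catch (const std::exception&) {
            return false;  // std::stoi / std::stof could not convert the text
        }
    }
    return true;
}
```

For example, the line `1 0:0.5 3:2 7:1' would parse to label 1 with three feature/value pairs, while `1 -2:0.5' would be rejected because the feature index is negative.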


Command Line Usage
==================

-   `fm-train'

    usage: fm-train [options] training_set_file [model_file]

    options:
    -l <lambda>: set regularization parameter (default 0.00002)
    -k <factor>: set number of latent factors (default 4)
    -t <iteration>: set number of iterations (default 15)
    -r <eta>: set learning rate (default 0.2)
    -s <nr_threads>: set number of threads (default 1)
    -p <path>: set path to the validation set
    --quiet: quiet mode (no output)
    --no-norm: disable instance-wise normalization
    --auto-stop: stop at the iteration that achieves the best validation loss (must be used with -p)

    By default we do instance-wise normalization. That is, we normalize the 2-norm of each instance to 1. You can use
    `--no-norm' to disable this function.
    
    A binary file `training_set_file.bin' will be generated to store the data in binary format.

-   `fm-predict'

    usage: fm-predict test_file model_file output_file
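
The instance-wise normalization described under `fm-train' can be sketched as follows: every value of an instance is divided by the instance's 2-norm, so the scaled instance has norm 1. This is only one way to realize the description; the actual code may fold the scaling into the training loop instead. `Node' and `normalize_instance' are hypothetical names.

```cpp
#include <cmath>
#include <vector>

// Hypothetical type for illustration only.
struct Node {
    int feature;
    float value;
};

// Scales the values of one instance in place so that its 2-norm becomes 1.
// An instance whose values are all zero is left unchanged.
inline void normalize_instance(std::vector<Node>& x) {
    double squared_norm = 0.0;
    for (const Node& n : x) {
        squared_norm += static_cast<double>(n.value) * n.value;
    }
    if (squared_norm == 0.0) return;

    const float scale = static_cast<float>(1.0 / std::sqrt(squared_norm));
    for (Node& n : x) {
        n.value *= scale;
    }
}
```

With `--no-norm', a step like this would simply be skipped and the raw values used as given.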



Examples
========

> ./fm-train -p validate_data train_data model

train a model using the default parameters


> ./fm-predict predict_data model predict_ans

do prediction


> ./fm-train -l 0.0001 -k 15 -t 30 -r 0.05 -s 4 --auto-stop -p validate_data train_data model

train a model using the following parameters:

    regularization cost = 0.0001
    latent factors = 15
    iterations = 30
    learning rate = 0.05
    threads = 4
    let it auto-stop
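
To make the `--auto-stop' behavior in the last example concrete, here is a sketch of one plausible stopping rule: training ends as soon as the validation loss gets worse, and the model from the previous iteration is kept. Whether this project uses exactly this rule is an assumption; `auto_stop_iteration' is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: given the validation loss after each iteration, returns
// the 0-based index of the iteration whose model would be kept if training
// stops as soon as the loss gets worse. Returns -1 for an empty history.
inline int auto_stop_iteration(const std::vector<double>& va_loss) {
    if (va_loss.empty()) return -1;
    std::size_t best = 0;
    for (std::size_t i = 1; i < va_loss.size(); ++i) {
        if (va_loss[i] > va_loss[best]) break;  // loss went up: stop, keep `best`
        best = i;
    }
    return static_cast<int>(best);
}
```

Under this rule, the `-t 30' in the example acts as an upper bound on the number of iterations rather than a fixed count.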


OpenMP and SSE
==============

We use OpenMP to do parallelization. If OpenMP is not available on your
platform, then please comment out the following lines in Makefile.

    DFLAG += -DUSEOMP
    CXXFLAGS += -fopenmp

Note: Please run `make clean all' if these flags are changed.

We use SSE instructions to perform fast computation. If you do not want to use them, comment out the following line:

    DFLAG += -DUSESSE

Then, run `make clean all'
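
For readers unfamiliar with such flags, the sketch below shows how a macro like `USEOMP' is typically consumed in C++ source: the OpenMP pragma is compiled in only when the macro is defined, so the same loop runs serially otherwise. This is an illustration of the pattern, not an excerpt from this project; `sum_of_squares' is a hypothetical function.

```cpp
#include <vector>

// Sums the squares of the given values. The loop is parallelized only when
// the code is built with -DUSEOMP (and -fopenmp); otherwise it runs serially.
inline double sum_of_squares(const std::vector<float>& values) {
    double total = 0.0;
    const long n = static_cast<long>(values.size());
#if defined(USEOMP)
#pragma omp parallel for schedule(static) reduction(+ : total)
#endif
    for (long i = 0; i < n; ++i) {
        total += static_cast<double>(values[i]) * values[i];
    }
    return total;
}
```

Because the macro changes which code is compiled, object files built with the old setting must be rebuilt, which is why `make clean all' is needed after changing the flags.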



Building Windows Binaries
=========================

The Windows part is maintained by a different maintainer, so it may not always support the latest version.

The latest version it supports is: v1.21

To build them via command-line tools of Visual C++, use the following steps:

1. Open a DOS command box (or Developer Command Prompt for Visual Studio) and go to the LIBFFM directory. If the environment
variables of VC++ have not been set, type

"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\vcvars64.bat"

You may have to modify the above command according to which version of VC++ you use and
where it is installed.

2. Type

nmake -f Makefile.win clean all


FAQ
===

Q: Why do I get the same model size when k = 1 and k = 4?

A: This is because we use SSE instructions. To use SSE, the memory needs to be aligned. So even if you set k = 1,
   we still fill in dummy zeros from k = 2 to 4.
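
The padding described in this answer can be sketched as rounding k up to the next multiple of 4, the number of floats in one 128-bit SSE register. The exact layout used by this project is not shown here; `aligned_k' and `pad_factors' are hypothetical names.

```cpp
#include <vector>

// Number of 32-bit floats that fit in one 128-bit SSE register.
constexpr int kSseFloats = 4;

// Rounds k up to the next multiple of kSseFloats.
constexpr int aligned_k(int k) {
    return ((k + kSseFloats - 1) / kSseFloats) * kSseFloats;
}

// Returns a copy of one latent-factor vector, zero-padded to the aligned length.
inline std::vector<float> pad_factors(const std::vector<float>& factors) {
    std::vector<float> padded(factors);
    padded.resize(aligned_k(static_cast<int>(factors.size())), 0.0f);
    return padded;
}
```

Under this scheme, k = 1 through k = 4 all store four floats per feature, which is consistent with the identical model sizes, while k = 5 would store eight.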


Q: Why is the logloss slightly different on the same data when I run the program two or more times with multi-threading?

A: When there is more than one thread, the program becomes non-deterministic. To make it deterministic, use only one thread.
