
wuliwei9278 / SSE-PT

Licence: other
Code and datasets for the RecSys'20 paper "SSE-PT: Sequential Recommendation Via Personalized Transformer" and the NeurIPS'19 paper "Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers"

Programming Languages

python

Projects that are alternatives of or similar to SSE-PT

SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-77.67%)
Mutual labels:  transformer, recommender-system
WWW2020-grec
Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation
Stars: ✭ 17 (-83.5%)
Mutual labels:  recommender-system, sequential
Yin
The efficient and elegant JSON:API 1.1 server library for PHP
Stars: ✭ 214 (+107.77%)
Mutual labels:  transformer
deep-learning-notes
🧠👨‍💻Deep Learning Specialization • Lecture Notes • Lab Assignments
Stars: ✭ 20 (-80.58%)
Mutual labels:  regularization
Gpt2 Newstitle
Chinese news-title generation project based on GPT2, with very detailed code comments.
Stars: ✭ 235 (+128.16%)
Mutual labels:  transformer
Self Attention Cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Stars: ✭ 209 (+102.91%)
Mutual labels:  transformer
Bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Stars: ✭ 3,443 (+3242.72%)
Mutual labels:  transformer
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (+100%)
Mutual labels:  transformer
SparseRegression.jl
Statistical Models with Regularization in Pure Julia
Stars: ✭ 37 (-64.08%)
Mutual labels:  regularization
Posthtml
PostHTML is a tool to transform HTML/XML with JS plugins
Stars: ✭ 2,737 (+2557.28%)
Mutual labels:  transformer
Ner Bert Pytorch
PyTorch solution of named entity recognition task Using Google AI's pre-trained BERT model.
Stars: ✭ 249 (+141.75%)
Mutual labels:  transformer
Torchnlp
Easy to use NLP library built on PyTorch and TorchText
Stars: ✭ 233 (+126.21%)
Mutual labels:  transformer
Multigraph transformer
transformer, multi-graph transformer, graph, graph classification, sketch recognition, sketch classification, free-hand sketch, official code of the paper "Multi-Graph Transformer for Free-Hand Sketch Recognition"
Stars: ✭ 231 (+124.27%)
Mutual labels:  transformer
Insight
Repository for Project Insight: NLP as a Service
Stars: ✭ 246 (+138.83%)
Mutual labels:  transformer
Paddlenlp
NLP Core Library and Model Zoo based on PaddlePaddle 2.0
Stars: ✭ 212 (+105.83%)
Mutual labels:  transformer
GNN-Recommendation
Graduation project: heterogeneous graph representation learning and recommendation algorithms based on graph neural networks.
Stars: ✭ 52 (-49.51%)
Mutual labels:  recommender-system
Sttn
[ECCV'2020] STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting
Stars: ✭ 211 (+104.85%)
Mutual labels:  transformer
Jddc solution 4th
4th-place solution in the 2018 JDDC competition.
Stars: ✭ 235 (+128.16%)
Mutual labels:  transformer
Relational Rnn Pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
Stars: ✭ 236 (+129.13%)
Mutual labels:  transformer
VT-UNet
[MICCAI2022] This is an official PyTorch implementation for A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation
Stars: ✭ 151 (+46.6%)
Mutual labels:  transformer

SSE-PT: Temporal Collaborative Ranking Via Personalized Transformer

We implement our code in TensorFlow, and the code is tested on a server with a 40-core Intel Xeon E5-2630 v4 @ 2.20GHz CPU, 256 GB RAM, and Nvidia GTX 1080 GPUs (TensorFlow 1.13, Python 3).
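To reproduce a matching environment, commands along these lines should work (the exact package name and minor version are our assumption; the repo only states TensorFlow 1.13 and Python 3):

pip3 install tensorflow-gpu==1.13.1
python3 -c "import tensorflow as tf; print(tf.__version__)"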

Datasets

The preprocessed datasets are in the data directory (e.g. data/ml1m.txt). Each line of the txt file contains a user id and an item id, both indexed consecutively from 1, and represents one interaction between that user and that item. For each user, the interactions are sorted by timestamp.
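For illustration, a minimal way to read such a file into per-user interaction sequences is sketched below (the function name and the whitespace splitting are our assumptions, not the repo's utility code):

from collections import defaultdict

# each line is "<user_id> <item_id>", e.g. "1 1193"
def load_sequences(path="data/ml1m.txt"):
    user_seqs = defaultdict(list)
    with open(path) as f:
        for line in f:
            u, i = map(int, line.split())
            user_seqs[u].append(i)  # interactions are already sorted by timestamp
    return user_seqs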

Papers

Our paper was accepted to the ACM Recommender Systems Conference 2020 and selected as a Best Long Paper Candidate (https://dl.acm.org/doi/10.1145/3383313.3412258). A pre-print is available on arXiv, and the earlier ICLR submission can be found at https://openreview.net/forum?id=HkeuD34KPH. You can cite either of the entries below:

@inproceedings{wu2020sse,
  title={SSE-PT: Sequential Recommendation Via Personalized Transformer},
  author={Wu, Liwei and Li, Shuqing and Hsieh, Cho-Jui and Sharpnack, James},
  booktitle={Fourteenth ACM Conference on Recommender Systems},
  pages={328--337},
  year={2020}
}

or

@misc{wu2020ssept,
  title={{SSE}-{PT}: Sequential Recommendation Via Personalized Transformer},
  author={Liwei Wu and Shuqing Li and Cho-Jui Hsieh and James Sharpnack},
  year={2020},
  url={https://openreview.net/forum?id=HkeuD34KPH}
}

It is worth noting that a new regularization technique called SSE is used. For more details, please refer to the paper below: Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers. The paper was accepted to NeurIPS 2019 and presented in Vancouver, Canada. A separate git repo for SSE is at https://github.com/wuliwei9278/SSE. A short sketch of the idea follows the citation below.

@article{wu2019stochastic,
  title={Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers},
  author={Wu, Liwei and Li, Shuqing and Hsieh, Cho-Jui and Sharpnack, James},
  journal={arXiv preprint arXiv:1905.10630},
  year={2019}
}
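Roughly, SSE regularizes embedding layers by stochastically replacing embedding indices with other indices during training, so that nearby embeddings share gradient information. A minimal NumPy sketch of the uniform variant applied to a batch of user or item indices might look like this (the function name and implementation are illustrative, not the repo's code; see the SSE repo and paper for the actual method):

import numpy as np

def sse_replace(indices, num_ids, replace_prob, rng=np.random):
    # with probability `replace_prob`, swap an index for one sampled
    # uniformly from {1, ..., num_ids} before the embedding lookup
    indices = np.asarray(indices).copy()
    mask = rng.rand(*indices.shape) < replace_prob
    indices[mask] = rng.randint(1, num_ids + 1, size=mask.sum())  # ids are 1-indexed
    return indices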

Options

The training of the SSE-PT model is handled by the main.py script, which provides the command-line arguments listed below (a sketch of the corresponding argument parser follows the list).

--dataset            STR           Name of dataset.               Default is "ml1m".
--train_dir          STR           Train directory.               Default is "default".
--batch_size         INT           Batch size.                    Default is 128.    
--lr                 FLOAT         Learning rate.                 Default is 0.001.
--maxlen             INT           Maximum length of sequence.    Default is 50.
--user_hidden_units  INT           Hidden units of user.          Default is 50.
--item_hidden_units  INT           Hidden units of item.          Default is 50.
--num_blocks         INT           Number of blocks.              Default is 2.
--num_epochs         INT           Number of epochs to run.       Default is 2001.
--num_heads          INT           Number of heads.               Default is 1.
--dropout_rate       FLOAT         Dropout rate value.            Default is 0.5.
--threshold_user     FLOAT         SSE probability of user.       Default is 1.0.
--threshold_item     FLOAT         SSE probability of item.       Default is 1.0.
--l2_emb             FLOAT         L2 regularization value.       Default is 0.0.
--gpu                INT           Index of GPU to use.           Default is 0.
--print_freq         INT           Print frequency of evaluation. Default is 10.
--k                  INT           Top k for NDCG and Hits.       Default is 10.
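For reference, an argument parser exposing these flags and defaults could be defined as in the sketch below; it mirrors the table above rather than being a verbatim copy of main.py:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--dataset', default='ml1m')
parser.add_argument('--train_dir', default='default')
parser.add_argument('--batch_size', default=128, type=int)
parser.add_argument('--lr', default=0.001, type=float)
parser.add_argument('--maxlen', default=50, type=int)
parser.add_argument('--user_hidden_units', default=50, type=int)
parser.add_argument('--item_hidden_units', default=50, type=int)
parser.add_argument('--num_blocks', default=2, type=int)
parser.add_argument('--num_epochs', default=2001, type=int)
parser.add_argument('--num_heads', default=1, type=int)
parser.add_argument('--dropout_rate', default=0.5, type=float)
parser.add_argument('--threshold_user', default=1.0, type=float)
parser.add_argument('--threshold_item', default=1.0, type=float)
parser.add_argument('--l2_emb', default=0.0, type=float)
parser.add_argument('--gpu', default=0, type=int)
parser.add_argument('--print_freq', default=10, type=int)
parser.add_argument('--k', default=10, type=int)
args = parser.parse_args()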

Commands

To train our model on the default ml1m data with default parameters:

python3 main.py

To train an SSE-PT model on ml1m data with a maximum sequence length of 200, a dropout rate of 0.2, an SSE probability of 0.92 on the user side, and an SSE probability of 0.1 on the item side (note that each --threshold_* flag equals one minus the corresponding SSE probability):

python3 main.py --maxlen=200 --dropout_rate 0.2 --threshold_user 0.08 --threshold_item 0.9

Results

The following plot shows NDCG@10 versus training time (in seconds) for SASRec, SSE-PT, and SSE-PT++; our proposed SSE-PT and SSE-PT++ outperform SASRec.
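For readers unfamiliar with these metrics, NDCG@k and Hit@k for a single held-out item can be computed as in this generic sketch (not the repo's evaluation code):

import numpy as np

def ndcg_and_hit_at_k(rank, k=10):
    # rank: 0-based position of the held-out item in the ranked candidate list;
    # returns (NDCG@k, Hit@k) for one test example, averaged over users in practice
    if rank < k:
        return 1.0 / np.log2(rank + 2), 1.0
    return 0.0, 0.0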

Acknowledgements

Our code is based on SASRec.
