
wuliwei9278 / SSE-PT

Licence: other
Code and datasets for the RecSys'20 paper "SSE-PT: Sequential Recommendation Via Personalized Transformer" and the NeurIPS'19 paper "Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers"

Programming Languages

python

Projects that are alternatives of or similar to SSE-PT

SIGIR2021 Conure
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-77.67%)
Mutual labels:  transformer, recommender-system
WWW2020-grec
Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation
Stars: ✭ 17 (-83.5%)
Mutual labels:  recommender-system, sequential
Yin
The efficient and elegant JSON:API 1.1 server library for PHP
Stars: ✭ 214 (+107.77%)
Mutual labels:  transformer
deep-learning-notes
🧠👨‍💻Deep Learning Specialization • Lecture Notes • Lab Assignments
Stars: ✭ 20 (-80.58%)
Mutual labels:  regularization
Gpt2 Newstitle
Chinese news-title generation project based on GPT2, with very detailed code comments.
Stars: ✭ 235 (+128.16%)
Mutual labels:  transformer
Self Attention Cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Stars: ✭ 209 (+102.91%)
Mutual labels:  transformer
Bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Stars: ✭ 3,443 (+3242.72%)
Mutual labels:  transformer
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (+100%)
Mutual labels:  transformer
SparseRegression.jl
Statistical Models with Regularization in Pure Julia
Stars: ✭ 37 (-64.08%)
Mutual labels:  regularization
Posthtml
PostHTML is a tool to transform HTML/XML with JS plugins
Stars: ✭ 2,737 (+2557.28%)
Mutual labels:  transformer
Ner Bert Pytorch
PyTorch solution of named entity recognition task Using Google AI's pre-trained BERT model.
Stars: ✭ 249 (+141.75%)
Mutual labels:  transformer
Torchnlp
Easy to use NLP library built on PyTorch and TorchText
Stars: ✭ 233 (+126.21%)
Mutual labels:  transformer
Multigraph transformer
transformer, multi-graph transformer, graph, graph classification, sketch recognition, sketch classification, free-hand sketch, official code of the paper "Multi-Graph Transformer for Free-Hand Sketch Recognition"
Stars: ✭ 231 (+124.27%)
Mutual labels:  transformer
Insight
Repository for Project Insight: NLP as a Service
Stars: ✭ 246 (+138.83%)
Mutual labels:  transformer
Paddlenlp
NLP Core Library and Model Zoo based on PaddlePaddle 2.0
Stars: ✭ 212 (+105.83%)
Mutual labels:  transformer
GNN-Recommendation
Graduation project: heterogeneous graph representation learning and recommendation algorithms based on graph neural networks.
Stars: ✭ 52 (-49.51%)
Mutual labels:  recommender-system
Sttn
[ECCV'2020] STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting
Stars: ✭ 211 (+104.85%)
Mutual labels:  transformer
Jddc solution 4th
4th-place solution in the 2018 JDDC competition.
Stars: ✭ 235 (+128.16%)
Mutual labels:  transformer
Relational Rnn Pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
Stars: ✭ 236 (+129.13%)
Mutual labels:  transformer
VT-UNet
[MICCAI2022] This is an official PyTorch implementation for A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation
Stars: ✭ 151 (+46.6%)
Mutual labels:  transformer

SSE-PT: Temporal Collaborative Ranking Via Personalized Transformer

We implement our code in TensorFlow, and the code is tested on a server with a 40-core Intel Xeon E5-2630 v4 @ 2.20GHz CPU, 256 GB RAM, and Nvidia GTX 1080 GPUs (TensorFlow 1.13, Python 3).
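To reproduce a matching environment, commands along these lines should work (the exact package name and minor version are our assumption; the repo only states TensorFlow 1.13 and Python 3):

pip3 install tensorflow-gpu==1.13.1
python3 -c "import tensorflow as tf; print(tf.__version__)"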

Datasets

The preprocessed datasets are in the data directory (e.g. data/ml1m.txt). Each line of the txt file contains a user id and an item id, both indexed consecutively from 1, and represents one interaction between that user and that item. For each user, the interactions are sorted by timestamp.
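For illustration, a minimal way to read such a file into per-user interaction sequences is sketched below (the function name and the whitespace splitting are our assumptions, not the repo's utility code):

from collections import defaultdict

# each line is "<user_id> <item_id>", e.g. "1 1193"
def load_sequences(path="data/ml1m.txt"):
    user_seqs = defaultdict(list)
    with open(path) as f:
        for line in f:
            u, i = map(int, line.split())
            user_seqs[u].append(i)  # interactions are already sorted by timestamp
    return user_seqs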

Papers

Our paper was accepted to the ACM Recommender Systems Conference 2020 and selected as a Best Long Paper Candidate (https://dl.acm.org/doi/10.1145/3383313.3412258). A pre-print is available on arXiv, and the earlier ICLR submission can be found at https://openreview.net/forum?id=HkeuD34KPH. You can cite either of the entries below:

@inproceedings{wu2020sse,
  title={SSE-PT: Sequential Recommendation Via Personalized Transformer},
  author={Wu, Liwei and Li, Shuqing and Hsieh, Cho-Jui and Sharpnack, James},
  booktitle={Fourteenth ACM Conference on Recommender Systems},
  pages={328--337},
  year={2020}
}

or

@misc{wu2020ssept,
  title={{SSE}-{PT}: Sequential Recommendation Via Personalized Transformer},
  author={Liwei Wu and Shuqing Li and Cho-Jui Hsieh and James Sharpnack},
  year={2020},
  url={https://openreview.net/forum?id=HkeuD34KPH}
}

It is worth noting that a new regularization technique called SSE is used. For more details, please refer to the paper below: Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers. The paper was accepted to NeurIPS 2019 and presented in Vancouver, Canada. A separate git repo for SSE is at https://github.com/wuliwei9278/SSE. A short sketch of the idea follows the citation below.

@article{wu2019stochastic,
  title={Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers},
  author={Wu, Liwei and Li, Shuqing and Hsieh, Cho-Jui and Sharpnack, James},
  journal={arXiv preprint arXiv:1905.10630},
  year={2019}
}
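Roughly, SSE regularizes embedding layers by stochastically replacing embedding indices with other indices during training, so that nearby embeddings share gradient information. A minimal NumPy sketch of the uniform variant applied to a batch of user or item indices might look like this (the function name and implementation are illustrative, not the repo's code; see the SSE repo and paper for the actual method):

import numpy as np

def sse_replace(indices, num_ids, replace_prob, rng=np.random):
    # with probability `replace_prob`, swap an index for one sampled
    # uniformly from {1, ..., num_ids} before the embedding lookup
    indices = np.asarray(indices).copy()
    mask = rng.rand(*indices.shape) < replace_prob
    indices[mask] = rng.randint(1, num_ids + 1, size=mask.sum())  # ids are 1-indexed
    return indices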

Options

The training of the SSE-PT model is handled by the main.py script, which provides the command-line arguments listed below (a sketch of the corresponding argument parser follows the list).

--dataset            STR           Name of dataset.               Default is "ml1m".
--train_dir          STR           Train directory.               Default is "default".
--batch_size         INT           Batch size.                    Default is 128.    
--lr                 FLOAT         Learning rate.                 Default is 0.001.
--maxlen             INT           Maximum length of sequence.    Default is 50.
--user_hidden_units  INT           Hidden units of user.          Default is 50.
--item_hidden_units  INT           Hidden units of item.          Default is 50.
--num_blocks         INT           Number of blocks.              Default is 2.
--num_epochs         INT           Number of epochs to run.       Default is 2001.
--num_heads          INT           Number of heads.               Default is 1.
--dropout_rate       FLOAT         Dropout rate value.            Default is 0.5.
--threshold_user     FLOAT         SSE probability of user.       Default is 1.0.
--threshold_item     FLOAT         SSE probability of item.       Default is 1.0.
--l2_emb             FLOAT         L2 regularization value.       Default is 0.0.
--gpu                INT           Index of GPU to use.           Default is 0.
--print_freq         INT           Print frequency of evaluation. Default is 10.
--k                  INT           Top k for NDCG and Hits.       Default is 10.
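For reference, an argument parser exposing these flags and defaults could be defined as in the sketch below; it mirrors the table above rather than being a verbatim copy of main.py:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--dataset', default='ml1m')
parser.add_argument('--train_dir', default='default')
parser.add_argument('--batch_size', default=128, type=int)
parser.add_argument('--lr', default=0.001, type=float)
parser.add_argument('--maxlen', default=50, type=int)
parser.add_argument('--user_hidden_units', default=50, type=int)
parser.add_argument('--item_hidden_units', default=50, type=int)
parser.add_argument('--num_blocks', default=2, type=int)
parser.add_argument('--num_epochs', default=2001, type=int)
parser.add_argument('--num_heads', default=1, type=int)
parser.add_argument('--dropout_rate', default=0.5, type=float)
parser.add_argument('--threshold_user', default=1.0, type=float)
parser.add_argument('--threshold_item', default=1.0, type=float)
parser.add_argument('--l2_emb', default=0.0, type=float)
parser.add_argument('--gpu', default=0, type=int)
parser.add_argument('--print_freq', default=10, type=int)
parser.add_argument('--k', default=10, type=int)
args = parser.parse_args()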

Commands

To train our model on the default ml1m data with default parameters:

python3 main.py

To train an SSE-PT model on ml1m data with a maximum sequence length of 200, a dropout rate of 0.2, an SSE probability of 0.92 on the user side, and an SSE probability of 0.1 on the item side (note that each --threshold_* flag equals one minus the corresponding SSE probability):

python3 main.py --maxlen=200 --dropout_rate 0.2 --threshold_user 0.08 --threshold_item 0.9

Results

The following plot shows NDCG@10 versus training time (in seconds) for SASRec, SSE-PT, and SSE-PT++; our proposed SSE-PT and SSE-PT++ outperform SASRec.
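For readers unfamiliar with these metrics, NDCG@k and Hit@k for a single held-out item can be computed as in this generic sketch (not the repo's evaluation code):

import numpy as np

def ndcg_and_hit_at_k(rank, k=10):
    # rank: 0-based position of the held-out item in the ranked candidate list;
    # returns (NDCG@k, Hit@k) for one test example, averaged over users in practice
    if rank < k:
        return 1.0 / np.log2(rank + 2), 1.0
    return 0.0, 0.0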

Acknowledgements

Our code is based on SASRec.
