
Projects that are alternatives to or similar to Kdd Multimodalities Recall

Ajax Movie Recommendation System With Sentiment Analysis
Content-Based Recommender System recommends movies similar to the movie user likes and analyses the sentiments on the reviews given by the user for that movie.
Stars: ✭ 127 (+108.2%)
Mutual labels:  jupyter-notebook, recommendation-system
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+272.13%)
Mutual labels:  jupyter-notebook, recommendation-system
Artificial Intelligence Projects
Collection of Artificial Intelligence projects.
Stars: ✭ 152 (+149.18%)
Mutual labels:  jupyter-notebook, recommendation-system
Recommenders
Best Practices on Recommendation Systems
Stars: ✭ 11,818 (+19273.77%)
Mutual labels:  jupyter-notebook, recommendation-system
Reco Gym
Code for reco-gym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising
Stars: ✭ 314 (+414.75%)
Mutual labels:  jupyter-notebook, recommendation-system
Newsrecommender
A news recommendation system tailored for user communities
Stars: ✭ 164 (+168.85%)
Mutual labels:  jupyter-notebook, recommendation-system
Tutorials
AI-related tutorials. Access any of them for free → https://towardsai.net/editorial
Stars: ✭ 204 (+234.43%)
Mutual labels:  jupyter-notebook, recommendation-system
Recsys core
[Movie recommendation system] A movie recommendation system built on a crawled movie-rating dataset, with FM and LR as its core.
Stars: ✭ 245 (+301.64%)
Mutual labels:  jupyter-notebook, recommendation-system
Recsys
Code implementation of Xiang Liang's book Recommender System Practice (《推荐系统实践》)
Stars: ✭ 306 (+401.64%)
Mutual labels:  jupyter-notebook, recommendation-system
Kdd winniethebest
KDD Cup 2020 Challenges for Modern E-Commerce Platform: Multimodalities Recall first place
Stars: ✭ 142 (+132.79%)
Mutual labels:  e-commerce, jupyter-notebook
Recommendation System Practice Notes
Code and reading notes for Recommender System Practice (《推荐系统实践》); read online at https://relph1119.github.io/recommendation-system-practice-notes
Stars: ✭ 22 (-63.93%)
Mutual labels:  jupyter-notebook, recommendation-system
Deep Recommender System
Applications of deep learning in recommender systems, with paper summaries.
Stars: ✭ 657 (+977.05%)
Mutual labels:  jupyter-notebook, recommendation-system
Drugs Recommendation Using Reviews
Analyzing the Drugs Descriptions, conditions, reviews and then recommending it using Deep Learning Models, for each Health Condition of a Patient.
Stars: ✭ 35 (-42.62%)
Mutual labels:  jupyter-notebook, recommendation-system
Pycon Ua 2018
Talk at PyCon UA 2018 (Kharkov, Ukraine)
Stars: ✭ 60 (-1.64%)
Mutual labels:  jupyter-notebook
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-1.64%)
Mutual labels:  jupyter-notebook
Ml Dl Projects
Personal projects using machine learning and deep learning techniques
Stars: ✭ 60 (-1.64%)
Mutual labels:  jupyter-notebook
Data scientist nanodegree
Stars: ✭ 59 (-3.28%)
Mutual labels:  jupyter-notebook
Insightface pytorch
Pytorch0.4.1 codes for InsightFace
Stars: ✭ 1,109 (+1718.03%)
Mutual labels:  jupyter-notebook
Rl Cc
Web-based Reinforcement Learning Control Center
Stars: ✭ 60 (-1.64%)
Mutual labels:  jupyter-notebook
Kaggle challenge live
This is the code for "Kaggle Challenge (LIVE)" by Siraj Raval on Youtube
Stars: ✭ 60 (-1.64%)
Mutual labels:  jupyter-notebook

KDD-Multimodalities-Recall

This is our solution for KDD Cup 2020. We implemented a very neat and simple neural ranking model based on siamese BERT[1] which ranked FIRST among the solo teams and 12th among all teams on the final leaderboard.

Related Project: WSDM-Adhoc-Document-Retrieval

Features

  • An end-to-end system with zero feature engineering.
  • Implemented the model in only 36 lines of code.
  • Performed data cleaning on the dataset according to self-designed rules, removing the abnormal data that had a significant negative impact on model performance.
  • Designed a siamese light-BERT with 4 layers, avoiding up to 83% of unnecessary computation cost for both training and inference compared to the full-precision BERT model.
  • Training on the fly: our model achieves SOTA performance in less than 5 hours on a single V100 GPU.
  • Used a Local-Linear-Embedding-like method to increase nDCG@5 by ~1%.
  • Scores are stable (nearly identical) across validation, testA, and testB.

Our Pipeline

  1. Start Jupyter with jupyter lab or jupyter notebook.
  2. Clean the dataset: open 01PRE.ipynb, change the variable 'filename', and run all cells. In this notebook, we decode the base64-encoded features and use the BERT tokenizer to convert the queries into token-ID lists. We also convert the data to float16 to further reduce both disk and memory usage.
  3. Model training: open 02MODEL.ipynb and run all cells. In this notebook, we remove a sample if its bounding-box count is higher than 40, which is also the maximum bounding-box count in the test set. Then we train a siamese light-BERT with 4 layers on (query, image) pairs sampled uniformly from the training set. An off-the-shelf framework, pytorch-lightning, is used.
  4. Inference: open 03INFER.ipynb. In this notebook, we rewrite the InferSet class to implement our Local-Linear-Embedding-like method. The inference code is almost the same as the validation code, so we think it is unnecessary to provide our messy version :).
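The decoding and filtering described in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the repository's actual code: the 2048-dimensional feature size and the helper names are assumptions.

```python
import base64

import numpy as np


def decode_features(b64_str: str, dim: int = 2048) -> np.ndarray:
    """Decode a base64-encoded float32 feature buffer into an
    (n_boxes, dim) array, then cast to float16 to halve storage.
    The 2048-dim feature size is an assumption."""
    raw = np.frombuffer(base64.b64decode(b64_str), dtype=np.float32)
    return raw.reshape(-1, dim).astype(np.float16)


def keep_sample(num_boxes: int, max_boxes: int = 40) -> bool:
    """Drop samples whose bounding-box count exceeds the test-set
    maximum of 40, as described for 02MODEL.ipynb."""
    return num_boxes <= max_boxes
```

Casting to float16 halves the on-disk and in-memory footprint versus float32 while keeping enough precision for ranking.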

Local-Linear-Embedding-like Method

As mentioned in step 4, we adopt a Local-Linear-Embedding-like method to further enhance the features.


Given an ROI, we find the top-3 most similar ROIs using KNN (k-nearest neighbours), then sum them with weights 0.7, 0.2, and 0.1; since the weights sum to 1, the input keeps the same numerical scale.
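A minimal numpy sketch of this smoothing step; using cosine similarity and counting each ROI as its own nearest neighbour are assumptions layered on the description above, not the competition code:

```python
import numpy as np


def lle_like_smooth(rois: np.ndarray, weights=(0.7, 0.2, 0.1)) -> np.ndarray:
    """Replace each ROI feature with the weighted sum of its top-3 most
    similar ROIs. Weights sum to 1.0, keeping the numerical scale."""
    normed = rois / np.linalg.norm(rois, axis=1, keepdims=True)
    sim = normed @ normed.T                    # pairwise cosine similarity
    top3 = np.argsort(-sim, axis=1)[:, :3]     # each ROI is its own top-1
    w = np.asarray(weights)[None, :, None]     # (1, 3, 1) for broadcasting
    return (rois[top3] * w).sum(axis=1)
```

Because the weights sum to 1, a set of identical ROIs passes through unchanged, which is what "keeping the same numerical scale" requires.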

Members

This is a solo team which consists of:

  1. Chengxuan Ying, Dalian University of Technology (应承轩 大连理工大学)

Acknowledgment

Thanks to Weiwei Xu, who provided a 4-GPU server.

Links to Other Solutions

Future Work

It is worth trying SOTA image-text representation models such as UNITER[2] or OSCAR[3].

Reference

  1. Devlin J, Chang M W, Lee K, et al. Bert: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
  2. Chen Y C, Li L, Yu L, et al. Uniter: Learning universal image-text representations[J]. arXiv preprint arXiv:1909.11740, 2019.
  3. Li X, Yin X, Li C, et al. Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks[J]. arXiv preprint arXiv:2004.06165, 2020.

Seeking Opportunities

I will graduate in the summer of 2021 from Dalian University of Technology. If you can refer me to any company, please contact me at [email protected].
