wzhe06 / Sparkctr
Licence: apache-2.0
CTR prediction models based on Spark (LR, GBDT, DNN)
Stars: ✭ 740
Programming Languages
scala
Projects that are alternatives of or similar to Sparkctr
Lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (-28.38%)
Mutual labels: spark
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+664.32%)
Mutual labels: spark
Elasticsearch Spark Recommender
Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch
Stars: ✭ 707 (-4.46%)
Mutual labels: spark
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (-25.27%)
Mutual labels: spark
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+645%)
Mutual labels: spark
Freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (-15.27%)
Mutual labels: spark
Cdap
An open source framework for building data analytic applications.
Stars: ✭ 509 (-31.22%)
Mutual labels: spark
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (-1.62%)
Mutual labels: spark
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (-17.43%)
Mutual labels: spark
Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (-5.95%)
Mutual labels: spark
Sparklearning
Learning Apache Spark, including code and data. Most parts can run locally.
Stars: ✭ 558 (-24.59%)
Mutual labels: spark
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (-14.46%)
Mutual labels: spark
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (-27.3%)
Mutual labels: spark
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (-30.68%)
Mutual labels: spark
Dev Setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+655.41%)
Mutual labels: spark
CTRmodel
CTR prediction models built on pure Spark MLlib, with no third-party libraries.
Implemented Models
- Naive Bayes
- Logistic Regression
- Factorization Machine
- Random Forest
- Gradient Boosted Decision Tree
- GBDT + LR
- Neural Network
- Inner Product Neural Network (IPNN)
- Outer Product Neural Network (OPNN)
Usage
It's a Maven project that uses Spark 2.3.0 and Scala 2.11.
Once Maven has imported the dependencies automatically, you can simply run the example (com.ggstar.example.ModelSelection) to train all the CTR models and compare their metrics.
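To give a feel for what "train the models and compare their metrics" can look like with pure Spark MLlib, here is a minimal sketch. It is not the code of com.ggstar.example.ModelSelection: the object name `ModelComparisonSketch`, the toy columns `ad_position` and `user_ctr`, the sample values, and the choice of AUC as the metric are all assumptions made for illustration, and only two of the listed model types are included.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{LogisticRegression, RandomForestClassifier}
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical sketch, not the project's ModelSelection class.
object ModelComparisonSketch {

  // Made-up toy click log: two numeric features and a label (1.0 = clicked).
  def toySamples(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (1.0, 0.30, 1.0),
      (1.0, 0.25, 1.0),
      (2.0, 0.20, 1.0),
      (2.0, 0.05, 0.0),
      (3.0, 0.10, 0.0),
      (3.0, 0.02, 0.0),
      (1.0, 0.04, 0.0),
      (2.0, 0.28, 1.0)
    ).toDF("ad_position", "user_ctr", "label")
  }

  // Fits each candidate classifier and returns its AUC keyed by model name.
  def compare(spark: SparkSession): Map[String, Double] = {
    val data = toySamples(spark)

    // Pack the raw feature columns into the single vector column MLlib expects.
    val assembler = new VectorAssembler()
      .setInputCols(Array("ad_position", "user_ctr"))
      .setOutputCol("features")

    val candidates: Seq[(String, PipelineStage)] = Seq(
      "LogisticRegression" -> new LogisticRegression().setMaxIter(20),
      "RandomForest" -> new RandomForestClassifier().setNumTrees(10).setSeed(42L)
    )

    val evaluator = new BinaryClassificationEvaluator()
      .setMetricName("areaUnderROC")

    candidates.map { case (name, classifier) =>
      val pipeline = new Pipeline().setStages(Array[PipelineStage](assembler, classifier))
      val model = pipeline.fit(data)
      // Evaluated on the training data only to keep the sketch short;
      // a real comparison would score a held-out split.
      name -> evaluator.evaluate(model.transform(data))
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ctr-model-comparison-sketch")
      .master("local[*]")
      .getOrCreate()
    compare(spark).foreach { case (name, auc) =>
      println(f"$name%-20s AUC = $auc%.4f")
    }
    spark.stop()
  }
}
```

Keeping every candidate behind the same `Pipeline` and evaluator means a further MLlib classifier can be added as one more entry in `candidates`, which is presumably the kind of side-by-side comparison the example class produces.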
Related Papers on CTR prediction
- [LR] Predicting Clicks - Estimating the Click-Through Rate for New Ads (Microsoft 2007)
- [FFM] Field-aware Factorization Machines for CTR Prediction (Criteo 2016)
- [GBDT+LR] Practical Lessons from Predicting Clicks on Ads at Facebook (Facebook 2014)
- [PS-PLM] Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction (Alibaba 2017)
- [FTRL] Ad Click Prediction a View from the Trenches (Google 2013)
- [FM] Fast Context-aware Recommendations with Factorization Machines (UKON 2011)
- [DCN] Deep & Cross Network for Ad Click Predictions (Stanford 2017)
- [Deep Crossing] Deep Crossing - Web-Scale Modeling without Manually Crafted Combinatorial Features (Microsoft 2016)
- [PNN] Product-based Neural Networks for User Response Prediction (SJTU 2016)
- [DIN] Deep Interest Network for Click-Through Rate Prediction (Alibaba 2018)
- [ESMM] Entire Space Multi-Task Model - An Effective Approach for Estimating Post-Click Conversion Rate (Alibaba 2018)
- [Wide & Deep] Wide & Deep Learning for Recommender Systems (Google 2016)
- [xDeepFM] xDeepFM - Combining Explicit and Implicit Feature Interactions for Recommender Systems (USTC 2018)
- [Image CTR] Image Matters - Visually modeling user behaviors using Advanced Model Server (Alibaba 2018)
- [AFM] Attentional Factorization Machines - Learning the Weight of Feature Interactions via Attention Networks (ZJU 2017)
- [DIEN] Deep Interest Evolution Network for Click-Through Rate Prediction (Alibaba 2019)
- [DSSM] Learning Deep Structured Semantic Models for Web Search using Clickthrough Data (UIUC 2013)
- [FNN] Deep Learning over Multi-field Categorical Data (UCL 2016)
- [DeepFM] A Factorization-Machine based Neural Network for CTR Prediction (HIT-Huawei 2017)
- [NFM] Neural Factorization Machines for Sparse Predictive Analytics (NUS 2017)