huangyueranbbc / Spark_ALS

Licence: MIT license

Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming

Programming Languages

java

Projects that are alternatives of or similar to Spark ALS

Movie recommend

A Spark-based movie recommendation system, including a crawler project, a web site, an admin back end, and the Spark recommendation system

Stars: ✭ 2,092 (+2250.56%)

Mutual labels: spark-streaming, spark-mllib

bigdatatutorial

Stars: ✭ 34 (-61.8%)

Mutual labels: spark-streaming

Kinesis Sql

Kinesis Connector for Structured Streaming

Stars: ✭ 120 (+34.83%)

Mutual labels: spark-streaming

Bigdata Playground

A complete example of a big data application using : Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (+98.88%)

Mutual labels: spark-streaming

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+1833.71%)

Mutual labels: spark-streaming

Registry

Schema Registry

Stars: ✭ 184 (+106.74%)

Mutual labels: spark-streaming

Waterdrop

Production Ready Data Integration Product, documentation：

Stars: ✭ 1,856 (+1985.39%)

Mutual labels: spark-streaming

Real-time-log-analysis-system

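🐧 Real-time log processing and analysis system based on spark streaming + flume + kafka + hbase (available as a console version and as a Web UI visualization version built on springboot, Echarts, etc.)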

Stars: ✭ 31 (-65.17%)

Mutual labels: spark-streaming

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+177.53%)

Mutual labels: spark-streaming

Scramjet

Simple yet powerful live data computation framework

Stars: ✭ 171 (+92.13%)

Mutual labels: spark-streaming

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (+57.3%)

Mutual labels: spark-streaming

Streamline

StreamLine - Streaming Analytics

Stars: ✭ 151 (+69.66%)

Mutual labels: spark-streaming

Example Spark

Spark, Spark Streaming and Spark SQL unit testing strategies

Stars: ✭ 205 (+130.34%)

Mutual labels: spark-streaming

Example Spark Kafka

Apache Spark and Apache Kafka integration example

Stars: ✭ 120 (+34.83%)

Mutual labels: spark-streaming

ExDeMon

A general purpose metrics monitor implemented with Apache Spark. Kafka source, Elastic sink, aggregate metrics, different analysis, notifications, actions, live configuration update, missing metrics, ...

Stars: ✭ 19 (-78.65%)

Mutual labels: spark-streaming

Spark Mllib Twitter Sentiment Analysis

🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib

Stars: ✭ 113 (+26.97%)

Mutual labels: spark-streaming

Spark Streaming With Kafka

Self-contained examples of Apache Spark streaming integrated with Apache Kafka.

Stars: ✭ 180 (+102.25%)

Mutual labels: spark-streaming

Machine-Learning

Examples of all Machine Learning Algorithm in Apache Spark

Stars: ✭ 15 (-83.15%)

Mutual labels: spark-mllib

fdp-modelserver

An umbrella project for multiple implementations of model serving

Stars: ✭ 47 (-47.19%)

Mutual labels: spark-streaming

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (+142.7%)

Mutual labels: spark-streaming


Spark-ALS

Introduction

ALS stands for alternating least squares. ALS-WR stands for alternating least squares with weighted-λ-regularization. The method is commonly used in recommender systems based on matrix factorization. For example, the user-item rating matrix is factorized into two matrices: one holds each user's preferences for the latent features of items, and the other holds the latent features of each item. The factorization fills in the missing ratings, so items can be recommended to a user based on the filled-in (predicted) ratings.
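For reference, the weighted-λ-regularized objective usually associated with ALS-WR can be written as below. The notation is introduced here for illustration and does not come from the repository: K is the set of observed (user, item) pairs, r_{ui} is an observed rating, x_u and y_i are the user and item factor vectors, I_u is the set of items rated by user u, and n_u and n_i count the ratings of user u and item i.

```latex
\min_{X,Y}\; \sum_{(u,i)\in K} \bigl(r_{ui} - x_u^{\top} y_i\bigr)^2
  + \lambda \Bigl( \sum_{u} n_u \lVert x_u \rVert^2 + \sum_{i} n_i \lVert y_i \rVert^2 \Bigr)

% With the item factors held fixed, each user vector has a closed-form update:
x_u = \Bigl( \sum_{i \in I_u} y_i y_i^{\top} + \lambda\, n_u I \Bigr)^{-1} \sum_{i \in I_u} r_{ui}\, y_i
```

The item update is symmetric: swap the roles of x_u and y_i and sum over the users who rated item i.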

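Put simply, the ALS-WR algorithm works as follows
(data format: userId, itemId, rating, timestamp):
1 For each userId, randomly initialize N (10) factor values; these values determine the weights of that userId.
2 For each itemId, likewise randomly initialize N (10) factor values.
3 Hold the user factors fixed and solve for the itemFactors matrix from the userFactors matrix and the rating matrix, i.e. [Item Factors Matrix] = [User Factors Matrix]^-1 * [Rating Matrix]. The factor matrices are generally not square, so ^-1 here stands for a least-squares solve (pseudo-inverse) rather than a true inverse.
4 Hold the item factors fixed and solve for the userFactors matrix from the itemFactors matrix and the rating matrix, i.e. [User Factors Matrix] = [Item Factors Matrix]^-1 * [Rating Matrix].
5 Repeat steps 3 and 4 until userFactors and itemFactors converge to stable values.
6 To predict a rating, multiply the corresponding factor vectors: userFactors * itemId = rating value when inferring for an itemId, and itemFactors * userId = rating value when inferring for a userId.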
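The alternating updates in steps 3 to 5 can be sketched in plain Java without Spark. This is a minimal illustration on a small dense matrix, not the repository's implementation: the class name `AlsSketch`, its method names, the use of `NaN` to mark missing ratings, and the toy data are all assumptions made for this example. Each update solves a small regularized least-squares system, which is what the "^-1" notation in steps 3 and 4 stands for.

```java
import java.util.Random;

/** Minimal dense ALS-WR sketch for illustration; not the repository's Spark code. */
final class AlsSketch {

    /** Solves the k x k system a * x = b by Gauss-Jordan elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] b) {
        int k = b.length;
        double[][] m = new double[k][k + 1];
        for (int r = 0; r < k; r++) {
            System.arraycopy(a[r], 0, m[r], 0, k);
            m[r][k] = b[r];
        }
        for (int col = 0; col < k; col++) {
            int pivot = col;
            for (int r = col + 1; r < k; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = 0; r < k; r++) {
                if (r == col) continue;
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= k; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] x = new double[k];
        for (int r = 0; r < k; r++) x[r] = m[r][k] / m[r][r];
        return x;
    }

    /** Re-solves every row of target while fixed is held constant; ratings[t][f] is NaN when missing. */
    static void updateSide(double[][] target, double[][] fixed, double[][] ratings, double lambda) {
        int k = fixed[0].length;
        for (int t = 0; t < target.length; t++) {
            double[][] a = new double[k][k];
            double[] b = new double[k];
            int observed = 0;
            for (int f = 0; f < fixed.length; f++) {
                double r = ratings[t][f];
                if (Double.isNaN(r)) continue;
                observed++;
                for (int p = 0; p < k; p++) {
                    b[p] += r * fixed[f][p];
                    for (int q = 0; q < k; q++) a[p][q] += fixed[f][p] * fixed[f][q];
                }
            }
            if (observed == 0) continue; // no ratings for this row, keep its current factors
            for (int p = 0; p < k; p++) a[p][p] += lambda * observed; // weighted-lambda term
            target[t] = solve(a, b);
        }
    }

    static double[][] transpose(double[][] m) {
        double[][] t = new double[m[0].length][m.length];
        for (int r = 0; r < m.length; r++)
            for (int c = 0; c < m[0].length; c++) t[c][r] = m[r][c];
        return t;
    }

    /** Returns {userFactors, itemFactors} after a fixed number of alternations. */
    static double[][][] fit(double[][] ratings, int k, double lambda, int iterations, long seed) {
        Random rnd = new Random(seed);
        double[][] users = new double[ratings.length][k];
        double[][] items = new double[ratings[0].length][k];
        for (double[] row : users) for (int p = 0; p < k; p++) row[p] = rnd.nextDouble(); // step 1
        for (double[] row : items) for (int p = 0; p < k; p++) row[p] = rnd.nextDouble(); // step 2
        double[][] ratingsT = transpose(ratings);
        for (int it = 0; it < iterations; it++) {
            updateSide(items, users, ratingsT, lambda); // step 3: users fixed, solve items
            updateSide(users, items, ratings, lambda);  // step 4: items fixed, solve users
        }
        return new double[][][] {users, items};
    }

    /** Step 6: a predicted rating is the dot product of a user row and an item row. */
    static double predict(double[] userRow, double[] itemRow) {
        double s = 0;
        for (int p = 0; p < userRow.length; p++) s += userRow[p] * itemRow[p];
        return s;
    }

    /** Root-mean-square error over the observed (non-NaN) ratings only. */
    static double rmse(double[][] ratings, double[][] users, double[][] items) {
        double sum = 0;
        int n = 0;
        for (int u = 0; u < ratings.length; u++)
            for (int i = 0; i < ratings[0].length; i++) {
                if (Double.isNaN(ratings[u][i])) continue;
                double d = ratings[u][i] - predict(users[u], items[i]);
                sum += d * d;
                n++;
            }
        return Math.sqrt(sum / n);
    }

    public static void main(String[] args) {
        double nan = Double.NaN; // toy data: 4 users x 3 items, two ratings missing
        double[][] ratings = {{1, 2, 3}, {2, 4, nan}, {nan, 6, 9}, {4, 8, 12}};
        double[][][] model = fit(ratings, 2, 0.01, 50, 42L);
        System.out.println("RMSE on observed ratings: " + rmse(ratings, model[0], model[1]));
        System.out.println("Predicted rating for user 1, item 2: " + predict(model[0][1], model[1][2]));
    }
}
```

The sketch runs a fixed number of iterations instead of testing for convergence, which keeps it short. A real job on a large sparse rating set would more likely rely on the ALS implementation that ships with Spark than on dense arrays like these.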

#SparkALSByStreaming.java
Real-time recommendation system demo based on Hadoop, Flume, Kafka, spark-streaming, logback, and an online-store system.
Data format collected by the online-store system:
UserID,ItemId,Rating,TimeStamp (user ID, item ID, user behavior score, timestamp)
53,1286513,9,1508221762
53,1172348420,9,1508221762
53,1179495514,12,1508221762
53,1184890730,3,1508221762
53,1210793742,159,1508221762
53,1215837445,9,1508221762
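
Each line in this format can be parsed into a small value object before it is passed to a recommender. The sketch below shows one way to do that. `RatingEvent` and its field names are hypothetical helpers introduced for this example and are not taken from the repository. The item ID and timestamp are read as `long` and the rating as `double` as a conservative choice.

```java
/** One parsed line of the "UserID,ItemId,Rating,TimeStamp" format (hypothetical helper type). */
final class RatingEvent {
    final int userId;
    final long itemId;
    final double rating;
    final long timestamp;

    RatingEvent(int userId, long itemId, double rating, long timestamp) {
        this.userId = userId;
        this.itemId = itemId;
        this.rating = rating;
        this.timestamp = timestamp;
    }

    /** Parses one comma-separated line; rejects lines that do not have exactly four fields. */
    static RatingEvent parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 comma-separated fields: " + line);
        }
        return new RatingEvent(
                Integer.parseInt(parts[0].trim()),
                Long.parseLong(parts[1].trim()),
                Double.parseDouble(parts[2].trim()),
                Long.parseLong(parts[3].trim()));
    }

    public static void main(String[] args) {
        RatingEvent e = parse("53,1286513,9,1508221762");
        System.out.println(e.userId + " " + e.itemId + " " + e.rating + " " + e.timestamp);
    }
}
```

Non-numeric fields surface as `NumberFormatException`, which is a subclass of `IllegalArgumentException`, so a caller reading from a stream can catch that single exception type to skip bad lines.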

HDFS and Kafka commands:

hadoop dfs -mkdir /spark-als/model

hadoop dfs -mkdir /flume/logs

kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic RECOMMEND_TOPIC

kafka-console-producer.sh --broker-list 192.168.0.193:9092 --topic RECOMMEND_TOPIC < /data/streaming_sample_movielens_ratings.txt
