A one-stop repo for looking up code snippets on core Java concepts, SQL, data structures, and big data. It also includes questions asked in real interviews.

Stars: ✭ 25 (+13.64%)

Mutual labels: spark-streaming

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (+881.82%)

Mutual labels: spark-streaming

cassandra.realtime

Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink

Stars: ✭ 25 (+13.64%)

Mutual labels: spark-streaming

T-Watch

Real Time Twitter Sentiment Analysis Product

Stars: ✭ 20 (-9.09%)

Mutual labels: spark-streaming

BigInsights-on-Apache-Hadoop

Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix

Stars: ✭ 21 (-4.55%)

Mutual labels: spark-streaming

Real-time-log-analysis-system

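🐧 A real-time log processing and analysis system based on Spark Streaming + Flume + Kafka + HBase (available as a console version and as a Web UI visualization version built on Spring Boot, ECharts, and others)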

Stars: ✭ 31 (+40.91%)

Mutual labels: spark-streaming

kafka-twitter-spark-streaming

Counting Tweets Per User in Real-Time

Stars: ✭ 38 (+72.73%)

Mutual labels: spark-streaming

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (-18.18%)

Mutual labels: spark-streaming

ExDeMon

A general purpose metrics monitor implemented with Apache Spark. Kafka source, Elastic sink, aggregate metrics, different analysis, notifications, actions, live configuration update, missing metrics, ...

Stars: ✭ 19 (-13.64%)

Mutual labels: spark-streaming

wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze them through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-13.64%)

Mutual labels: spark-streaming

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+1022.73%)

Mutual labels: spark-streaming

seatunnel-example

seatunnel plugin development examples.

Stars: ✭ 27 (+22.73%)

Mutual labels: spark-streaming

spark-utils

Basic framework utilities to quickly start writing production ready Apache Spark applications

Stars: ✭ 25 (+13.64%)

Mutual labels: spark-streaming

architect big data solutions with spark

code, labs and lectures for the course

Stars: ✭ 40 (+81.82%)

Mutual labels: spark-streaming

bitnami-docker-spark

Bitnami Docker Image for Apache Spark

Stars: ✭ 239 (+986.36%)

Mutual labels: spark-streaming


AdRealTimeAnalysis

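Real-time ad traffic analysis project (Sichuan University / 拓思艾诺)

Requirements

Implement a real-time dynamic blacklist: blacklist any user who clicks the same ad more than 100 times in one day
Filter out invalid ad click traffic based on the blacklist
Compute real-time click traffic per day for each ad in each city of each province (builds on requirement 2)
Compute the top 3 most popular ads in each province per day (builds on requirement 2)
Compute each ad's click trend over the last hour: clicks per minute for each ad within the last hour (builds on requirement 2)
Compute in real time the daily click count for each ad in each city of each province (builds on requirement 2) and update it to MySQL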
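
The first two requirements (the daily blacklist rule and blacklist-based filtering) can be modeled in a few lines of plain Python. This is only a sketch of the logic, not the project's Spark code: the `(date, user_id, ad_id)` click tuple and the function names `update_blacklist` and `filter_clicks` are assumptions made for illustration.

```python
from collections import Counter

CLICK_LIMIT = 100  # threshold taken from the first requirement


def update_blacklist(batch, daily_counts, blacklist, limit=CLICK_LIMIT):
    """Fold one batch of clicks into the running daily counts and extend the blacklist.

    Each click is a hypothetical (date, user_id, ad_id) tuple.
    """
    for date, user_id, ad_id in batch:
        daily_counts[(date, user_id, ad_id)] += 1
    # A user is blacklisted once any (date, user, ad) count exceeds the limit.
    for (_date, user_id, _ad_id), n in daily_counts.items():
        if n > limit:
            blacklist.add(user_id)
    return blacklist


def filter_clicks(batch, blacklist):
    """Drop clicks made by blacklisted users."""
    return [click for click in batch if click[1] not in blacklist]


if __name__ == "__main__":
    counts, blocked = Counter(), set()
    clicks = [("2024-01-01", "u1", "ad1")] * 101 + [("2024-01-01", "u2", "ad1")] * 100
    update_blacklist(clicks, counts, blocked)
    print(blocked)
    print(filter_clicks([("2024-01-01", "u1", "ad2"), ("2024-01-01", "u2", "ad2")], blocked))
```

Note that "more than 100 times" is read here as a strict comparison, so a user with exactly 100 clicks on one ad in a day is not blacklisted.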

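Implementation approach

Compute, in real time for each batch, the daily click count of each user on each ad
Write the daily per-user, per-ad click counts to MySQL (as updates) in a high-performance way
Use filter to pick out blacklisted users, that is, users who clicked the same ad more than 100 times in one day, and write them to MySQL
Use the transform operation on every batch RDD: dynamically load the blacklist from MySQL into an RDD, join it with the batch RDD, and filter out the ad clicks of blacklisted users
Use the updateStateByKey operation to compute, in real time, the daily click count for each ad in each city of each province, and update it to MySQL in real time
Use transform together with Spark SQL to compute the top 3 ads in each province per day: starting from the daily per-province, per-city, per-ad click counts, first aggregate them into daily per-province, per-ad click counts; then start an asynchronous child thread that uses Spark SQL to convert the data RDD into a DataFrame and register it as a temporary table; finally use a Spark SQL window function to compute the top 3 ads in each province and update the result to MySQL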
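
The last two steps, a running per-key total followed by a per-province top 3, can be illustrated without Spark. The sketch below is a plain-Python model of that logic under stated assumptions: the `(date, province, city, ad_id)` key layout and the names `update_state` and `top3_ads_by_province` are hypothetical, and the sorting step stands in for the ranking that the project performs with a Spark SQL window function.

```python
from collections import defaultdict


def update_state(state, batch):
    """Add one batch of (date, province, city, ad_id) clicks to the running totals.

    This mirrors the idea of a stateful per-key update: the previous total
    for a key is combined with the new values seen in the current batch.
    """
    for key in batch:
        state[key] = state.get(key, 0) + 1
    return state


def top3_ads_by_province(state):
    """Roll city-level totals up to province level, then rank ads within each (date, province)."""
    per_province = defaultdict(int)
    for (date, province, _city, ad_id), n in state.items():
        per_province[(date, province, ad_id)] += n

    grouped = defaultdict(list)
    for (date, province, ad_id), n in per_province.items():
        grouped[(date, province)].append((ad_id, n))

    # Sort by click count descending, breaking ties by ad id, and keep three rows per group.
    return {
        key: sorted(ads, key=lambda item: (-item[1], item[0]))[:3]
        for key, ads in grouped.items()
    }


if __name__ == "__main__":
    totals = {}
    update_state(totals, [("2024-01-01", "Sichuan", "Chengdu", "ad1")] * 3)
    update_state(totals, [("2024-01-01", "Sichuan", "Mianyang", "ad1")] * 2)
    print(top3_ads_by_province(totals))
```

Keeping the running totals at city granularity and deriving the province view from them means both reports share one piece of state, which matches the order of steps described above.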

Other

The following link is a concrete demo of the full flow from front-end display to back-end data interaction: WiFiProbeAnalysis


wanghan0501 / AdRealTimeAnalysis
