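A simple visualization of Huzhou call data. Assuming the data volume is too large for the browser to draw a heatmap directly, the heatmap rendering step is moved to offline computation. Apache Spark first processes the data in parallel and then also draws the heatmap; leafletjs then loads an OpenStreetMap layer and the heatmap layer to give a good interactive experience. The rendering is currently implemented with Apache Spark, but the parallel version is slower than single-machine computation, either because Apache Spark is not well suited to this kind of work or because I have not designed the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .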

Stars: ✭ 13 (-92.9%)

Mutual labels: spark, bigdata

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-61.2%)

Mutual labels: spark, bigdata

Lambda Arch

Applying Lambda Architecture with Spark, Kafka, and Cassandra.

Stars: ✭ 111 (-39.34%)

Mutual labels: spark, bigdata

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+840.44%)

Mutual labels: spark, bigdata

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (-16.39%)

Mutual labels: spark

Bigdata practice

Big data analysis and visualization in practice

Stars: ✭ 166 (-9.29%)

Mutual labels: bigdata

Powderkeg

Live-coding the cluster!

Stars: ✭ 152 (-16.94%)

Mutual labels: spark

Spark Tsne

Distributed t-SNE via Apache Spark

Stars: ✭ 151 (-17.49%)

Mutual labels: spark

Spark Nlp

State of the Art Natural Language Processing

Stars: ✭ 2,518 (+1275.96%)

Mutual labels: spark

Big Whale

Scheduling of offline jobs (Spark, Flink, etc.) and monitoring of real-time jobs

Stars: ✭ 163 (-10.93%)

Mutual labels: spark

Spark Ml Source Analysis

An analysis of the principles behind Spark ML algorithms and of their concrete source-code implementations

Stars: ✭ 1,873 (+923.5%)

Mutual labels: spark

Benchm Ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).

Stars: ✭ 1,835 (+902.73%)

Mutual labels: spark

Whylogs Java

Profile and monitor your ML data pipeline end-to-end

Stars: ✭ 164 (-10.38%)

Mutual labels: spark

Hudi

Upserts, Deletes And Incremental Processing on Big Data.

Stars: ✭ 2,586 (+1313.11%)

Mutual labels: bigdata

Aztk

AZTK powered by Azure Batch: On-demand, Dockerized, Spark Jobs on Azure

Stars: ✭ 152 (-16.94%)

Mutual labels: spark

Xsql

Unified SQL Analytics Engine Based on SparkSQL

Stars: ✭ 176 (-3.83%)

Mutual labels: spark

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…

Stars: ✭ 12,277 (+6608.74%)

Mutual labels: spark

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (-12.02%)

Mutual labels: spark

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-18.03%)

Mutual labels: spark

Athenacli

AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.

Stars: ✭ 151 (-17.49%)

Mutual labels: bigdata

Linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.

Stars: ✭ 2,323 (+1169.4%)

Mutual labels: spark

Cc Pyspark

Process Common Crawl data with Python and Spark

Stars: ✭ 147 (-19.67%)

Mutual labels: spark

Avro

Apache Avro is a data serialization system.

Stars: ✭ 2,005 (+995.63%)

Mutual labels: bigdata

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+1038.8%)

Mutual labels: spark

Vue Info Card

Simple and beautiful card component with an elegant spark line, for VueJS.

Stars: ✭ 159 (-13.11%)

Mutual labels: spark

Pyspark Learning

Updated repository

Stars: ✭ 147 (-19.67%)

Mutual labels: spark

Datacompy

Pandas and Spark DataFrame comparison for humans