visions: Type System for Data Analysis in Python
Stars: ✭ 136 (-54.52%)
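leaflet heatmap: A simple visualization of call data from Huzhou. Assuming the dataset is too large for the browser to render a heatmap directly, the heatmap rendering step is moved to offline computation. Apache Spark processes the data in parallel and then renders the heatmap; leafletjs then loads an OpenStreetMap layer and the heatmap layer to give good interactivity. The rendering is currently implemented in Apache Spark, but the parallel computation is slower than a single machine, perhaps because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .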
Stars: ✭ 13 (-95.65%)
dllib: dllib is a distributed deep learning library running on Apache Spark
Stars: ✭ 32 (-89.3%)
spark-extension: A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-91.64%)
incubator-linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+722.41%)
Sk Dist: Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (-13.04%)
prosto: Prosto is a data processing toolkit that radically changes how data is processed by relying heavily on functions and operations with functions, as an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-81.94%)
Casper: A compiler for automatically re-targeting sequential Java code to Apache Spark.
Stars: ✭ 45 (-84.95%)
Docker Spark Cluster: A simple Spark standalone cluster for testing environments
Stars: ✭ 261 (-12.71%)
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-62.88%)
basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
Stars: ✭ 25 (-91.64%)
Spark-PMoF: Spark Shuffle Optimization with RDMA+AEP
Stars: ✭ 28 (-90.64%)
Cloudflow: Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (-7.02%)
docker-spark: Apache Spark Docker container image (Standalone mode)
Stars: ✭ 34 (-88.63%)
Succinct: Enabling queries on compressed data.
Stars: ✭ 257 (-14.05%)
Search Ads Web Service: Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]
Stars: ✭ 30 (-89.97%)
confluent-spark-avro: Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-93.98%)
SparkV: 🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-91.97%)
spark-util: Low-level helpers for Apache Spark libraries and tests
Stars: ✭ 16 (-94.65%)
trembita: Model complex data transformation pipelines easily
Stars: ✭ 44 (-85.28%)
Helk: The Hunting ELK
Stars: ✭ 3,097 (+935.79%)
bigdata-fun: A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-95.32%)
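Book: This project collects some good books I have read or heard about over the years. I came across them while organizing my files; deleting them seemed a pity, but keeping them wasted space, so in the spirit of sharing I am making them available, both to provide them to readers who need them and as a kind of knowledge-base accumulation.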
Stars: ✭ 47 (-84.28%)
smolder: HL7 Apache Spark Datasource
Stars: ✭ 33 (-88.96%)
Spark Druid Olap: Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 282 (-5.69%)
spark-demos: Collection of different demo applications using Apache Spark
Stars: ✭ 15 (-94.98%)
spark-http-stream: Spark Structured Streaming via HTTP communication
Stars: ✭ 17 (-94.31%)
tpch-spark: TPC-H queries in Apache Spark SQL using native DataFrames API
Stars: ✭ 63 (-78.93%)
frovedis: Framework of vectorized and distributed data analytics
Stars: ✭ 59 (-80.27%)
daf-kylo: Kylo integration with PDND (previously DAF).
Stars: ✭ 20 (-93.31%)
kafka-compose: 🎼 Docker Compose files for various Kafka stacks
Stars: ✭ 32 (-89.3%)
sentry-spark: Apache Spark Sentry Integration
Stars: ✭ 14 (-95.32%)
Spark Jupyter Aws: A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-13.38%)
spark-acid: ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-69.57%)
spark-data-sources: Developing Spark External Data Sources using the V2 API
Stars: ✭ 36 (-87.96%)
spark-word2vec: A parallel implementation of word2vec based on Spark
Stars: ✭ 24 (-91.97%)
Hbase Rdd: Spark RDD to read, write and delete from HBase
Stars: ✭ 277 (-7.36%)
shamash: Autoscaling for Google Cloud Dataproc
Stars: ✭ 31 (-89.63%)
bigkube: Minikube for big data with Scala and Spark
Stars: ✭ 16 (-94.65%)
yuzhouwan: Code Library for My Blog
Stars: ✭ 39 (-86.96%)
Big Data Rosetta Code: Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (-15.05%)
Covid19Tracker: A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-78.26%)
Elasticluster: Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-0.33%)
Spark Notebook: Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+930.43%)
Datavec: ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (-9.03%)
blog: blog entries
Stars: ✭ 39 (-86.96%)