Redash: Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+55863.89%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+4680.56%)
Spark-PMoF: Spark Shuffle Optimization with RDMA+AEP
Stars: ✭ 28 (-22.22%)
spark-util: low-level helpers for Apache Spark libraries and tests
Stars: ✭ 16 (-55.56%)
spark-druid-olap: Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (+694.44%)
Spark-Ar: Resources for Spark AR
Stars: ✭ 43 (+19.44%)
Casper: A compiler for automatically re-targeting sequential Java code to Apache Spark.
Stars: ✭ 45 (+25%)
recsys spark: Spark SQL implementations of ItemCF, UserCF, and Swing; recommender systems, recommendation algorithms, collaborative filtering
Stars: ✭ 76 (+111.11%)
spark-stringmetric: Spark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (+41.67%)
yuzhouwan: Code Library for My Blog
Stars: ✭ 39 (+8.33%)
incubator-linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+6730.56%)
spark-extension: A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-30.56%)
swordfish: Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-2.78%)
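leaflet heatmap: A simple visualization of call data for Huzhou. Assuming the data volume is too large for the browser to draw the heatmap directly, the heatmap-drawing step is moved to offline computation and analysis. Apache Spark processes the data in parallel and then renders the heatmap; leafletjs then loads the OpenStreetMap layer and the heatmap layer to give a good interactive experience. The rendering is currently implemented with Apache Spark, but the parallel computation is slower than single-machine computation, either because Apache Spark is not well suited to this kind of work or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .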
Stars: ✭ 13 (-63.89%)
fastdata-cluster: Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-44.44%)
blog: blog entries
Stars: ✭ 39 (+8.33%)
sentry-spark: Apache Spark Sentry Integration
Stars: ✭ 14 (-61.11%)
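litemall-dw: A big data project based on the open-source Litemall e-commerce project, including front-end event tracking (openresty+lua) and back-end event tracking, a five-layer data warehouse, real-time computation, and user profiling. The big data platform uses CDH6.3.2 (scripted with vagrant+ansible) and also includes Azkaban workflows.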
Stars: ✭ 36 (+0%)
Spark: Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.
Stars: ✭ 55 (+52.78%)
visions: Type System for Data Analysis in Python
Stars: ✭ 136 (+277.78%)
spark-acid: ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (+152.78%)
Search Ads Web Service: Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]
Stars: ✭ 30 (-16.67%)
tpch-spark: TPC-H queries in Apache Spark SQL using native DataFrames API
Stars: ✭ 63 (+75%)
trembita: Model complex data transformation pipelines easily
Stars: ✭ 44 (+22.22%)
openverse-catalog: Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-25%)
frovedis: Framework of vectorized and distributed data analytics
Stars: ✭ 59 (+63.89%)
awesome-AI-kubernetes: ❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (+163.89%)
Covid19Tracker: A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (+80.56%)
ODSC India 2018: My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-27.78%)
sparkar-volts: An extensive non-reactive TypeScript framework that eases the development experience in Spark AR
Stars: ✭ 15 (-58.33%)
bigdata-fun: A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-61.11%)
experiments: Code examples for my blog posts
Stars: ✭ 21 (-41.67%)
kafka-compose: 🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-11.11%)
splink: Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+402.78%)
bigkube: Minikube for big data with Scala and Spark
Stars: ✭ 16 (-55.56%)
visualize-data-with-python: A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (+66.67%)
docker-spark: Apache Spark docker container image (Standalone mode)
Stars: ✭ 34 (-5.56%)
big data: A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-5.56%)
smolder: HL7 Apache Spark Datasource
Stars: ✭ 33 (-8.33%)
SparkV: 🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-33.33%)
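wow-spark: 🔆 A Spark self-study handbook covering spark core, spark sql, spark streaming, spark-kafka, and delta-lake, along with basic Scala exercises and source-code analyses, summaries, and translations on topics such as master and shuffle.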
Stars: ✭ 20 (-44.44%)
spark2-etl-examples: A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
Stars: ✭ 23 (-36.11%)
spark-demos: Collection of different demo applications using Apache Spark
Stars: ✭ 15 (-58.33%)
spark-word2vec: A parallel implementation of word2vec based on Spark
Stars: ✭ 24 (-33.33%)
spark-vcf: Spark VCF data source implementation for Dataframes
Stars: ✭ 15 (-58.33%)
prosto: Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (+50%)