Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Stars: ✭ 286 (+694.44%)

Mutual labels: spark

Spark-Ar

Resources for Spark AR

Stars: ✭ 43 (+19.44%)

Mutual labels: spark

Casper

A compiler for automatically re-targeting sequential Java code to Apache Spark.

Stars: ✭ 45 (+25%)

Mutual labels: spark

recsys spark

Spark SQL implementations of ItemCF, UserCF, and Swing; recommender systems, recommendation algorithms, collaborative filtering

Stars: ✭ 76 (+111.11%)

Mutual labels: spark-sql

spark-stringmetric

Spark functions to run popular phonetic and string matching algorithms

Stars: ✭ 51 (+41.67%)

Mutual labels: spark

yuzhouwan

Code Library for My Blog

Stars: ✭ 39 (+8.33%)

Mutual labels: spark

incubator-linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.

Stars: ✭ 2,459 (+6730.56%)

Mutual labels: spark

data processing course

Some class materials for a data processing course using PySpark

Stars: ✭ 50 (+38.89%)

Mutual labels: spark

spark-extension

A library that provides useful extensions to Apache Spark and PySpark.

Stars: ✭ 25 (-30.56%)

Mutual labels: spark

swordfish

Open-source distributed workflow scheduling tool that also supports streaming tasks.

Stars: ✭ 35 (-2.78%)

Mutual labels: spark

leaflet heatmap

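A simple visualization of Huzhou call data. Assuming the data volume is too large for the browser to draw the heat map directly, the heat-map rendering step is moved to offline computation. After the data is computed in parallel with Apache Spark, the heat map is also rendered with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heat-map layer to give good interactivity. Rendering is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than single-machine computation. The Apache Spark heat-map rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .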

Stars: ✭ 13 (-63.89%)

Mutual labels: spark

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-44.44%)

Mutual labels: spark

blog

blog entries

Stars: ✭ 39 (+8.33%)

Mutual labels: spark

sentry-spark

Apache Spark Sentry Integration

Stars: ✭ 14 (-61.11%)

Mutual labels: spark

litemall-dw

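A big data project based on the open-source Litemall e-commerce project, including front-end event tracking (openresty+lua) and back-end event tracking; a data warehouse (five layers), real-time computation, and user profiling. The big data platform uses CDH6.3.2 (already scripted with vagrant+ansible) and also includes Azkaban workflows.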

Stars: ✭ 36 (+0%)

Mutual labels: spark-sql

SparkProgrammingInScala

Apache Spark Course Material

Stars: ✭ 57 (+58.33%)

Mutual labels: spark-sql

Spark

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.

Stars: ✭ 55 (+52.78%)

Mutual labels: spark-sql

visions

Type System for Data Analysis in Python

Stars: ✭ 136 (+277.78%)

Mutual labels: spark

spark-acid

ACID Data Source for Apache Spark based on Hive ACID

Stars: ✭ 91 (+152.78%)

Mutual labels: spark

Movies-Analytics-in-Spark-and-Scala

Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.

Stars: ✭ 47 (+30.56%)

Mutual labels: spark-sql

Search Ads Web Service

Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]

Stars: ✭ 30 (-16.67%)

Mutual labels: spark

tpch-spark

TPC-H queries in Apache Spark SQL using native DataFrames API

Stars: ✭ 63 (+75%)

Mutual labels: spark

spark-gradle-template

Apache Spark in your IDE with gradle

Stars: ✭ 39 (+8.33%)

Mutual labels: spark

trembita

Model complex data transformation pipelines easily

Stars: ✭ 44 (+22.22%)

Mutual labels: spark

openverse-catalog

Identifies and collects data on CC-licensed content across web crawl data and public APIs.

Stars: ✭ 27 (-25%)

Mutual labels: spark

frovedis

Framework of vectorized and distributed data analytics

Stars: ✭ 59 (+63.89%)

Mutual labels: spark

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (+163.89%)

Mutual labels: spark

Covid19Tracker

A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.

Stars: ✭ 65 (+80.56%)

Mutual labels: spark

ODSC India 2018

My presentation at ODSC India 2018 about Deep Learning with Apache Spark

Stars: ✭ 26 (-27.78%)

Mutual labels: spark

BigData-News

A real-time big data system project for a news website, based on Spark 2.2

Stars: ✭ 36 (+0%)

Mutual labels: spark

sparkar-volts

An extensive non-reactive TypeScript framework that eases the development experience in Spark AR

Stars: ✭ 15 (-58.33%)

Mutual labels: spark

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-61.11%)

Mutual labels: spark

experiments

Code examples for my blog posts

Stars: ✭ 21 (-41.67%)

Mutual labels: spark

kafka-compose

🎼 Docker compose files for various kafka stacks

Stars: ✭ 32 (-11.11%)

Mutual labels: spark

splink

Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters

Stars: ✭ 181 (+402.78%)

Mutual labels: spark

bigkube

Minikube for big data with Scala and Spark

Stars: ✭ 16 (-55.56%)

Mutual labels: spark

visualize-data-with-python

A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.

Stars: ✭ 60 (+66.67%)

Mutual labels: spark

docker-spark

Apache Spark docker container image (Standalone mode)

Stars: ✭ 34 (-5.56%)

Mutual labels: spark

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-5.56%)

Mutual labels: spark-sql

smolder

HL7 Apache Spark Datasource

Stars: ✭ 33 (-8.33%)

Mutual labels: spark

spark-sql-internals

The Internals of Spark SQL

Stars: ✭ 331 (+819.44%)

Mutual labels: spark-sql

Python Master Courses

Life is short, I use Python

Stars: ✭ 61 (+69.44%)

Mutual labels: spark

Real-time-Data-Warehouse

Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi

Stars: ✭ 52 (+44.44%)

Mutual labels: spark-sql

SparkV

🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.

Stars: ✭ 24 (-33.33%)

Mutual labels: spark

spark-sql-flow-plugin

Visualize column-level data lineage in Spark SQL

Stars: ✭ 20 (-44.44%)

Mutual labels: spark

wow-spark

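🔆 A Spark self-study handbook covering topics such as Spark Core, Spark SQL, Spark Streaming, spark-kafka, and delta-lake, along with basic Scala exercises and source-code analyses (for example master and shuffle), summaries, and translations.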

Stars: ✭ 20 (-44.44%)

Mutual labels: spark-sql

spark2-etl-examples

A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0

Stars: ✭ 23 (-36.11%)

Mutual labels: spark-sql

databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Stars: ✭ 19 (-47.22%)

Mutual labels: spark-sql

spark-demos

Collection of different demo applications using Apache Spark

Stars: ✭ 15 (-58.33%)

Mutual labels: spark

spark-word2vec

A parallel implementation of word2vec based on Spark

Stars: ✭ 24 (-33.33%)

Mutual labels: spark

spark-vcf

Spark VCF data source implementation for Dataframes

Stars: ✭ 15 (-58.33%)

Mutual labels: spark-sql

arcgis-experience-builder-sdk-resources

ArcGIS Experience Builder samples

Stars: ✭ 47 (+30.56%)

Mutual labels: data-sources

spark-kubernetes

Spark on Kubernetes

Stars: ✭ 80 (+122.22%)

Mutual labels: spark

prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Stars: ✭ 54 (+50%)

Mutual labels: spark

1-60 of 419 similar projects
