spark-vcfSpark VCF data source implementation for Dataframes
Stars: ✭ 15 (-34.78%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+69.57%)
opaque-sqlAn encrypted data analytics platform
Stars: ✭ 169 (+634.78%)
albisAlbis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (-13.04%)
dt-sql-parserSQL Parsers for BigData, built with antlr4.
Stars: ✭ 135 (+486.96%)
Tweet-Analysis-With-Kafka-and-SparkA real time analytics dashboard to analyze the trending hashtags and @ mentions at any location using kafka and spark streaming.
Stars: ✭ 18 (-21.74%)
geosparkbring sf to spark in production
Stars: ✭ 53 (+130.43%)
Spark.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+7382.61%)
RedashMake Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+87495.65%)
spark-data-sourcesDeveloping Spark External Data Sources using the V2 API
Stars: ✭ 36 (+56.52%)
recsys sparkSpark SQL 实现 ItemCF,UserCF,Swing,推荐系统,推荐算法,协同过滤
Stars: ✭ 76 (+230.43%)
litemall-dw基于开源Litemall电商项目的大数据项目,包含前端埋点(openresty+lua)、后端埋点;数据仓库(五层)、实时计算和用户画像。大数据平台采用CDH6.3.2(已使用vagrant+ansible脚本化),同时也包含了Azkaban的workflow。
Stars: ✭ 36 (+56.52%)
big dataA collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+47.83%)
SparkApache Spark is a fast, in-memory data processing engine with elegant and expressive development API's to allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets.This project will have sample programs for Spark in Scala language .
Stars: ✭ 55 (+139.13%)
wow-spark🔆 spark自学手册,包含了例如spark core、spark sql、spark streaming、spark-kafka、delta-lake,以及scala基础练习,还有一些例如master、shuffle源码分析,总结及翻译。
Stars: ✭ 20 (-13.04%)
NYC Taxi PipelineDesign/Implement stream/batch architecture on NYC taxi data | #DE
Stars: ✭ 16 (-30.43%)