529 open source projects that are alternatives to or similar to Sparktutorial

experiments
Code examples for my blog posts
Stars: ✭ 21 (-80%)
Mutual labels:  spark
Freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (+497.14%)
Mutual labels:  spark
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+72.38%)
Mutual labels:  spark
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-24.76%)
Mutual labels:  spark
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-42.86%)
Mutual labels:  spark
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+5286.67%)
Mutual labels:  spark
v6.dooring.public
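A large-screen data visualization solution that provides a visual editing engine, helping individuals and companies easily customize their own large-screen visualization applications.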
Stars: ✭ 323 (+207.62%)
Mutual labels:  bigdata
Play Spark Scala
Stars: ✭ 51 (-51.43%)
Mutual labels:  spark
datasphere-service
An open source dataworks platform
Stars: ✭ 20 (-80.95%)
Mutual labels:  bigdata
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+5150.48%)
Mutual labels:  spark
ETL-Starter-Kit
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.
Stars: ✭ 21 (-80%)
Mutual labels:  bigdata
bqv
The simplest tool to manage views of BigQuery.
Stars: ✭ 22 (-79.05%)
Mutual labels:  bigdata
Alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+5022.86%)
Mutual labels:  spark
vulkn
Love your Data. Love the Environment. Love VULKИ.
Stars: ✭ 43 (-59.05%)
Mutual labels:  bigdata
Apache Spark Internals
The Internals of Apache Spark
Stars: ✭ 1,045 (+895.24%)
Mutual labels:  spark
BigDataTools
tools for bigData
Stars: ✭ 36 (-65.71%)
Mutual labels:  bigdata
Sparklearning
Learning Apache Spark, including code and data. Most parts can run locally.
Stars: ✭ 558 (+431.43%)
Mutual labels:  spark
UnROOT.jl
Native Julia I/O package to work with CERN ROOT files
Stars: ✭ 52 (-50.48%)
Mutual labels:  bigdata
Home
ApacheCN open source organization: announcements, introduction, members, activities, and contact channels
Stars: ✭ 1,199 (+1041.9%)
Mutual labels:  spark
cds
Data syncing in golang for ClickHouse.
Stars: ✭ 839 (+699.05%)
Mutual labels:  bigdata
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+412.38%)
Mutual labels:  spark
meetups-archivos
Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and seedbed programs (Semilleros) …
Stars: ✭ 60 (-42.86%)
Mutual labels:  bigdata
Spark As Service Using Embedded Server
This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server
Stars: ✭ 46 (-56.19%)
Mutual labels:  spark
hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-46.67%)
Mutual labels:  bigdata
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+388.57%)
Mutual labels:  spark
learning-spark
Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (-73.33%)
Mutual labels:  bigdata
Biglasso
biglasso: Extending Lasso Model Fitting to Big Data in R
Stars: ✭ 87 (-17.14%)
Mutual labels:  bigdata
columnify
Convert record-oriented data to a columnar format.
Stars: ✭ 28 (-73.33%)
Mutual labels:  bigdata
Magellan
Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (+382.86%)
Mutual labels:  spark
Notes
Learning notes | Java basics, JVM, source code, big data, interview experiences
Stars: ✭ 69 (-34.29%)
Mutual labels:  bigdata
Delta Architecture
Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Stars: ✭ 43 (-59.05%)
Mutual labels:  spark
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-82.86%)
Mutual labels:  bigdata
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+357.14%)
Mutual labels:  spark
dockerfiles
Multiple Docker container images for the main Big Data tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (-72.38%)
Mutual labels:  bigdata
Spark States
Custom state store providers for Apache Spark
Stars: ✭ 83 (-20.95%)
Mutual labels:  spark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-40%)
Mutual labels:  spark
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-78.1%)
Mutual labels:  spark
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-78.1%)
Mutual labels:  spark
the-apache-ignite-book
All code samples, scripts, and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Stars: ✭ 65 (-38.1%)
Mutual labels:  bigdata
Spark
Cross-platform real-time collaboration client optimized for business and organizations.
Stars: ✭ 471 (+348.57%)
Mutual labels:  spark
jhdf
A pure Java HDF5 library
Stars: ✭ 83 (-20.95%)
Mutual labels:  bigdata
Gatk
Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (+854.29%)
Mutual labels:  spark
dt-sql-parser
SQL Parsers for BigData, built with antlr4.
Stars: ✭ 135 (+28.57%)
Mutual labels:  bigdata
Bdp Dataplatform
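Big data ecosystem solution data platform: a big data solution built on big data, data platform, microservices, machine learning, online store, automated operations, DevOps, container deployment platform, and data platform collection, storage, computing, development, and application building.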
Stars: ✭ 456 (+334.29%)
Mutual labels:  spark
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (-0.95%)
Mutual labels:  bigdata
lectures-hse-spark
Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-80.95%)
Mutual labels:  bigdata
Tensorbase
TensorBase BE is building a high performance, cloud neutral bigdata warehouse for SMEs fully in Rust.
Stars: ✭ 440 (+319.05%)
Mutual labels:  bigdata
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+850.48%)
Mutual labels:  spark
chatnoir-resiliparse
A robust web archive analytics toolkit
Stars: ✭ 26 (-75.24%)
Mutual labels:  bigdata
PersonNotes
A personal notes hub: technical notes recorded in a quick-and-dirty style .. 📚☕️⌨️🎧
Stars: ✭ 61 (-41.9%)
Mutual labels:  bigdata
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+1038.1%)
Mutual labels:  spark
laravel-spark-camera
Profile Photo Camera support for Laravel Spark
Stars: ✭ 30 (-71.43%)
Mutual labels:  spark
Digitrecognizer
Java Convolutional Neural Network example for Handwritten Digit Recognition
Stars: ✭ 23 (-78.1%)
Mutual labels:  spark
DetEdit
A graphical user interface for annotating and editing events detected in long-term acoustic monitoring data
Stars: ✭ 20 (-80.95%)
Mutual labels:  bigdata
jigsaw-seed
This is the seed project for the component library Jigsaw (七巧板, https://github.com/rdkmaster/jigsaw); all new apps are recommended to start from this project as their seed.
Stars: ✭ 17 (-83.81%)
Mutual labels:  bigdata
Spark Doc Zh
Chinese version of the official Apache Spark documentation
Stars: ✭ 1,126 (+972.38%)
Mutual labels:  spark
Kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+772.38%)
Mutual labels:  spark
sparkProjectTemplate.g8
Template for Spark Projects
Stars: ✭ 77 (-26.67%)
Mutual labels:  spark
Book
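This project collects some good books I have read or heard about over the years. I came across them while organizing my files; deleting them seemed a pity, but keeping them took up too much space, so in the spirit of sharing I am making them available. This provides the books to readers who need them and also serves as a kind of knowledge base.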
Stars: ✭ 47 (-55.24%)
Mutual labels:  spark
10 Weeks
10 weeks of technology exploration
Stars: ✭ 22 (-79.05%)
Mutual labels:  bigdata
301-360 of 529 similar projects