leobenkel / Zparkio

License: MIT
Boilerplate framework for using Spark and ZIO together.


ZparkIO

The goal of this framework is to blend Spark and ZIO into an easy-to-use system for data engineers, allowing them to use Spark in a new, faster, more reliable way by leveraging the power of ZIO.

What is this library for?

This library implements all the boilerplate you need to include Spark and ZIO in your ML project.

It can be tricky to use ZIO to hold an instance of Spark for reuse across your code; this library solves that boilerplate problem for you.

More About ZparkIO

Public Presentation

Feel free to look at the slides, available on Google Drive or on SlideShare, presented during the ScalaSF meetup on Thursday, March 26, 2020. You can also watch the presentation on YouTube.

ZparkIO was at version 0.7.0 at the time, so some details might be out of date.

Migrate your Spark Project to ZparkIO

Migrate from Plain Spark to ZparkIO

Why would you want to use ZIO and Spark together?

In my experience, using ZIO or Future in combination with Spark can drastically speed up your job. The reason is that sources (BigQuery, Postgresql, S3 files, etc.) can be fetched in parallel, so the computation is not put on hold waiting for each one in turn. Obviously ZIO is much better than Future, but it is harder to set up. Not anymore!
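As a rough illustration of fetching sources in parallel, here is a minimal sketch that uses only the standard library's Future, so it runs without extra dependencies. The names `fetchUsers` and `fetchOrders` are hypothetical stand-ins for real source reads, and the sleeps simulate I/O latency; none of this is ZparkIO's own API.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelFetchSketch {
  // Hypothetical stand-ins for slow source reads (BigQuery, Postgresql, S3, ...).
  def fetchUsers(): Seq[String] = { Thread.sleep(200); Seq("alice", "bob") }
  def fetchOrders(): Seq[Int]   = { Thread.sleep(200); Seq(1, 2, 3) }

  def run(): (Seq[String], Seq[Int]) = {
    // Start both futures before waiting on either one,
    // so the two reads overlap instead of running back to back.
    val usersF  = Future(fetchUsers())
    val ordersF = Future(fetchOrders())
    Await.result(usersF.zip(ordersF), 5.seconds)
  }
}
```

The important detail is that both futures are created before either result is awaited; creating the second one only after the first completes would serialize the reads again.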

Another nice aspect of ZIO is its error and exception handling, as well as its built-in retry helpers, which make retrying failed tasks a breeze within Spark.
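To give an idea of what the retry helpers look like, here is a minimal sketch assuming the ZIO 1.x API (`ZIO.effect`, `Schedule.recurs`, `Runtime.default.unsafeRun`). The `flakyRead` effect is a hypothetical stand-in for a source read that fails twice before succeeding; it is not part of ZparkIO.

```scala
import java.util.concurrent.atomic.AtomicInteger
import zio._

object RetrySketch {
  // Hypothetical flaky read: throws on the first two attempts, then succeeds.
  // The block is re-evaluated on every attempt because ZIO.effect only describes the work.
  def flakyRead(attempts: AtomicInteger): Task[String] = ZIO.effect {
    val n = attempts.incrementAndGet()
    if (n < 3) throw new RuntimeException(s"attempt $n failed")
    s"loaded on attempt $n"
  }

  def run(): String = {
    val attempts = new AtomicInteger(0)
    // Schedule.recurs(3) allows up to 3 retries after the initial failure.
    Runtime.default.unsafeRun(flakyRead(attempts).retry(Schedule.recurs(3)))
  }
}
```

With this policy the effect is attempted at most four times in total; if every attempt failed, the last error would surface to the caller.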

How to use?

I hope that you are now convinced that ZIO and Spark are a perfect match. Let's see how to use ZparkIO.

One of the easiest ways to use ZparkIO is to start from the giter8 template project:

sbt new leobenkel/zparkio.g8

Include dependencies

First include the library in your project:

libraryDependencies += "com.leobenkel" %% "zparkio" % "[SPARK_VERSION]_[VERSION]"

Replace [VERSION] with the latest release, as published on Maven Central and on the project's releases page.

To check the supported values, see the project's Spark Versions and Version listings.

This library depends on Spark, ZIO and Scallop.

Unit-test

To get access to helper functions for writing unit tests, you can also add:

libraryDependencies += "com.leobenkel" %% "zparkio-test" % "[VERSION]"

Replace [VERSION] with the latest zparkio-test release, as published on Maven Central.

How to use in your code?

There is a project example you can look at. But here are the details.

Main

The first thing you have to do is extend the ZparkioApp trait. For an example, look at the ProjectExample: Application.

Spark

By using this architecture, you will have access to the SparkSession anywhere in your ZIO code, via:

import com.leobenkel.zparkio.Services._

for {
  spark <- SparkModule()
} yield {
  ???
}

For instance, you can see its use here.

Command lines

You will also have access to all your command-line arguments, automatically parsed, generated and accessible to you via CommandLineArguments. It is recommended to wrap it in a helper function to make the rest of your code easier to use.

Then using it, like here, is easy.

Helpers

The implicits object, which you can import anywhere, gives you specific helper functions that help streamline your projects.

Unit test

Using this architecture will literally allow you to run your main as a unit test.

Examples

Simple example

Take a look at the simple project example to see working code that uses this library: SimpleProject.

More complex architecture

A full-fledged, production-ready project will obviously need more code than the simple example. For this purpose, and upon suggestion of several awesome people, I added a more complex project. This is a WIP and more will be added as I go. MoreComplexProject.

Authors

Leo Benkel
