Licence: apache-2.0
Writing application logic for Spark jobs that can be unit-tested without a SparkContext

Projects that are alternatives of or similar to Kontextfrei

Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-17.91%)
Mutual labels:  spark
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-10.45%)
Mutual labels:  spark
Pyspark Twitter Stream Mining
Real-time Machine Learning with Apache Spark on Twitter Public Stream
Stars: ✭ 64 (-4.48%)
Mutual labels:  spark
Awesome Pulsar
A curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-14.93%)
Mutual labels:  spark
Pyspark Examples
Code examples on Apache Spark using python
Stars: ✭ 58 (-13.43%)
Mutual labels:  spark
Silex
something to help you spark
Stars: ✭ 61 (-8.96%)
Mutual labels:  spark
Utils4s
A collection of test cases and reference materials for working with Scala and Spark
Stars: ✭ 1,070 (+1497.01%)
Mutual labels:  spark
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-2.99%)
Mutual labels:  spark
Zemberek Nlp Server
A REST Docker server built on the Zemberek Turkish NLP Java library
Stars: ✭ 60 (-10.45%)
Mutual labels:  spark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-5.97%)
Mutual labels:  spark
Model Serving Tutorial
Code and presentation for Strata Model Serving tutorial
Stars: ✭ 57 (-14.93%)
Mutual labels:  spark
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-13.43%)
Mutual labels:  spark
Roffildlibrary
Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-5.97%)
Mutual labels:  spark
Net.jgp.labs.spark
Apache Spark examples exclusively in Java
Stars: ✭ 55 (-17.91%)
Mutual labels:  spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-4.48%)
Mutual labels:  spark
Docker Hadoop
A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-19.4%)
Mutual labels:  spark
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-10.45%)
Mutual labels:  spark
Thingsboard
Open-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+15610.45%)
Mutual labels:  spark
Spark Bigquery
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-2.99%)
Mutual labels:  spark
Spark Doc Zh
Chinese translation of the official Apache Spark documentation
Stars: ✭ 1,126 (+1580.6%)
Mutual labels:  spark

kontextfrei


What is this?

This library enables you to write the business logic of your Spark application without depending on RDDs and the SparkContext.
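The general pattern can be sketched as follows. Note that the names below (`DCollOps`, `mapD`, `normalise`) are hypothetical and not kontextfrei's actual API — see the documentation for the real one. The idea is that business logic is written against an abstract type constructor, with a plain-`List` instance for fast tests and an RDD-backed instance for production:

```scala
// Hypothetical sketch of the kontextfrei idea: business logic depends on an
// abstract type constructor DColl[_] instead of on RDD directly.
trait DCollOps[DColl[_]] {
  def mapD[A, B](xs: DColl[A])(f: A => B): DColl[B]
  def filterD[A](xs: DColl[A])(p: A => Boolean): DColl[A]
}

object DCollOps {
  // Test instance backed by plain Lists -- no SparkContext needed.
  implicit val listOps: DCollOps[List] = new DCollOps[List] {
    def mapD[A, B](xs: List[A])(f: A => B): List[B]      = xs.map(f)
    def filterD[A](xs: List[A])(p: A => Boolean): List[A] = xs.filter(p)
  }
}

// Business logic is generic in DColl and therefore Spark-free:
def normalise[DColl[_]](words: DColl[String])(
    implicit ops: DCollOps[DColl]): DColl[String] =
  ops.filterD(ops.mapD(words)(_.trim.toLowerCase))(_.nonEmpty)

println(normalise(List(" Foo", "", "BAR ")))  // List(foo, bar)
```

In production, a second instance of the type class would delegate the same operations to `RDD`, so the identical `normalise` function runs on a real cluster.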

Motivation

Why would you want to do that? Because firing up a SparkContext and running your unit tests against a local Spark cluster is really slow. kontextfrei frees you from this hard dependency on a SparkContext, ultimately leading to a much faster feedback cycle during development.

Documentation

Please visit the kontextfrei website to learn more about how to use this library.

For an example that showcases how the library can be used, please have a look at kontextfrei-example.

Usage

The library is split up into two modules:

  • kontextfrei-core: You definitely need this to use the library
  • kontextfrei-scalatest: Some optional goodies to make testing your application logic easier and remove some boilerplate; this comes with a transitive dependency on ScalaTest and ScalaCheck.

kontextfrei assumes that the Spark dependency is provided by your application, so you have to explicitly add a dependency on Spark yourself.

Currently, kontextfrei binary releases are built against Spark 1.4.1, 2.0.2, 2.1.3, 2.2.3 (each of them both for Scala 2.11 and 2.10), 2.3.2 (compiled for Scala 2.11), and 2.4.0 (compiled both for Scala 2.11 and 2.12).

Adding a dependency on the current version of kontextfrei-core and kontextfrei-scalatest to your build.sbt looks like this:

Spark 1.4

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-1.4.1" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-1.4.1" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"

Spark 2.0

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.0.2" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.0.2" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.2" % "provided"

Spark 2.1

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.1.3" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.1.3" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.3" % "provided"

Spark 2.2

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.2.3" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.2.3" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.3" % "provided"

Spark 2.3

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.3.2" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.3.2" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.2" % "provided"

Spark 2.4

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.4.0" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.4.0" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0" % "provided"

As you can see, you need to specify the Spark version against which the library is built as part of the artifact name.
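Because the Spark version appears both in the kontextfrei artifact names and in the spark-core dependency, one way to keep them in sync (a build.sbt sketch, not taken from the official docs) is to factor it into a single setting:

```scala
// build.sbt sketch: one setting drives both the kontextfrei artifact suffix
// and the provided Spark dependency, so they cannot drift apart.
val sparkVersion = "2.4.0"

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"

libraryDependencies ++= Seq(
  "com.danielwestheide" %% s"kontextfrei-core-spark-$sparkVersion"      % "0.8.0",
  "com.danielwestheide" %% s"kontextfrei-scalatest-spark-$sparkVersion" % "0.8.0" % "test,it",
  "org.apache.spark"    %% "spark-core"                                 % sparkVersion % "provided"
)
```

Changing `sparkVersion` to any other supported release (e.g. "2.3.2") then updates every coordinate at once.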

Status

This library is in an early stage and is not feature-complete: only a subset of the operations available on RDDs is supported so far.

Contributors

Shout out to Simon J. Scott for his valuable contributions to this project.

Contributions welcome

Any kind of contribution is welcome. Please see the Contribution Guidelines.