Licence: apache-2.0
Writing application logic for Spark jobs that can be unit-tested without a SparkContext

Projects that are alternatives of or similar to Kontextfrei

Pulsar Spark
When Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-17.91%)
Mutual labels:  spark
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-10.45%)
Mutual labels:  spark
Pyspark Twitter Stream Mining
Real-time Machine Learning with Apache Spark on Twitter Public Stream
Stars: ✭ 64 (-4.48%)
Mutual labels:  spark
Awesome Pulsar
A curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-14.93%)
Mutual labels:  spark
Pyspark Examples
Code examples on Apache Spark using python
Stars: ✭ 58 (-13.43%)
Mutual labels:  spark
Silex
something to help you spark
Stars: ✭ 61 (-8.96%)
Mutual labels:  spark
Utils4s
A collection of test cases and reference materials for working with Scala and Spark
Stars: ✭ 1,070 (+1497.01%)
Mutual labels:  spark
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-2.99%)
Mutual labels:  spark
Zemberek Nlp Server
A REST Docker server built on the Zemberek Turkish NLP Java library
Stars: ✭ 60 (-10.45%)
Mutual labels:  spark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-5.97%)
Mutual labels:  spark
Model Serving Tutorial
Code and presentation for Strata Model Serving tutorial
Stars: ✭ 57 (-14.93%)
Mutual labels:  spark
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-13.43%)
Mutual labels:  spark
Roffildlibrary
Library for MQL5 (MetaTrader) with Python, Java, Apache Spark, AWS
Stars: ✭ 63 (-5.97%)
Mutual labels:  spark
Net.jgp.labs.spark
Apache Spark examples exclusively in Java
Stars: ✭ 55 (-17.91%)
Mutual labels:  spark
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-4.48%)
Mutual labels:  spark
Docker Hadoop
A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-19.4%)
Mutual labels:  spark
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-10.45%)
Mutual labels:  spark
Thingsboard
Open-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+15610.45%)
Mutual labels:  spark
Spark Bigquery
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-2.99%)
Mutual labels:  spark
Spark Doc Zh
Chinese translation of the official Apache Spark documentation
Stars: ✭ 1,126 (+1580.6%)
Mutual labels:  spark

kontextfrei


What is this?

This library enables you to write the business logic of your Spark application without depending on RDDs and the SparkContext.
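The general pattern can be sketched as follows. Note that the names below (`DCollOps`, `mapD`, `normalise`) are hypothetical and not kontextfrei's actual API — see the documentation for the real one. The idea is that business logic is written against an abstract type constructor, with a plain-`List` instance for fast tests and an RDD-backed instance for production:

```scala
// Hypothetical sketch of the kontextfrei idea: business logic depends on an
// abstract type constructor DColl[_] instead of on RDD directly.
trait DCollOps[DColl[_]] {
  def mapD[A, B](xs: DColl[A])(f: A => B): DColl[B]
  def filterD[A](xs: DColl[A])(p: A => Boolean): DColl[A]
}

object DCollOps {
  // Test instance backed by plain Lists -- no SparkContext needed.
  implicit val listOps: DCollOps[List] = new DCollOps[List] {
    def mapD[A, B](xs: List[A])(f: A => B): List[B]      = xs.map(f)
    def filterD[A](xs: List[A])(p: A => Boolean): List[A] = xs.filter(p)
  }
}

// Business logic is generic in DColl and therefore Spark-free:
def normalise[DColl[_]](words: DColl[String])(
    implicit ops: DCollOps[DColl]): DColl[String] =
  ops.filterD(ops.mapD(words)(_.trim.toLowerCase))(_.nonEmpty)

println(normalise(List(" Foo", "", "BAR ")))  // List(foo, bar)
```

In production, a second instance of the type class would delegate the same operations to `RDD`, so the identical `normalise` function runs on a real cluster.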

Motivation

Why would you want to do that? Because firing up a SparkContext and running your unit tests against a local Spark cluster is really slow. kontextfrei frees you from this hard dependency on a SparkContext, ultimately leading to a much faster feedback cycle during development.

Documentation

Please visit the kontextfrei website to learn more about how to use this library.

For an example that showcases how the library can be used, please have a look at kontextfrei-example.

Usage

The library is split up into two modules:

  • kontextfrei-core: You definitely need this to use the library
  • kontextfrei-scalatest: Some optional goodies to make testing your application logic easier and remove some boilerplate; this comes with a transitive dependency on ScalaTest and ScalaCheck.

kontextfrei assumes that the Spark dependency is provided by your application, so you have to explicitly add a dependency on Spark yourself.

Currently, kontextfrei binary releases are built against Spark 1.4.1, 2.0.2, 2.1.3, 2.2.3 (each of them both for Scala 2.11 and 2.10), 2.3.2 (compiled for Scala 2.11), and 2.4.0 (compiled both for Scala 2.11 and 2.12).

Adding a dependency on the current version of kontextfrei-core and kontextfrei-scalatest to your build.sbt looks like this:

Spark 1.4

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-1.4.1" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-1.4.1" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"

Spark 2.0

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.0.2" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.0.2" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.2" % "provided"

Spark 2.1

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.1.3" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.1.3" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.3" % "provided"

Spark 2.2

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.2.3" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.2.3" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.3" % "provided"

Spark 2.3

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.3.2" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.3.2" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.2" % "provided"

Spark 2.4

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-core-spark-2.4.0" % "0.8.0"
libraryDependencies += "com.danielwestheide" %% "kontextfrei-scalatest-spark-2.4.0" % "0.8.0" % "test,it"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0" % "provided"

As you can see, you need to specify the Spark version against which the library is built as part of the artifact name.
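Because the Spark version appears both in the kontextfrei artifact names and in the spark-core dependency, one way to keep them in sync (a build.sbt sketch, not taken from the official docs) is to factor it into a single setting:

```scala
// build.sbt sketch: one setting drives both the kontextfrei artifact suffix
// and the provided Spark dependency, so they cannot drift apart.
val sparkVersion = "2.4.0"

resolvers += "dwestheide" at "https://dl.bintray.com/dwestheide/maven"

libraryDependencies ++= Seq(
  "com.danielwestheide" %% s"kontextfrei-core-spark-$sparkVersion"      % "0.8.0",
  "com.danielwestheide" %% s"kontextfrei-scalatest-spark-$sparkVersion" % "0.8.0" % "test,it",
  "org.apache.spark"    %% "spark-core"                                 % sparkVersion % "provided"
)
```

Changing `sparkVersion` to any other supported release (e.g. "2.3.2") then updates every coordinate at once.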

Status

This library is in an early stage and is not feature-complete: only a subset of the operations available on RDDs is supported so far.

Contributors

Shout out to Simon J. Scott for his valuable contributions to this project.

Contributions welcome

Any kind of contribution is welcome. Please see the Contribution Guidelines.