sparkProjectTemplate
A Giter8 template for Scala Spark Projects.
What this gives you
This template bootstraps a new Spark project with everyone's "favourite" wordcount example (modified to filter out stop words). You can then replace the wordcount example as desired and customize the Spark components your project needs.
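For orientation, a stop-word-aware word count over an RDD might look roughly like the sketch below. This is not the code the template generates: the object name `StopWordCountSketch`, the method name `countWords`, and the whitespace-only tokenization are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical object name; the generated project's own classes may differ.
object StopWordCountSketch {

  /** Counts words in `lines`, ignoring case and skipping any word in `stopWords`. */
  def countWords(lines: RDD[String], stopWords: Set[String]): RDD[(String, Long)] = {
    val normalizedStopWords = stopWords.map(_.trim.toLowerCase)
    lines
      .flatMap(_.split("\\s+"))   // split each line on whitespace (an assumption)
      .map(_.trim.toLowerCase)    // normalize tokens so "The" and "the" match
      .filter(word => word.nonEmpty && !normalizedStopWords.contains(word))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)         // sum the counts per word
  }

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark in-process, which is enough for a quick experiment.
    val conf = new SparkConf().setMaster("local[*]").setAppName("stop-word-count-sketch")
    val sc = new SparkContext(conf)
    try {
      val lines = sc.parallelize(Seq("the quick brown fox", "jumps over the lazy dog"))
      countWords(lines, stopWords = Set("the")).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Keeping the counting logic in a function that takes and returns RDDs, separate from the code that creates the SparkContext, makes it straightforward to call from both a local runner and a cluster entry point.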
To encourage good software development practice, the generated project starts at 100% code coverage (i.e. one test :p). While this is expected to decrease, we hope you use the provided spark-testing-base library or a similar option.
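As a rough idea of what a test built on spark-testing-base can look like, here is a minimal sketch. It assumes ScalaTest 3.0.x, where the base class is `org.scalatest.FunSuite` (newer ScalaTest versions use `org.scalatest.funsuite.AnyFunSuite` instead). The class name and the inline filtering logic are hypothetical and are not the test shipped with the template.

```scala
import com.holdenkarau.spark.testing.SharedSparkContext
import org.scalatest.FunSuite

// Hypothetical test class; SharedSparkContext supplies a local SparkContext as `sc`.
class StopWordFilterSketchTest extends FunSuite with SharedSparkContext {

  test("stop words are excluded from the counts") {
    val stopWords = Set("the")
    val lines = sc.parallelize(Seq("the cat saw the hat"))

    val counts = lines
      .flatMap(_.split(" "))                      // tokenize on single spaces
      .filter(word => !stopWords.contains(word))  // drop stop words
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collectAsMap()                             // small result, safe to bring to the driver

    assert(counts === Map("cat" -> 1, "saw" -> 1, "hat" -> 1))
  }
}
```

Mixing in `SharedSparkContext` means the suite does not need to create or stop its own SparkContext.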
Creating a new project from this template
Have g8 installed? You can run it with:
g8 holdenk/sparkProjectTemplate --name=projectname --organization=com.my.org --sparkVersion=2.2.0
Using sbt (0.13.13+), just run:
sbt new holdenk/sparkProjectTemplate.g8
Executing the created project
First go to the project you created:
cd projectname
You can test the example Spark job included in this template locally, directly from sbt:
sbt "run inputFile.txt outputFile.txt"
Then choose CountingLocalApp when prompted.
You can also assemble a fat jar (see sbt-assembly for configuration details):
sbt assembly
Then submit as usual to your Spark cluster:
/path/to/spark-home/bin/spark-submit \
--class <package-name>.CountingApp \
--name the_awesome_app \
--master <master url> \
./target/scala-2.11/<jar name> \
<input file> <output file>
Related
Want to build your application using the Spark Job Server? The spark-jobserver.g8 template can help you get started too.
License
This project is available under your choice of Apache 2 or CC0 1.0. See https://www.apache.org/licenses/LICENSE-2.0 or https://creativecommons.org/publicdomain/zero/1.0/ respectively. This template is distributed without any warranty.