Licence: apache-2.0
Apache Spark Website

Generating the website HTML

In this directory you will find text files formatted using Markdown, with an .md suffix.

Building the site requires Jekyll and Rouge. The easiest way to install the right versions of these tools is to use Bundler and run bundle install in this directory.

See also https://github.com/apache/spark/blob/master/docs/README.md

A site build updates the directories and files in the site directory with the generated files. Running Jekyll via bundle exec jekyll locks it to the right version. After installing the gems, you can generate the HTML website by running bundle exec jekyll build in this directory. Add the --watch flag to have Jekyll recompile your files as you save changes.

In addition to generating the site as HTML from the Markdown files, Jekyll can serve the site via a web server. To build the site and run a web server, use the command bundle exec jekyll serve, which runs the web server on port 4000. Then visit the site at http://localhost:4000.
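The build and serve commands above can be wrapped in a small helper if you prefer to drive them from a script. This is only a sketch: the function names jekyll_command and run_jekyll are hypothetical and not part of this repository, and it assumes Bundler and the gems are already installed.

```python
import subprocess


def jekyll_command(subcommand, watch=False):
    """Return the argument list for running Jekyll through Bundler.

    Hypothetical helper: only the two subcommands described in this
    README ("build" and "serve") are accepted.
    """
    if subcommand not in ("build", "serve"):
        raise ValueError(f"unsupported subcommand: {subcommand}")
    args = ["bundle", "exec", "jekyll", subcommand]
    if watch:
        # --watch asks Jekyll to regenerate files when sources change.
        args.append("--watch")
    return args


def run_jekyll(subcommand, watch=False):
    """Run Jekyll in the current directory and raise if it fails."""
    subprocess.run(jekyll_command(subcommand, watch), check=True)
```

For example, run_jekyll("build") would invoke bundle exec jekyll build in the current working directory.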

Please make sure you always run bundle exec jekyll build after testing your changes with bundle exec jekyll serve; otherwise you end up with broken links in a few places.
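One way to catch a forgotten rebuild is to scan the generated files for local-server addresses before committing. The sketch below rests on an assumption this README does not state: that the broken links come from jekyll serve writing http://localhost:4000 URLs into the output. The function name files_with_local_urls and the choice of .html and .xml files are illustrative only.

```python
from pathlib import Path


def files_with_local_urls(site_dir, marker="localhost:4000"):
    """Return sorted relative paths of generated files that mention `marker`.

    Assumption: links left over from `jekyll serve` contain the local
    server address. Only .html and .xml files are inspected.
    """
    root = Path(site_dir)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in {".html", ".xml"}:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if marker in text:
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Calling files_with_local_urls("site") from this directory lists the files worth a second look. A hit is a hint rather than proof, since a page may legitimately mention that address.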

Updating Jekyll version

To update Jekyll or any other gem, please follow these steps:

  1. Update the version in the Gemfile.
  2. Run bundle update, which updates Gemfile.lock.
  3. Commit both files.

Docs sub-dir

The docs are not generated as part of the website. They are built separately for each release of Spark from the Spark source repository and then copied to the website under the docs directory. For build instructions, see the README in the Spark project's /docs directory.

Rouge and Pygments

We use Rouge for syntax highlighting in documentation Markdown pages. Its HTML output is compatible with CSS files designed for Pygments.

To mark a block of code in your Markdown to be syntax highlighted by Jekyll during the compile phase, use the following syntax:

{% highlight scala %}
// Your scala code goes here, you can replace scala with many other
// supported languages too.
{% endhighlight %}

You probably don't need to install Pygments unless you want to regenerate the Pygments CSS file. Pygments requires Python and can be installed by running sudo easy_install Pygments.

Merge PR

To merge a pull request, use the merge_pr.py script, which also squashes the commits.
