dotnet / Spark

License: MIT
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Programming Languages

C#, Scala, PowerShell, Shell, Python, CMake

Projects that are alternatives to or similar to Spark

Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-91.87%)
Mutual labels:  azure, microsoft, spark, bigdata, apache-spark, spark-streaming, streaming
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+68.45%)
Mutual labels:  azure, microsoft, spark, apache-spark, databricks
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (-46.02%)
Mutual labels:  spark, bigdata, apache-spark, spark-streaming, streaming
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (-85.65%)
Mutual labels:  azure, spark, apache-spark, spark-streaming, streaming
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (-70.19%)
Mutual labels:  spark, analytics, spark-streaming, streaming
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+1070.66%)
Mutual labels:  spark, analytics, databricks, spark-sql
Azure Event Hubs
☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
Stars: ✭ 233 (-86.46%)
Mutual labels:  azure, microsoft, spark, streaming
Coolplayspark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
Stars: ✭ 3,318 (+92.79%)
Mutual labels:  spark, apache-spark, spark-streaming
Storage
💿 Storage abstractions with implementations for .NET/.NET Standard
Stars: ✭ 380 (-77.92%)
Mutual labels:  azure, dotnet-standard, dotnet-core
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (-76.76%)
Mutual labels:  spark, analytics, bigdata
Sparkle
Haskell on Apache Spark.
Stars: ✭ 419 (-75.65%)
Mutual labels:  spark, analytics, apache-spark
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-98.66%)
Mutual labels:  spark, apache-spark, spark-sql
leaflet heatmap
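A simple visualization of call data from Huzhou. Assuming the data volume is too large for a browser to draw the heatmap directly, the heatmap rendering step is moved to offline computation. After the data is computed in parallel with Apache Spark, the heatmap is drawn with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The rendering currently runs on Apache Spark, but either Apache Spark is not well suited to this kind of computation or my algorithm is poorly designed, because the parallel computation is slower than the single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .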
Stars: ✭ 13 (-99.24%)
Mutual labels:  spark, apache-spark, bigdata
SparkProgrammingInScala
Apache Spark Course Material
Stars: ✭ 57 (-96.69%)
Mutual labels:  apache-spark, bigdata, spark-sql
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+94.94%)
Mutual labels:  microsoft, apache-spark, databricks
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (-76%)
Mutual labels:  spark, analytics, apache-spark
Streaming Readings
Papers and readings related to streaming systems
Stars: ✭ 554 (-67.81%)
Mutual labels:  apache-spark, spark-streaming, streaming
Spark Streaming Monitoring With Lightning
Plot live stats as a graph from an Apache Spark application using Lightning-viz
Stars: ✭ 15 (-99.13%)
Mutual labels:  bigdata, apache-spark, spark-streaming
Live log analyzer spark
Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-99.19%)
Mutual labels:  spark, analytics, apache-spark
Real Time Stream Processing Engine
This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.
Stars: ✭ 37 (-97.85%)
Mutual labels:  spark, apache-spark, spark-streaming

.NET for Apache® Spark™

.NET for Apache Spark provides high-performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular DataFrame and Spark SQL features of Apache Spark for working with structured data, and Spark Structured Streaming for working with streaming data.

.NET for Apache Spark is compliant with .NET Standard, a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code, allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer.

.NET for Apache Spark runs on Windows, Linux, and macOS using .NET Core, and on Windows using .NET Framework. It also runs on all major cloud providers, including Azure HDInsight Spark, Amazon EMR Spark, and Databricks on AWS and Azure.

Note: We currently have a Spark Project Improvement Proposal (SPIP) JIRA, "SPIP: .NET bindings for Apache Spark", to work with the community toward getting .NET support into Apache Spark by default. We highly encourage you to participate in the discussion.

Supported Apache Spark

Apache Spark    .NET for Apache Spark
2.4*            v2.0.0
3.0             v2.0.0
3.1             v2.0.0

*2.4.2 is not supported.

Releases

.NET for Apache Spark releases are available on the project's releases page, and packages are available on NuGet.

Get Started

These instructions will show you how to run a .NET for Apache Spark app using .NET Core.
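
As a rough idea of what such an app can look like, here is a minimal word-count sketch using the DataFrame API. It assumes the Microsoft.Spark NuGet package is referenced by the project; the namespace `MySparkApp`, the app name `word_count_sample`, and the input file `input.txt` are placeholders chosen for illustration, not names taken from this README. An app like this is normally launched through `spark-submit` rather than run directly, so treat the snippet as a starting point rather than a complete setup.

```csharp
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

namespace MySparkApp // hypothetical namespace
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create (or reuse) a Spark session; the app name is a placeholder.
            SparkSession spark = SparkSession
                .Builder()
                .AppName("word_count_sample")
                .GetOrCreate();

            // Read a text file into a DataFrame with a single "value" column,
            // one row per line. "input.txt" is an assumed local file.
            DataFrame lines = spark.Read().Text("input.txt");

            // Split each line on spaces, emit one row per word,
            // count occurrences, and sort by count in descending order.
            DataFrame wordCounts = lines
                .Select(Split(Col("value"), " ").Alias("words"))
                .Select(Explode(Col("words")).Alias("word"))
                .GroupBy("word")
                .Count()
                .OrderBy(Col("count").Desc());

            // Print the first rows of the result to the console.
            wordCounts.Show();

            // Release the session's resources.
            spark.Stop();
        }
    }
}
```

The chained `Select`, `GroupBy`, and `OrderBy` calls mirror the DataFrame operations available in Spark's other language bindings, so queries written for Scala or PySpark usually translate with little more than a change of casing.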

Building from Source

Building from source is very easy and the whole process (from cloning to being able to run your app) should take less than 15 minutes!

Instructions

  • Windows
  • Ubuntu

Samples

There are two types of samples/apps in the .NET for Apache Spark repo:

  • Getting Started - .NET for Apache Spark code focused on simple and minimalistic scenarios.

  • End-to-end apps/scenarios - Real-world examples of industry-standard benchmarks, use cases, and business applications implemented using .NET for Apache Spark.

We welcome contributions to both categories!

Analytics Scenarios

DataFrames and Spark SQL
Simple code snippets to help you get familiar with the programmability experience of .NET for Apache Spark.

  • Basic (C#, F#) - Getting Started

Structured Streaming
Code snippets that show how to use Apache Spark's Structured Streaming (2.3.1, 2.3.2, 2.4.1, Latest).

  • Word Count (C#, F#) - Getting Started
  • Windowed Word Count (C#, F#) - Getting Started
  • Word Count on data from Kafka (C#, F#) - Getting Started

TPC-H Queries
Code that shows how to author complex queries using .NET for Apache Spark.

  • TPC-H Functional (C#) - End-to-end app
  • TPC-H SparkSQL (C#) - End-to-end app

Contributing

We welcome contributions! Please review our contribution guide.

Inspiration and Special Thanks

This project would not have been possible without the outstanding work from the following communities:

  • Apache Spark: Unified Analytics Engine for Big Data, the underlying backend execution engine for .NET for Apache Spark.
  • Mobius: C# and F# language binding and extensions to Apache Spark, a precursor project to .NET for Apache Spark from the same Microsoft group.
  • PySpark: Python bindings for Apache Spark, one of the implementations .NET for Apache Spark derives inspiration from.
  • SparkR: one of the implementations .NET for Apache Spark derives inspiration from.
  • Apache Arrow: A cross-language development platform for in-memory data. This library provides .NET for Apache Spark with efficient ways to transfer column-major data between the JVM and .NET CLR.
  • Pyrolite: Java and .NET interface to Python's pickle and Pyro protocols. This library provides .NET for Apache Spark with efficient ways to transfer row-major data between the JVM and .NET CLR.
  • Databricks: Unified analytics platform. Many thanks for all their suggestions toward making .NET for Apache Spark run on Azure and AWS Databricks.

How to Engage, Contribute and Provide Feedback

The .NET for Apache Spark team encourages contributions, both issues and PRs. The first step is to find an existing issue you want to contribute to or, if you cannot find one, to open a new issue.

.NET Foundation

The .NET for Apache Spark project is part of the .NET Foundation.

Code of Conduct

This project has adopted the code of conduct defined by the Contributor Covenant to clarify expected behavior in our community. For more information, see the .NET Foundation Code of Conduct.

License

.NET for Apache Spark is licensed under the MIT license.
