Azure / Azure Kusto Spark

Licence: apache-2.0

Apache Spark Connector for Azure Kusto

Azure Data Explorer Connector for Apache Spark

This library contains the source code for Azure Data Explorer Data Source and Data Sink Connector for Apache Spark.

Azure Data Explorer (A.K.A. Kusto) is a lightning-fast indexing and querying service.

Spark is a unified analytics engine for large-scale data processing.

Making Azure Data Explorer and Spark work together enables building fast and scalable applications, targeting a variety of Machine Learning, Extract-Transform-Load, Log Analytics and other data driven scenarios.

Changelog

For the main changes in each release, please refer to Releases. For known or new issues, please refer to the issues section.

Usage

Linking

Starting with version 2.3.0, the connector is published under new artifact IDs: kusto-spark_3.x_2.12, targeting Spark 3.x and Scala 2.12, and kusto-spark_2.4_2.11, targeting Spark 2.4.x and Scala 2.11. For Scala/Java applications using Maven project definitions, link your application with the artifact below to use the Azure Data Explorer connector for Spark.

Note: Versions prior to 2.5.1 no longer work for ingestion into an existing table. Please update to the latest version.

groupId = com.microsoft.azure.kusto
artifactId = kusto-spark_3.0_2.12
version = 2.5.1
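The naming pattern above can be sketched as a small helper that assembles the full Maven coordinate from its parts. This is only an illustration of the convention; the function name and its parameters are hypothetical and not part of the connector.

```python
GROUP_ID = "com.microsoft.azure.kusto"


def kusto_spark_coordinate(spark_version: str, scala_version: str,
                           connector_version: str) -> str:
    """Build a groupId:artifactId:version coordinate.

    Assumes the artifact ID follows the kusto-spark_<spark>_<scala> pattern
    described above (hypothetical helper, for illustration only).
    """
    artifact_id = f"kusto-spark_{spark_version}_{scala_version}"
    return f"{GROUP_ID}:{artifact_id}:{connector_version}"


# Spark 3.0 with Scala 2.12, connector version 2.5.1
print(kusto_spark_coordinate("3.0", "2.12", "2.5.1"))
# prints "com.microsoft.azure.kusto:kusto-spark_3.0_2.12:2.5.1"
```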

In Maven:

Look for the following coordinates:

com.microsoft.azure.kusto:kusto-spark_3.0_2.12:2.5.1

Or clone this repository and build it locally to add it to your local Maven repository. The jar can also be found under the released package.

  <dependency>
    <groupId>com.microsoft.azure.kusto</groupId>
    <artifactId>kusto-spark_3.0_2.12</artifactId>
    <version>2.5.1</version>
  </dependency>

In Databricks:

Libraries -> Install New -> Maven -> copy the following coordinates:

com.microsoft.azure.kusto:kusto-spark_3.0_2.12:2.5.1

Building Samples Module

Samples are packaged as a separate module with the following artifact:

<artifactId>connector-samples</artifactId>

To build the whole project, which comprises the connector module and the samples module, use the following artifact:

<artifactId>azure-kusto-spark</artifactId>

Build Prerequisites

In order to use the connector, you need to have:

Java 1.8 SDK installed
Maven 3.x installed
Spark, in the version reflected by the artifact ID (either 2.4 or 3.0)

Note: When working with Spark 2.3 or lower, build the jar locally from the 2.4 branch and change the Spark version in the pom file.

Build Commands

# Builds jar and runs all tests
mvn clean package

# Builds jar, runs all tests, and installs jar to your local Maven repository
mvn clean install

Pre-Compiled Libraries

In order to facilitate ramp-up from local jar on platforms such as Azure Databricks, pre-compiled libraries are published under GitHub Releases. These libraries include:

Azure Data Explorer connector library
Users may also need to include the Kusto Java SDK libraries (kusto-data and kusto-ingest), which are also published under GitHub Releases

Dependencies

The Spark Azure Data Explorer connector depends on the Azure Data Explorer Data Client Library and the Azure Data Explorer Ingest Client Library, both available in the Maven repository. When Key Vault based authentication is used, there is an additional dependency on the Microsoft Azure SDK for Key Vault.

Note: When working with JARs, the Azure Data Explorer connector requires the Azure Data Explorer Java client libraries (and the Azure Key Vault library, if used) to be installed. To find the right versions to install, look in the relevant release's pom.
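Looking up those versions in a pom can be sketched with the Python standard library. The helper below is a minimal illustration, not project tooling: the function name is hypothetical, the sample pom and its version string are made-up placeholders, and it assumes the pom declares the standard Maven POM namespace and that any ${...} placeholders are defined in the pom's own properties section.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, Iterable

POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def dependency_versions(pom_xml: str, artifact_ids: Iterable[str]) -> Dict[str, str]:
    """Return {artifactId: version} for the requested dependencies in a pom."""
    wanted = set(artifact_ids)
    root = ET.fromstring(pom_xml.strip())

    # Collect <properties> so that ${name} placeholders can be resolved.
    props = {}
    props_el = root.find("m:properties", POM_NS)
    if props_el is not None:
        for child in props_el:
            name = child.tag.split("}", 1)[-1]  # drop the namespace prefix
            props[name] = (child.text or "").strip()

    versions = {}
    for dep in root.iterfind(".//m:dependency", POM_NS):
        artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS).strip()
        if artifact in wanted:
            raw = dep.findtext("m:version", default="", namespaces=POM_NS).strip()
            # Replace ${name} with its property value; leave unknown names as-is.
            versions[artifact] = re.sub(
                r"\$\{([^}]+)\}",
                lambda match: props.get(match.group(1), match.group(0)),
                raw,
            )
    return versions


# Made-up pom fragment; the version is a placeholder, not a real release.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <properties>
    <kusto.sdk.version>0.0.0-example</kusto.sdk.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.microsoft.azure.kusto</groupId>
      <artifactId>kusto-data</artifactId>
      <version>${kusto.sdk.version}</version>
    </dependency>
    <dependency>
      <groupId>com.microsoft.azure.kusto</groupId>
      <artifactId>kusto-ingest</artifactId>
      <version>${kusto.sdk.version}</version>
    </dependency>
  </dependencies>
</project>
"""

print(dependency_versions(SAMPLE_POM, ["kusto-data", "kusto-ingest"]))
```

To use it against a real release, read that release's pom file into a string and pass it in place of SAMPLE_POM.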

Documentation

Detailed documentation can be found here.

Samples

Usage examples can be found here.

Available Azure Data Explorer client libraries:

Here is a list of currently available client libraries for Azure Data Explorer:

For convenience, here is a PySpark sample for the connector.
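As a rough idea of what a PySpark read through the connector looks like, here is a minimal sketch. The helper names are hypothetical. The data source format string and the option keys are written as I understand them from the connector's documentation, so treat them as assumptions and confirm them against the detailed documentation for the version you use. The sketch also assumes an existing SparkSession with the connector library installed and AAD application authentication.

```python
from typing import Dict

# Assumed data source name for the connector; verify against the connector docs.
KUSTO_FORMAT = "com.microsoft.kusto.spark.datasource"


def kusto_read_options(cluster: str, database: str, query: str,
                       app_id: str, app_secret: str,
                       authority_id: str) -> Dict[str, str]:
    """Collect read options for the connector (option keys are assumptions)."""
    return {
        "kustoCluster": cluster,
        "kustoDatabase": database,
        "kustoQuery": query,  # a table name or a Kusto query
        "kustoAadAppId": app_id,
        "kustoAadAppSecret": app_secret,
        "kustoAadAuthorityID": authority_id,
    }


def read_from_kusto(spark, options: Dict[str, str]):
    """Load a DataFrame from Azure Data Explorer.

    `spark` is an existing SparkSession with the connector on its classpath.
    """
    return spark.read.format(KUSTO_FORMAT).options(**options).load()
```

In a notebook, this would be called as read_from_kusto(spark, kusto_read_options(...)) with your own cluster, database, and application credentials. Keeping the options in a plain dictionary makes it easy to load the secret from a secret store rather than hard-coding it.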

Need Support?

Have a feature request for SDKs? Please post it on User Voice to help us prioritize.
Have a technical question? Ask on Stack Overflow with the tag "azure-data-explorer".
Need support? Every customer with an active Azure subscription has access to support with a guaranteed response time. Consider submitting a ticket to get assistance from the Microsoft support team.
Found a bug? Please help us fix it by thoroughly documenting it and filing an issue.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
