Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+751.72%)

Mutual labels: spark

Big Data Rosetta Code

Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code

Stars: ✭ 254 (+775.86%)

Mutual labels: spark

Neo4j Spark Connector

Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs

Stars: ✭ 245 (+744.83%)

Mutual labels: spark

High Performance Spark Examples

Examples for High Performance Spark

Stars: ✭ 436 (+1403.45%)

Mutual labels: spark

Recommendationsystem

Book recommender system using collaborative filtering based on Spark

Stars: ✭ 244 (+741.38%)

Mutual labels: spark

laravel-spark-camera

Profile Photo Camera support for Laravel Spark

Stars: ✭ 30 (+3.45%)

Mutual labels: spark

Hadoop Docker

基于Docker构建的Hadoop开发测试环境，包含Hadoop，Hive，HBase，Spark

Stars: ✭ 238 (+720.69%)

Mutual labels: spark

Angel

A Flexible and Powerful Parameter Server for large-scale machine learning

Stars: ✭ 6,458 (+22168.97%)

Mutual labels: spark

Mastering Spark Sql Book

The Internals of Spark SQL

Stars: ✭ 234 (+706.9%)

Mutual labels: spark

Book

本项目收藏这些年来看过或者听过的一些不错的书籍，在整理文件时看见这些，发现删掉有点可惜，放着又太浪费空间，本着分享的原则，就把它们共享出来，一方面给需要的读者提供这些书籍，另一方面也是一种像知识库的积累吧

Stars: ✭ 47 (+62.07%)

Mutual labels: spark

Mydatascienceportfolio

Applying Data Science and Machine Learning to Solve Real World Business Problems

Stars: ✭ 227 (+682.76%)

Mutual labels: spark

Dji Firmware Tools

Tools for handling firmwares of DJI products, with focus on quadcopters.

Stars: ✭ 424 (+1362.07%)

Mutual labels: spark

Spark Workshop

Apache Spark™ and Scala Workshops

Stars: ✭ 224 (+672.41%)

Mutual labels: spark

spark-http-stream

spark structured streaming via HTTP communication

Stars: ✭ 17 (-41.38%)

Mutual labels: spark

Sagemaker Spark

A Spark library for Amazon SageMaker.

Stars: ✭ 219 (+655.17%)

Mutual labels: spark

Chronicler

Scala toolchain for InfluxDB

Stars: ✭ 24 (-17.24%)

Mutual labels: spark

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (+644.83%)

Mutual labels: spark

daf-kylo

Kylo integration with PDND (previously DAF).

Stars: ✭ 20 (-31.03%)

Mutual labels: spark

Hydro Serving

MLOps Platform

Stars: ✭ 213 (+634.48%)

Mutual labels: spark

Featran

A Scala feature transformation library for data science and machine learning

Stars: ✭ 420 (+1348.28%)

Mutual labels: spark

Spark Knn

k-Nearest Neighbors algorithm on Spark

Stars: ✭ 205 (+606.9%)

Mutual labels: spark

Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Stars: ✭ 70 (+141.38%)

Mutual labels: spark

Mmlspark

Simple and Distributed Machine Learning

Stars: ✭ 2,899 (+9896.55%)

Mutual labels: spark

Spark Movie Lens

An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Stars: ✭ 745 (+2468.97%)

Mutual labels: spark

Ballista

Distributed compute platform implemented in Rust, and powered by Apache Arrow.

Stars: ✭ 2,274 (+7741.38%)

Mutual labels: spark

spark-data-sources

Developing Spark External Data Sources using the V2 API

Stars: ✭ 36 (+24.14%)

Mutual labels: spark

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (+544.83%)

Mutual labels: spark

Listenbrainz Server

Server for the ListenBrainz project

Stars: ✭ 420 (+1348.28%)

Mutual labels: spark

Kotlin Spark Api

This projects gives Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

Stars: ✭ 183 (+531.03%)

Mutual labels: spark

Spark Streaming With Kafka

Self-contained examples of Apache Spark streaming integrated with Apache Kafka.

Stars: ✭ 180 (+520.69%)

Mutual labels: spark

Sparkling Titanic

Training models with Apache Spark, PySpark for Titanic Kaggle competition

Stars: ✭ 12 (-58.62%)

Mutual labels: spark

Xsql

Unified SQL Analytics Engine Based on SparkSQL

Stars: ✭ 176 (+506.9%)

Mutual labels: spark

Covid19Tracker

A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.

Stars: ✭ 65 (+124.14%)

Mutual labels: spark

Kraps Rpc

A RPC framework leveraging Spark RPC module

Stars: ✭ 175 (+503.45%)

Mutual labels: spark

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+1324.14%)

Mutual labels: spark

Spark Nlp

State of the Art Natural Language Processing

Stars: ✭ 2,518 (+8582.76%)

Mutual labels: spark

SparkV

🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.

Stars: ✭ 24 (-17.24%)

Mutual labels: spark

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+7086.21%)

Mutual labels: spark

Cdhproject

hadoop各组件使用，持续更新

Stars: ✭ 733 (+2427.59%)

Mutual labels: spark

odbc2parquet

A command line tool to query an ODBC data source and write the result into a parquet file.

Stars: ✭ 95 (+227.59%)

Mutual labels: parquet

Pystore

Fast data store for Pandas time-series data

Stars: ✭ 325 (+1020.69%)

Mutual labels: parquet

IMCtermite

Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats

Stars: ✭ 20 (-31.03%)

Mutual labels: parquet

databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Stars: ✭ 19 (-34.48%)

Mutual labels: parquet

Sparklearning

Learning Apache spark,including code and data .Most part can run local.

Stars: ✭ 558 (+1824.14%)

Mutual labels: spark

Sparklint

A tool for monitoring and tuning Spark jobs for efficiency.