456 open source projects that are alternatives to or similar to spark-extension

Listenbrainz Server
Server for the ListenBrainz project
Stars: ✭ 420 (+1580%)
Mutual labels:  spark
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+1552%)
Mutual labels:  spark
Hadoop Docker
A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Stars: ✭ 238 (+852%)
Mutual labels:  spark
Marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
Stars: ✭ 414 (+1556%)
Mutual labels:  spark
Teddy
Spark Streaming monitoring platform with support for job deployment, alerting, and auto-start
Stars: ✭ 120 (+380%)
Mutual labels:  spark
Luigi Warehouse
A luigi powered analytics / warehouse stack
Stars: ✭ 72 (+188%)
Mutual labels:  spark
machine-learning-course
Machine Learning Course @ Santa Clara University
Stars: ✭ 17 (-32%)
Mutual labels:  pyspark
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (+1500%)
Mutual labels:  spark
Elassandra
Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+6340%)
Mutual labels:  spark
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+80488%)
Mutual labels:  spark
Mastering Spark Sql Book
The Internals of Spark SQL
Stars: ✭ 234 (+836%)
Mutual labels:  spark
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+15152%)
Mutual labels:  spark
Spark Lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (+356%)
Mutual labels:  spark
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+1388%)
Mutual labels:  spark
Casper
A compiler for automatically re-targeting sequential Java code to Apache Spark.
Stars: ✭ 45 (+80%)
Mutual labels:  spark
Sparkmeasure
This is the development repository of SparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
Stars: ✭ 368 (+1372%)
Mutual labels:  spark
Spark Mllib Twitter Sentiment Analysis
🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
Stars: ✭ 113 (+352%)
Mutual labels:  spark
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+1352%)
Mutual labels:  spark
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+808%)
Mutual labels:  spark
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (+1348%)
Mutual labels:  spark
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (+348%)
Mutual labels:  spark
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+1272%)
Mutual labels:  spark
DataEngineering
This repo contains commands that data engineers use in day to day work.
Stars: ✭ 47 (+88%)
Mutual labels:  pyspark
Scalnet
A Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs
Stars: ✭ 342 (+1268%)
Mutual labels:  spark
Elephas
Distributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+5984%)
Mutual labels:  spark
Ytk Learn
Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Stars: ✭ 337 (+1248%)
Mutual labels:  spark
Spark Workshop
Apache Spark™ and Scala Workshops
Stars: ✭ 224 (+796%)
Mutual labels:  spark
Sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+1164%)
Mutual labels:  spark
Waterdrop
Production Ready Data Integration Product
Stars: ✭ 1,856 (+7324%)
Mutual labels:  spark
Clickhouse Native Jdbc
ClickHouse Native Protocol JDBC implementation
Stars: ✭ 310 (+1140%)
Mutual labels:  spark
Search Ads Web Service
Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]
Stars: ✭ 30 (+20%)
Mutual labels:  spark
Learningsparkv2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (+1128%)
Mutual labels:  spark
Bigdataclass
Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (+340%)
Mutual labels:  spark
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+15512%)
Mutual labels:  spark
Sagemaker Spark
A Spark library for Amazon SageMaker.
Stars: ✭ 219 (+776%)
Mutual labels:  spark
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (+1112%)
Mutual labels:  spark
Distributed Dataset
A distributed data processing framework in Haskell.
Stars: ✭ 108 (+332%)
Mutual labels:  spark
Springboard-Data-Science-Immersive
No description or website provided.
Stars: ✭ 52 (+108%)
Mutual labels:  pyspark
pyspark-ML-in-Colab
Pyspark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (+28%)
Mutual labels:  pyspark
Spark Structured Streaming Examples
Spark Structured Streaming / Kafka / Cassandra / Elastic
Stars: ✭ 168 (+572%)
Mutual labels:  spark
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+184%)
Mutual labels:  spark
Spark Notebook
Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+12224%)
Mutual labels:  spark
Cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+1012%)
Mutual labels:  spark
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (+988%)
Mutual labels:  spark
Seldon Server
Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+5640%)
Mutual labels:  spark
Docker Spark Cluster
A simple Spark standalone cluster for your testing environment purposes
Stars: ✭ 261 (+944%)
Mutual labels:  spark
BigData-News
A real-time big data system project for a news website, based on Spark 2.2
Stars: ✭ 36 (+44%)
Mutual labels:  spark
Sk Dist
Distributed scikit-learn meta-estimators in PySpark
Stars: ✭ 260 (+940%)
Mutual labels:  spark
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+7020%)
Mutual labels:  spark
Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (+928%)
Mutual labels:  spark
Hydro Serving
MLOps Platform
Stars: ✭ 213 (+752%)
Mutual labels:  spark
Usersessionbehaviorofflineanalysis
Offline analysis project for user session behavior data (Sichuan University, 拓思爱诺)
Stars: ✭ 69 (+176%)
Mutual labels:  spark
Fast Mrmr
An improved implementation of the classical feature selection method: minimum Redundancy and Maximum Relevance (mRMR).
Stars: ✭ 67 (+168%)
Mutual labels:  spark
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (+320%)
Mutual labels:  spark
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-44%)
Mutual labels:  spark
pyspark-asyncactions
Asynchronous actions for PySpark
Stars: ✭ 30 (+20%)
Mutual labels:  pyspark
Kontextfrei
Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (+168%)
Mutual labels:  spark
docker-spark
Apache Spark docker container image (Standalone mode)
Stars: ✭ 34 (+36%)
Mutual labels:  spark
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (+280%)
Mutual labels:  spark
pyspark-for-data-processing
Code for my presentation: Using PySpark to Process Boat Loads of Data
Stars: ✭ 20 (-20%)
Mutual labels:  pyspark