666 open source projects that are alternatives to or similar to Airflow Pipeline

Base
https://www.researchgate.net/profile/Rajah_Iyer
Stars: ✭ 48 (-62.5%)
Mutual labels:  hadoop
Sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (+146.88%)
Mutual labels:  spark
Elephas
Distributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+1088.28%)
Mutual labels:  spark
Cook
Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark
Stars: ✭ 314 (+145.31%)
Mutual labels:  spark
Spark As Service Using Embedded Server
This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server
Stars: ✭ 46 (-64.06%)
Mutual labels:  spark
Hadoop Book
Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White
Stars: ✭ 3,317 (+2491.41%)
Mutual labels:  hadoop
Ammonite Spark
Run spark calculations from Ammonite
Stars: ✭ 88 (-31.25%)
Mutual labels:  spark
Learningsparkv2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (+139.84%)
Mutual labels:  spark
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+700.78%)
Mutual labels:  hadoop
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (+0%)
Mutual labels:  hadoop
Dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-4.69%)
Mutual labels:  hadoop
Asakusafw
Asakusa Framework
Stars: ✭ 114 (-10.94%)
Mutual labels:  hadoop
Spark Terasort
Spark Terasort
Stars: ✭ 101 (-21.09%)
Mutual labels:  spark
Src
A light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (-47.66%)
Mutual labels:  hadoop
Dev Setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+4267.19%)
Mutual labels:  spark
Airflow Tutorial
Airflow basics tutorial
Stars: ✭ 305 (+138.28%)
Mutual labels:  airflow
Spark Examples
Spark examples
Stars: ✭ 41 (-67.97%)
Mutual labels:  spark
Awesome Ada
A curated list of awesome resources related to the Ada and SPARK programming language
Stars: ✭ 299 (+133.59%)
Mutual labels:  spark
Spark python ml examples
Spark 2.0 Python Machine Learning examples
Stars: ✭ 87 (-32.03%)
Mutual labels:  spark
Azure Kusto Spark
Apache Spark Connector for Azure Kusto
Stars: ✭ 40 (-68.75%)
Mutual labels:  spark
Spark Hbase Connector
Connect Spark to HBase for reading and writing data with ease
Stars: ✭ 299 (+133.59%)
Mutual labels:  spark
Whirl
Fast iterative local development and testing of Apache Airflow workflows
Stars: ✭ 111 (-13.28%)
Mutual labels:  airflow
Behemoth
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Stars: ✭ 286 (+123.44%)
Mutual labels:  hadoop
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+679.69%)
Mutual labels:  spark
Spark Druid Olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 282 (+120.31%)
Mutual labels:  spark
Laravel Spark Google2fa
Google Authenticator support for Laravel Spark
Stars: ✭ 86 (-32.81%)
Mutual labels:  spark
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+3478.91%)
Mutual labels:  hadoop
Snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
Stars: ✭ 995 (+677.34%)
Mutual labels:  spark
Cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+117.19%)
Mutual labels:  spark
Teddy
Spark Streaming monitoring platform with support for job deployment, alerting, and automatic startup
Stars: ✭ 120 (-6.25%)
Mutual labels:  spark
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (+112.5%)
Mutual labels:  spark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+670.31%)
Mutual labels:  spark
Helk
The Hunting ELK
Stars: ✭ 3,097 (+2319.53%)
Mutual labels:  spark
Flint
Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-33.59%)
Mutual labels:  spark
Introtohadoopandmr udacity course
🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"
Stars: ✭ 110 (-14.06%)
Mutual labels:  hadoop
Big Data Rosetta Code
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (+98.44%)
Mutual labels:  spark
Jsr203 Hadoop
A Java NIO file system provider for HDFS
Stars: ✭ 35 (-72.66%)
Mutual labels:  hadoop
laravel-spark-camera
Profile Photo Camera support for Laravel Spark
Stars: ✭ 30 (-76.56%)
Mutual labels:  spark
Airflow Training
Airflow training for the crunch conf
Stars: ✭ 83 (-35.16%)
Mutual labels:  airflow
Book
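This project collects good books I have read or heard about over the years. I came across them while organizing my files: deleting them seemed a pity, but keeping them wasted space. In the spirit of sharing, I am making them available, both to offer these books to readers who need them and to build up something like a knowledge base.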
Stars: ✭ 47 (-63.28%)
Mutual labels:  spark
Objinsync
Continuously synchronize directories from remote object store to local filesystem
Stars: ✭ 29 (-77.34%)
Mutual labels:  airflow
Spark Bigquery Connector
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Stars: ✭ 126 (-1.56%)
Mutual labels:  spark
Thingsboard
Open-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+8123.44%)
Mutual labels:  spark
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+378.91%)
Mutual labels:  hadoop
Akkeeper
An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-76.56%)
Mutual labels:  hadoop
dllib
dllib is a distributed deep learning library running on Apache Spark
Stars: ✭ 32 (-75%)
Mutual labels:  spark
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+377.34%)
Mutual labels:  spark
pulse
phData Pulse application log aggregation and monitoring
Stars: ✭ 13 (-89.84%)
Mutual labels:  hadoop
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+645.31%)
Mutual labels:  spark
dbt-on-airflow
No description or website provided.
Stars: ✭ 30 (-76.56%)
Mutual labels:  airflow
Java learning practice
The road to advanced Java: frequently asked interview algorithms, akka, multithreading, NIO, Netty, SpringBoot, Spark and Flink, and more
Stars: ✭ 110 (-14.06%)
Mutual labels:  spark
Spark Ffm
FFM (Field-aware Factorization Machine) on Spark
Stars: ✭ 101 (-21.09%)
Mutual labels:  spark
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-49.22%)
Mutual labels:  spark
Incubator Dolphinscheduler
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available out of box.
Stars: ✭ 6,916 (+5303.13%)
Mutual labels:  airflow
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+4207.03%)
Mutual labels:  spark
Spark Bigquery
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-49.22%)
Mutual labels:  spark
Mongo Spark
The MongoDB Spark Connector
Stars: ✭ 588 (+359.38%)
Mutual labels:  spark
Spark Lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-10.94%)
Mutual labels:  spark
Jumbune
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-50%)
Mutual labels:  hadoop
Hadoop study
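Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code. Continuously updated!)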
Stars: ✭ 567 (+342.97%)
Mutual labels:  hadoop