666 open source projects that are alternatives to or similar to Airflow Pipeline

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-53.12%)

Mutual labels: spark, hadoop

Almond

A Scala kernel for Jupyter

Stars: ✭ 1,354 (+957.81%)

Mutual labels: spark

Spring Shiro Spark

Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...

Stars: ✭ 114 (-10.94%)

Mutual labels: spark

Logisland

Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.

Stars: ✭ 97 (-24.22%)

Mutual labels: spark

Schemer

Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.

Stars: ✭ 97 (-24.22%)

Mutual labels: spark

Deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Stars: ✭ 2,020 (+1478.13%)

Mutual labels: spark

Parquet Go

Go package to read and write parquet files. parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.

Stars: ✭ 114 (-10.94%)

Mutual labels: hadoop

Relation extraction

Relation extraction using deep learning (CNN)

Stars: ✭ 96 (-25%)

Mutual labels: spark

Spark Mllib Twitter Sentiment Analysis

🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib

Stars: ✭ 113 (-11.72%)

Mutual labels: spark

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+945.31%)

Mutual labels: spark

Zparkio

Boilerplate framework to use Spark and ZIO together.

Stars: ✭ 121 (-5.47%)

Mutual labels: spark

Wifi

A big data query and analysis system based on information captured over WiFi

Stars: ✭ 93 (-27.34%)

Mutual labels: hadoop

Spark Summit 2017 Sanfrancisco

Spark Summit 2017 San Francisco

Stars: ✭ 93 (-27.34%)

Mutual labels: spark

Big Data

🔧 Use dplyr to analyze Big Data 🐘

Stars: ✭ 93 (-27.34%)

Mutual labels: spark

Python Bigdata

Data science and Big Data with Python

Stars: ✭ 112 (-12.5%)

Mutual labels: spark

Hadoop Yarn Api Python Client

Python client for Hadoop® YARN API

Stars: ✭ 91 (-28.91%)

Mutual labels: hadoop

Spark On Kubernetes Helm

Spark on Kubernetes infrastructure Helm charts repo

Stars: ✭ 92 (-28.12%)

Mutual labels: spark

Eat pyspark in 10 days

pyspark🍒🥭 is delicious, just eat it!😋😋

Stars: ✭ 116 (-9.37%)

Mutual labels: spark

Archivespark

An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.

Stars: ✭ 111 (-13.28%)

Mutual labels: spark

Bitnami Docker Airflow

Bitnami Docker Image for Apache Airflow

Stars: ✭ 89 (-30.47%)

Mutual labels: airflow

Elephas

Distributed Deep learning with Keras & Spark

Stars: ✭ 1,521 (+1088.28%)

Mutual labels: spark

Hadoop Mapreduce

Mirror of Apache Hadoop MapReduce

Stars: ✭ 88 (-31.25%)

Mutual labels: hadoop

Ammonite Spark

Run spark calculations from Ammonite

Stars: ✭ 88 (-31.25%)

Mutual labels: spark

Griffon Vm

Griffon Data Science Virtual Machine

Stars: ✭ 128 (+0%)

Mutual labels: hadoop

Parquet4s

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

Stars: ✭ 125 (-2.34%)

Mutual labels: hadoop

Example Spark Kafka

Apache Spark and Apache Kafka integration example

Stars: ✭ 120 (-6.25%)

Mutual labels: spark

Lambda Arch

Applying Lambda Architecture with Spark, Kafka, and Cassandra.

Stars: ✭ 111 (-13.28%)

Mutual labels: spark

Spark Nlp Models

Models and Pipelines for the Spark NLP library

Stars: ✭ 88 (-31.25%)

Mutual labels: spark

Spark python ml examples

Spark 2.0 Python Machine Learning examples

Stars: ✭ 87 (-32.03%)

Mutual labels: spark

Whirl

Fast iterative local development and testing of Apache Airflow workflows

Stars: ✭ 111 (-13.28%)

Mutual labels: airflow

Dataengineeringproject

Example end to end data engineering project.

Stars: ✭ 82 (-35.94%)

Mutual labels: airflow

Laravel Spark Google2fa

Google Authenticator support for Laravel Spark

Stars: ✭ 86 (-32.81%)

Mutual labels: spark

Teddy

Spark Streaming monitoring platform that supports job deployment, alerting, and automatic restart

Stars: ✭ 120 (-6.25%)

Mutual labels: spark

Avro Hadoop Starter

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.

Stars: ✭ 110 (-14.06%)

Mutual labels: hadoop

Cuesheet

A framework for writing Spark 2.x applications in a pretty way

Stars: ✭ 86 (-32.81%)

Mutual labels: spark

Flint

Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)

Stars: ✭ 85 (-33.59%)

Mutual labels: spark

Introtohadoopandmr udacity course

🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"

Stars: ✭ 110 (-14.06%)

Mutual labels: hadoop

Hops Examples

Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops

Stars: ✭ 84 (-34.37%)

Mutual labels: spark

Spark Bigquery Connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

Stars: ✭ 126 (-1.56%)

Mutual labels: spark

Kinesis Sql

Kinesis Connector for Structured Streaming

Stars: ✭ 120 (-6.25%)

Mutual labels: spark

Airflow Training

Airflow training for the crunch conf

Stars: ✭ 83 (-35.16%)

Mutual labels: airflow

Spark States

Custom state store providers for Apache Spark

Stars: ✭ 83 (-35.16%)

Mutual labels: spark

Java learning practice

The road to advanced Java: frequently asked interview algorithms, akka, multithreading, NIO, Netty, SpringBoot, Spark && Flink, and more

Stars: ✭ 110 (-14.06%)

Mutual labels: spark

Docker Hadoop Cluster

Multiple node cluster on Docker for self development.

Stars: ✭ 82 (-35.94%)

Mutual labels: hadoop

Spark Dependencies

Spark job for dependency links

Stars: ✭ 82 (-35.94%)

Mutual labels: spark

Elassandra

Elassandra = Elasticsearch + Apache Cassandra

Stars: ✭ 1,610 (+1157.81%)

Mutual labels: spark

Bigdataclass

Two-day workshop that covers how to use R to interact with databases and Spark

Stars: ✭ 110 (-14.06%)

Mutual labels: spark

Camus

Mirror of Linkedin's Camus

Stars: ✭ 81 (-36.72%)

Mutual labels: hadoop

Mleap

MLeap: Deploy ML Pipelines to Production

Stars: ✭ 1,232 (+862.5%)

Mutual labels: spark

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-14.84%)

Mutual labels: spark

Lehar

Visualize data using relative ordering

Stars: ✭ 81 (-36.72%)

Mutual labels: spark

Spark Gbtlr

Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark

Stars: ✭ 81 (-36.72%)

Mutual labels: spark

Spring Boot Quick

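🌿 Quick-learning examples based on springboot, integrating open source frameworks the author has encountered, such as rabbitmq (delay queues), Kafka, jpa, redis, oauth2, swagger, jsp, docker, spring-batch, exception handling, log output, multi-module development, multi-environment packaging, caching, web crawlers, jwt, GraphQL, dubbo, zookeeper, Async, and more 📌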

Stars: ✭ 1,819 (+1321.09%)

Mutual labels: spark

Openuba

A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]

Stars: ✭ 127 (-0.78%)

Mutual labels: spark

Scala Samples

Pieces of Scala code that explain Scala syntax and related topics, such as what you can do with it all

Stars: ✭ 125 (-2.34%)

Mutual labels: spark

Hdfs Shell

HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS

Stars: ✭ 117 (-8.59%)

Mutual labels: hadoop

Airflow in docker compose

Apache Airflow in Docker Compose (for both versions 1.10.* and 2.*)