201 open source projects that are alternatives to or similar to seatunnel-example

Waterdrop

Production Ready Data Integration Product, documentation：

Stars: ✭ 1,856 (+6774.07%)

Mutual labels: spark-streaming, flink, sql-engine, etl-framework, etl-pipeline

Streamline

StreamLine - Streaming Analytics

Stars: ✭ 151 (+459.26%)

Mutual labels: spark-streaming, flink

csvplus

csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.

Stars: ✭ 67 (+148.15%)

Mutual labels: etl-framework, etl-pipeline

fdp-modelserver

An umbrella project for multiple implementations of model serving

Stars: ✭ 47 (+74.07%)

Mutual labels: spark-streaming, flink

DIRECT

DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration / ETL processes.

Stars: ✭ 20 (-25.93%)

Mutual labels: etl-framework, etl-pipeline

redis-connect-dist

Real-Time Event Streaming & Change Data Capture

Stars: ✭ 21 (-22.22%)

Mutual labels: etl-framework, etl-pipeline

litemall-dw

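A big data project built on the open-source Litemall e-commerce project. It includes front-end event tracking (openresty+lua) and back-end event tracking, a five-layer data warehouse, real-time computing, and user profiling. The big data platform runs on CDH 6.3.2 (scripted with vagrant+ansible) and also includes Azkaban workflows.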

Stars: ✭ 36 (+33.33%)

Mutual labels: spark-streaming, flink

DaFlow

Apache Spark based Data Flow (ETL) Framework which supports multiple read and write destinations of different types and also supports multiple categories of transformation rules.

Stars: ✭ 24 (-11.11%)

Mutual labels: etl-framework, etl-pipeline

vixtract

www.vixtract.ru

Stars: ✭ 40 (+48.15%)

Mutual labels: etl-framework, etl-pipeline

hamilton

A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.

Stars: ✭ 612 (+2166.67%)

Mutual labels: etl-framework, etl-pipeline

Streaming Readings

Papers and reading material related to streaming systems

Stars: ✭ 554 (+1951.85%)

Mutual labels: spark-streaming, flink

etlflow

EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.

Stars: ✭ 38 (+40.74%)

Mutual labels: etl-framework, etl-pipeline

Sylph

Stream computing platform for bigdata

Stars: ✭ 362 (+1240.74%)

Mutual labels: spark-streaming, flink

open-stream-processing-benchmark

This repository contains the code base for the Open Stream Processing Benchmark.

Stars: ✭ 37 (+37.04%)

Mutual labels: spark-streaming, flink

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (+44.44%)

Mutual labels: etl-framework, etl-pipeline

cassandra.realtime

Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink

Stars: ✭ 25 (-7.41%)

Mutual labels: spark-streaming, flink

Registry

Schema Registry

Stars: ✭ 184 (+581.48%)

Mutual labels: spark-streaming, flink

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (+555.56%)

Mutual labels: spark-streaming

Real-time-log-analysis-system

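🐧 A real-time log processing and analysis system based on Spark Streaming + Flume + Kafka + HBase (available as a console version and as a Web UI visualization version built on Spring Boot, ECharts, etc.)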

Stars: ✭ 31 (+14.81%)

Mutual labels: spark-streaming

Movie recommend

A Spark-based movie recommendation system, including a web crawler project, a website, an admin back end, and the Spark recommendation system

Stars: ✭ 2,092 (+7648.15%)

Mutual labels: spark-streaming

Pyspark Learning

Updated repository

Stars: ✭ 147 (+444.44%)

Mutual labels: spark-streaming

SANSA-Stack

Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/

Stars: ✭ 130 (+381.48%)

Mutual labels: flink

FlinkExperiments

Experiments with Apache Flink.

Stars: ✭ 3 (-88.89%)

Mutual labels: flink

Spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Stars: ✭ 1,721 (+6274.07%)

Mutual labels: spark-streaming

Kinesis Sql

Kinesis Connector for Structured Streaming

Stars: ✭ 120 (+344.44%)

Mutual labels: spark-streaming

bigdata-doc

A collection of big data study notes, learning roadmaps, and technical case studies.

Stars: ✭ 37 (+37.04%)

Mutual labels: flink

Spark Mllib Twitter Sentiment Analysis

🌟 ✨ Analyze and visualize Twitter Sentiment on a world map using Spark MLlib

Stars: ✭ 113 (+318.52%)

Mutual labels: spark-streaming

Spark Streaming With Kafka

Self-contained examples of Apache Spark streaming integrated with Apache Kafka.

Stars: ✭ 180 (+566.67%)

Mutual labels: spark-streaming

flink-deployer

A tool that helps automate deployment to an Apache Flink cluster

Stars: ✭ 143 (+429.63%)

Mutual labels: flink

Scramjet

Simple yet powerful live data computation framework

Stars: ✭ 171 (+533.33%)

Mutual labels: spark-streaming

flink-connector-kudu

A flink-connector-kudu based on Apache-bahir-kudu-connector; supports Flink 1.11.x DynamicTableSource/Sink, Range partitioning, and more

Stars: ✭ 40 (+48.15%)

Mutual labels: flink

flink-client

Java library for managing Apache Flink via the Monitoring REST API

Stars: ✭ 48 (+77.78%)

Mutual labels: flink

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (+418.52%)

Mutual labels: spark-streaming

T-Watch

Real Time Twitter Sentiment Analysis Product

Stars: ✭ 20 (-25.93%)

Mutual labels: spark-streaming

Example Spark Kafka

Apache Spark and Apache Kafka integration example

Stars: ✭ 120 (+344.44%)

Mutual labels: spark-streaming

flink-spark-submiter

Submit Flink/Spark jobs from a local IDEA to a Yarn/k8s cluster

Stars: ✭ 157 (+481.48%)

Mutual labels: flink

Pyspark Examples

Code examples on Apache Spark using python

Stars: ✭ 58 (+114.81%)

Mutual labels: spark-streaming

dlink

Dinky is an out of the box one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks including OLAP and Data Lake.

Stars: ✭ 1,535 (+5585.19%)

Mutual labels: flink

TiBigData

TiDB connectors for Flink/Hive/Presto

Stars: ✭ 192 (+611.11%)

Mutual labels: flink

Lidea

Real-time monitoring platform for large-scale distributed systems

Stars: ✭ 28 (+3.7%)

Mutual labels: flink

Real Time Stream Processing Engine

This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.

Stars: ✭ 37 (+37.04%)

Mutual labels: spark-streaming

Spark States

Custom state store providers for Apache Spark

Stars: ✭ 83 (+207.41%)

Mutual labels: spark-streaming

Utils4s

A collection of test cases and related materials gathered while using Scala and Spark

Stars: ✭ 1,070 (+3862.96%)

Mutual labels: spark-streaming

Learning Spark

Learning Spark from scratch; big data learning

Stars: ✭ 37 (+37.04%)

Mutual labels: spark-streaming

AirflowETL

Blog post on ETL pipelines with Airflow

Stars: ✭ 20 (-25.93%)

Mutual labels: etl-pipeline

Project Fortis

Repository for all parts of the Fortis architecture

Stars: ✭ 27 (+0%)

Mutual labels: spark-streaming

Spark Streaming Monitoring With Lightning

Plot live stats as graphs from an Apache Spark application using Lightning-viz

Stars: ✭ 15 (-44.44%)

Mutual labels: spark-streaming

dpkb

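A summary of big data topics, including distributed storage engines, distributed compute engines, and data warehouse construction. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse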

Stars: ✭ 123 (+355.56%)

Mutual labels: flink

ExDeMon

A general purpose metrics monitor implemented with Apache Spark. Kafka source, Elastic sink, aggregate metrics, different analysis, notifications, actions, live configuration update, missing metrics, ...

Stars: ✭ 19 (-29.63%)

Mutual labels: spark-streaming

Wormhole

Wormhole is a SPaaS (Stream Processing as a Service) Platform

Stars: ✭ 863 (+3096.3%)

Mutual labels: spark-streaming

Mobius

C# and F# language binding and extensions to Apache Spark

Stars: ✭ 929 (+3340.74%)

Mutual labels: spark-streaming

bigdatatutorial

Stars: ✭ 34 (+25.93%)

Mutual labels: spark-streaming

Bandar Log

Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.

Stars: ✭ 19 (-29.63%)

Mutual labels: spark-streaming

Tweet-Analysis-With-Kafka-and-Spark

A real time analytics dashboard to analyze the trending hashtags and @ mentions at any location using kafka and spark streaming.

Stars: ✭ 18 (-33.33%)

Mutual labels: spark-streaming

Spark ALS

Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming

Stars: ✭ 89 (+229.63%)

Mutual labels: spark-streaming

Angel

A Flexible and Powerful Parameter Server for large-scale machine learning

Stars: ✭ 6,458 (+23818.52%)

Mutual labels: spark-streaming

Sparta

Real Time Analytics and Data Pipelines based on Spark Streaming

Stars: ✭ 513 (+1800%)

Mutual labels: spark-streaming

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+814.81%)

Mutual labels: spark-streaming

Cdap

An open source framework for building data analytic applications.

Stars: ✭ 509 (+1785.19%)

Mutual labels: spark-streaming

link-move

A model-driven dynamically-configurable framework to acquire data from external sources and save it to your database.

Stars: ✭ 32 (+18.52%)

Mutual labels: etl-framework

1-60 of 201 similar projects
