datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (-72.92%)
Mutual labels: big-data, apache-spark, etl, etl-framework
Metorikku
A simplified, lightweight ETL framework based on Apache Spark
Stars: ✭ 361 (+150.69%)
Mutual labels: big-data, etl, etl-framework
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-83.33%)
Mutual labels: apache-spark, etl, etl-framework
Bandar Log
Monitoring tool that measures the flow throughput of data sources and processing components in data ingestion and ETL pipelines.
Stars: ✭ 19 (-86.81%)
Mutual labels: big-data, etl
Etlalchemy
Extract, Transform, Load: any SQL database in 4 lines of code.
Stars: ✭ 460 (+219.44%)
Mutual labels: etl, etl-framework
Getting Started
This repository is a getting started guide to Singer.
Stars: ✭ 734 (+409.72%)
Mutual labels: etl, etl-framework
Morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (+110.42%)
Mutual labels: big-data, apache-spark
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-45.14%)
Mutual labels: big-data, etl
Pyetl
Python ETL framework
Stars: ✭ 33 (-77.08%)
Mutual labels: etl, etl-framework
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-4.86%)
Mutual labels: big-data, apache-spark
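Openkettlewebui
A Kettle-based web scheduling and control platform for data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.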
Stars: ✭ 125 (-13.19%)
Mutual labels: etl, etl-framework
Choetl
ETL framework for .NET / C# (parser / writer for CSV, flat, XML, JSON, key-value, Parquet, YAML, and Avro formatted files)
Stars: ✭ 372 (+158.33%)
Mutual labels: etl, etl-framework
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (-15.97%)
Mutual labels: big-data, apache-spark
Transformalize
Configurable Extract, Transform, and Load
Stars: ✭ 125 (-13.19%)
Mutual labels: etl, etl-framework
Goodreads etl pipeline
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Stars: ✭ 793 (+450.69%)
Mutual labels: apache-spark, etl-framework
Mist
Serverless proxy for Spark cluster
Stars: ✭ 309 (+114.58%)
Mutual labels: big-data, apache-spark
Stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (-55.56%)
Mutual labels: etl, etl-framework
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (-11.11%)
Mutual labels: big-data, apache-spark
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+91.67%)
Mutual labels: big-data, apache-spark
Smooks
An extensible Java framework for building XML and non-XML streaming applications
Stars: ✭ 293 (+103.47%)
Mutual labels: big-data, etl