281 open source projects that are alternatives to, or similar to, arthur-redshift-etl

Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+22259.09%)
Mutual labels:  etl, data-engineering, elt
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+554.55%)
Mutual labels:  etl, data-engineering, elt
astro
Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (+259.09%)
Mutual labels:  etl, elt
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+250%)
Mutual labels:  etl, data-engineering
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+1168.18%)
Mutual labels:  etl, data-engineering
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (+100%)
Mutual labels:  etl, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+259.09%)
Mutual labels:  etl, data-engineering
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+195.45%)
Mutual labels:  etl, data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+259.09%)
Mutual labels:  etl, data-engineering
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+1454.55%)
Mutual labels:  etl, data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+472.73%)
Mutual labels:  etl, data-engineering
rivery cli
Rivery CLI
Stars: ✭ 16 (-27.27%)
Mutual labels:  etl, elt
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (+68.18%)
Mutual labels:  etl, data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-9.09%)
Mutual labels:  etl, data-engineering
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+113.64%)
Mutual labels:  etl, data-engineering
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+16740.91%)
Mutual labels:  etl, data-engineering
dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (+36.36%)
Mutual labels:  etl, elt
wikirepo
Python based Wikidata framework for easy dataframe extraction
Stars: ✭ 33 (+50%)
Mutual labels:  etl, elt
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+140.91%)
Mutual labels:  etl, data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (+9.09%)
Mutual labels:  etl, data-engineering
etl manager
A Python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-36.36%)
Mutual labels:  etl, data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+10740.91%)
Mutual labels:  etl, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+2777.27%)
Mutual labels:  etl, data-engineering
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+2681.82%)
Mutual labels:  etl, data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+713.64%)
Mutual labels:  etl, data-engineering
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+159.09%)
Mutual labels:  etl, data-engineering
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (+190.91%)
Mutual labels:  etl, data-engineering
ruby-for-pentaho-kettle
Ruby scripting for pentaho-kettle
Stars: ✭ 42 (+90.91%)
Mutual labels:  etl
koza
Data transformation framework for LinkML data models
Stars: ✭ 21 (-4.55%)
Mutual labels:  etl
persistity
A persistence framework for game developers
Stars: ✭ 34 (+54.55%)
Mutual labels:  etl
mlbgameday
Multi-core processing of 'Gameday' data from Major League Baseball Advanced Media. Additional tools to parallelize large data sets and write them to a database.
Stars: ✭ 37 (+68.18%)
Mutual labels:  etl
es2postgres
ElasticSearch to PostgreSQL loader
Stars: ✭ 18 (-18.18%)
Mutual labels:  etl
dswarm
An open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)
Stars: ✭ 57 (+159.09%)
Mutual labels:  etl
web-click-flow
Offline log analysis of website clickstreams
Stars: ✭ 14 (-36.36%)
Mutual labels:  etl
oesophagus
Enterprise Grade Single-Step Streaming Data Infrastructure Setup. (Under Development)
Stars: ✭ 12 (-45.45%)
Mutual labels:  etl
oic-options-chains
ETL for OIC Options Chains
Stars: ✭ 22 (+0%)
Mutual labels:  etl
gamechanger-data
GAMECHANGER aspires to be the Department’s trusted solution for evidence-based, data-driven decision-making across the universe of DoD requirements
Stars: ✭ 17 (-22.73%)
Mutual labels:  etl
spdr-etf-holdings
ETL for the SPDR ETF holdings XLS documents
Stars: ✭ 14 (-36.36%)
Mutual labels:  etl
dflib
In-memory Java DataFrame library
Stars: ✭ 50 (+127.27%)
Mutual labels:  etl
mydataharbor
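🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware for moving data from any source to any destination. It helps users reliably, quickly, and stably run near-real-time incremental synchronization or scheduled full synchronization of massive data sets. It is positioned mainly to serve real-time transaction systems, and can also be used for big data synchronization (the ETL domain).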
Stars: ✭ 28 (+27.27%)
Mutual labels:  etl
lineage
Generate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (-27.27%)
Mutual labels:  etl
PDAP-Scrapers
Code relating to scraping public police data.
Stars: ✭ 72 (+227.27%)
Mutual labels:  etl
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+400%)
Mutual labels:  data-engineering
openrefine-client
The OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, Mac). It is also available via Docker Hub, PyPI and Binder.
Stars: ✭ 67 (+204.55%)
Mutual labels:  etl
DataEngineering
This repo contains commands that data engineers use in day to day work.
Stars: ✭ 47 (+113.64%)
Mutual labels:  data-engineering
DataBridge.NET
Configurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-27.27%)
Mutual labels:  etl
go-bqloader
bqloader is a simple ETL framework to load data from Cloud Storage into BigQuery.
Stars: ✭ 16 (-27.27%)
Mutual labels:  etl
kuwala
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We have set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+2054.55%)
Mutual labels:  elt
cobrix
A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Stars: ✭ 109 (+395.45%)
Mutual labels:  etl
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (-4.55%)
Mutual labels:  etl
open-semantic-desktop-search
Virtual Machine for Desktop Search with Open Semantic Search
Stars: ✭ 22 (+0%)
Mutual labels:  etl
openrefine-docker
OpenRefine is a free, open source power tool for working with messy data and improving it. This repository contains Dockerbuild files for automated builds.
Stars: ✭ 19 (-13.64%)
Mutual labels:  etl
yt-channels-DS-AI-ML-CS
A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.
Stars: ✭ 1,038 (+4618.18%)
Mutual labels:  data-engineering
DataXServer
Provides remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) for DataX (https://github.com/alibaba/DataX)
Stars: ✭ 130 (+490.91%)
Mutual labels:  etl
DaFlow
Apache Spark-based Data Flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (+9.09%)
Mutual labels:  etl
neon-workshop
A Pachyderm deep learning tutorial for conference workshops
Stars: ✭ 19 (-13.64%)
Mutual labels:  data-engineering
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-22.73%)
Mutual labels:  etl
wrangle
A data transformation package for deep learning with Autonomio, Keras and TensorFlow.
Stars: ✭ 15 (-31.82%)
Mutual labels:  etl
mik
The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservation systems).
Stars: ✭ 32 (+45.45%)
Mutual labels:  etl
TEAM
The Taxonomy for ETL Automation Metadata (TEAM) is a metadata management tool for data warehouse automation. It is part of the ecosystem for data warehouse automation, alongside the Virtual Data Warehouse pattern manager and the generic schema for Data Warehouse Automation.
Stars: ✭ 27 (+22.73%)
Mutual labels:  etl
1-60 of 281 similar projects