All Projects → etl → Similar Projects or Alternatives

333 Open source projects that are alternatives of or similar to etl

morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (-72.4%)
Mutual labels:  etl, data-engineering
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (-84.23%)
Mutual labels:  etl, data-engineering
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (-83.15%)
Mutual labels:  etl, data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-91.4%)
Mutual labels:  etl, data-engineering
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-93.91%)
Mutual labels:  etl, data-processing
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (-86.74%)
Mutual labels:  etl, data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-92.83%)
Mutual labels:  etl, data-engineering
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-79.57%)
Mutual labels:  etl, data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+754.84%)
Mutual labels:  etl, data-engineering
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (-48.39%)
Mutual labels:  etl, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-71.68%)
Mutual labels:  etl, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+126.88%)
Mutual labels:  etl, data-engineering
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+1227.96%)
Mutual labels:  etl, data-engineering
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+1663.08%)
Mutual labels:  etl, data-engineering
etl manager
A python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-94.98%)
Mutual labels:  etl, data-engineering
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+22.58%)
Mutual labels:  etl, data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-54.84%)
Mutual labels:  etl, data-engineering
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (-76.7%)
Mutual labels:  etl, data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-81%)
Mutual labels:  etl, data-engineering
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (-92.11%)
Mutual labels:  etl, data-engineering
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+209.68%)
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+119.35%)
Mutual labels:  etl, data-engineering
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (-77.06%)
Mutual labels:  etl, data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-71.68%)
Mutual labels:  etl, data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (-35.84%)
Mutual labels:  etl, data-engineering
Bender
Bender - Serverless ETL Framework
Stars: ✭ 171 (-38.71%)
Mutual labels:  etl
Bitcoin Etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (-37.63%)
Mutual labels:  etl
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-60.22%)
Mutual labels:  data-engineering
Data Making Guidelines
📘 Making Data, the DataMade Way
Stars: ✭ 248 (-11.11%)
Mutual labels:  etl
Linq2db
Linq to database provider.
Stars: ✭ 2,211 (+692.47%)
Mutual labels:  etl
Aws Etl Orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (-12.19%)
Mutual labels:  etl
Usaspending Api
Server application to serve U.S. federal spending data via a RESTful API
Stars: ✭ 166 (-40.5%)
Mutual labels:  etl
Open Semantic Etl
Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & Named Entity Recognition) & data enrichment (annotation) pipelines & ingestor to Solr or Elastic search index & linked data graph database
Stars: ✭ 165 (-40.86%)
Mutual labels:  etl
Etl unicorn
数据可视化, 数据挖掘, 数据处理 ETL
Stars: ✭ 156 (-44.09%)
Mutual labels:  etl
pentaho-gis-plugins
🗺 GIS plugins for Pentaho Data Integration
Stars: ✭ 42 (-84.95%)
Mutual labels:  etl
NBi
NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an Xml syntax. By the means of NBi, you don't need to develop C# or Java code to specify your tests! Either, you don't need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (-63.44%)
Mutual labels:  etl
Example Airflow Dags
Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (-12.9%)
Mutual labels:  etl
Mara Example Project 2
An example mini data warehouse for python project stats, template for new projects
Stars: ✭ 154 (-44.8%)
Mutual labels:  etl
Metl
mito ETL tool
Stars: ✭ 153 (-45.16%)
Mutual labels:  etl
Eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (-15.77%)
Mutual labels:  etl
Omniparser
omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
Stars: ✭ 148 (-46.95%)
Mutual labels:  etl
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (-48.39%)
Mutual labels:  etl
Storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (-16.85%)
Mutual labels:  etl
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (-49.82%)
Mutual labels:  etl
Mara Pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+559.86%)
Mutual labels:  etl
Etl2pcapng
Utility that converts an .etl file containing a Windows network packet capture into .pcapng format.
Stars: ✭ 228 (-18.28%)
Mutual labels:  etl
Kettle Web
基于spring boot通过java代码调用kette
Stars: ✭ 128 (-54.12%)
Mutual labels:  etl
Reddit Detective
Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (-53.76%)
Mutual labels:  etl
DataX-src
DataX 是异构数据广泛使用的离线数据同步工具/平台,实现包括 MySQL、Oracle、SqlServer、Postgre、HDFS、Hive、ADS、HBase、OTS、ODPS 等各种异构数据源之间高效的数据同步功能。
Stars: ✭ 21 (-92.47%)
Mutual labels:  etl
dbt-databricks
A dbt adapter for Databricks.
Stars: ✭ 115 (-58.78%)
Mutual labels:  etl
openmrs-fhir-analytics
A collection of tools for extracting FHIR resources and analytics services on top of that data.
Stars: ✭ 55 (-80.29%)
Mutual labels:  etl
Elastic
R client for the Elasticsearch HTTP API
Stars: ✭ 227 (-18.64%)
Mutual labels:  etl
Etl.net
Mass processing data with a complete ETL for .net developers
Stars: ✭ 129 (-53.76%)
Mutual labels:  etl
Bulk Writer
Provides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entites to table columns.
Stars: ✭ 210 (-24.73%)
Mutual labels:  etl
Transformalize
Configurable Extract, Transform, and Load
Stars: ✭ 125 (-55.2%)
Mutual labels:  etl
id3c
Data logistics system enabling real-time pathogen surveillance. Built for the Seattle Flu Study.
Stars: ✭ 21 (-92.47%)
Mutual labels:  etl
Etlbox
A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
Stars: ✭ 203 (-27.24%)
Mutual labels:  etl
Openkettlewebui
一款基于kettle的数据处理web调度控制平台,支持文档资源库和数据库资源库,通过web平台控制kettle数据转换,可作为中间件集成到现有系统中
Stars: ✭ 125 (-55.2%)
Mutual labels:  etl
Riko
A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+463.08%)
Mutual labels:  etl
Cql
Categorical Query Language IDE
Stars: ✭ 196 (-29.75%)
Mutual labels:  etl
1-60 of 333 similar projects