airbytehq / Airbyte

License: MIT
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.

Programming Languages

Java
68154 projects - #9 most used programming language
Python
139335 projects - #7 most used programming language
TypeScript
32286 projects
Shell
77523 projects
Dockerfile
14818 projects
Handlebars
879 projects

Projects that are alternatives of or similar to Airbyte

Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-98.39%)
Mutual labels:  data-science, data-analysis, pipeline, etl, data-engineering
Mara Pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (-62.57%)
Mutual labels:  pipeline, etl, data, data-integration
Datacleaner
The premier open source Data Quality solution
Stars: ✭ 391 (-92.05%)
Mutual labels:  data-science, data-analysis, etl, data
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-98.39%)
Mutual labels:  data-science, etl, data-engineering
Datacomparer
dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
Stars: ✭ 58 (-98.82%)
Mutual labels:  data-science, data-analysis, data
Graphia
A visualisation tool for the creation and analysis of graphs
Stars: ✭ 67 (-98.64%)
Mutual labels:  data-science, data-analysis, data
Gopup
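Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest-rate data; currency exchange rates; Qianlima and unicorn companies; Xinwen Lianbo transcripts; film and TV box-office data; university lists; epidemic data…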
Stars: ✭ 1,229 (-75.02%)
Mutual labels:  data-science, data-analysis, data
Flyte
Accelerate your ML and Data workflows to production. Flyte is a production-grade orchestration system for your Data and ML workloads. It has been battle-tested at Lyft, Spotify, Freenome and others, and is truly open-source.
Stars: ✭ 1,242 (-74.75%)
Mutual labels:  data-science, data-analysis, data
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+766.72%)
Mutual labels:  data-science, data-analysis, data-engineering
Steppy
Lightweight Python library for fast and reproducible experimentation 🔬
Stars: ✭ 119 (-97.58%)
Mutual labels:  data-science, pipeline, open-source
Chain.jl
A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
Stars: ✭ 118 (-97.6%)
Mutual labels:  data-science, data-analysis, pipeline
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (-51.51%)
Mutual labels:  data-science, etl, data-engineering
Openrefine
OpenRefine is a free, open source power tool for working with messy data and improving it
Stars: ✭ 8,531 (+73.43%)
Mutual labels:  data-science, data-analysis, data
Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (-78.13%)
Mutual labels:  data-science, data-analysis, data
Awesome Business Intelligence
Actively curated list of awesome BI tools. PRs welcome!
Stars: ✭ 1,157 (-76.48%)
Mutual labels:  data-science, data-analysis, etl
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (-80.04%)
Mutual labels:  data-science, pipeline, pipelines
Steppy Toolkit
Curated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-99.57%)
Mutual labels:  data-science, pipeline, open-source
Skdata
Python tools for data analysis
Stars: ✭ 16 (-99.67%)
Mutual labels:  data-science, data-analysis, data
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (-82.44%)
Mutual labels:  data-science, data-analysis, data-engineering
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (-69.28%)
Mutual labels:  data-science, data, data-engineering

Introduction

Data integration made simple, secure and extensible.
The new open-source standard to sync data from applications, APIs & databases to warehouses, lakes & other destinations.

Airbyte is on a mission to make data integration pipelines a commodity.

  • Maintenance-free connectors you can use in minutes. Just authenticate your sources and warehouse, and get connectors that adapt to schema and API changes for you.
  • Building new connectors made trivial. We make it easy to add the connectors you need in the language of your choice, and we handle scheduling and orchestration for you.
  • Designed to cover the long tail of connectors and needs. Benefit from the community's battle-tested connectors and adapt them to your specific needs.
  • Your data stays in your cloud. Have full control over your data, and the costs of your data transfers.
  • No more security compliance process to go through as Airbyte is self-hosted.
  • No more volume-based pricing of the kind cloud-based solutions charge.

Here's a list of our connectors with their health status.

Quick start

git clone https://github.com/airbytehq/airbyte.git
cd airbyte
docker-compose up

Now visit http://localhost:8000
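
The containers can take a little while to start, so the page may not answer right away. As a rough sketch (not part of Airbyte itself), a small Python helper can poll the address until it responds. The function names `is_up` and `wait_until_up`, the attempt count and the delay are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if anything answers an HTTP request at this URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so it is at least listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def wait_until_up(url, attempts=30, delay_s=2.0, probe=is_up, sleep=time.sleep):
    """Probe the URL up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or None if none did.
    `probe` and `sleep` are injectable so the loop can be tested offline.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay_s)
    return None
```

Calling `wait_until_up("http://localhost:8000")` after `docker-compose up` returns the attempt number once the UI answers, or `None` if it never does within the allowed attempts.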

Here is a step-by-step guide showing you how to load data from an API into a file, all on your computer.

Features

  • Built for extensibility: Adapt an existing connector to your needs or build a new one with ease.
  • Optional normalized schemas: Entirely customizable; start with raw data or from a suggested normalized schema.
  • Full-grade scheduler: Automate your replications with the frequency you need.
  • Real-time monitoring: We log all errors in full detail to help you understand what went wrong.
  • Incremental updates: Automated replications are based on incremental updates to reduce your data transfer costs.
  • Manual full refresh: Sometimes, you need to re-sync all your data to start again.
  • Debugging autonomy: Modify and debug pipelines as you see fit, without waiting.
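
To make the difference between incremental updates and a full refresh concrete, here is a minimal, generic sketch of cursor-based syncing. It is not Airbyte's implementation: the function `incremental_sync`, the `cursor_field` parameter and the sample rows are hypothetical, and the sketch assumes cursor values can be compared with `>` (for example ISO-formatted timestamps).

```python
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Any]


def incremental_sync(
    records: List[Record],
    cursor_field: str,
    last_cursor: Optional[Any] = None,
) -> Tuple[List[Record], Optional[Any]]:
    """Return the records newer than `last_cursor` and the updated cursor.

    Passing last_cursor=None selects every record, which is the
    full-refresh case.
    """
    new_records = [
        r for r in records
        if last_cursor is None or r[cursor_field] > last_cursor
    ]
    # Keep the old cursor when nothing new arrived.
    new_cursor = max((r[cursor_field] for r in new_records), default=last_cursor)
    return new_records, new_cursor
```

Each run stores the returned cursor and passes it to the next run, so only rows changed since the previous sync are transferred. Discarding the stored cursor forces a full re-sync.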

See more on our website.

Contributing

We love contributions to Airbyte, big or small.

See our Contributing guide on how to get started. Not sure where to start? We’ve listed some good first issues to start with. If you have any questions, please open a draft PR or visit our Slack channel where the core team can help answer your questions.

Note that you can create connectors in any language you want, as Airbyte connectors run as Docker containers.
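
One reason containers make the language choice free is that a containerized program can talk to its host over plain standard output. The sketch below illustrates that idea only: the function `emit_records` is hypothetical, and the message shape (a `RECORD` type wrapping `stream`, `data` and `emitted_at`) is an assumption made for illustration, so check the official Airbyte documentation for the actual connector protocol.

```python
import json
import sys
import time
from typing import Any, Dict, Iterable, TextIO


def emit_records(stream: str, rows: Iterable[Dict[str, Any]], out: TextIO = sys.stdout) -> int:
    """Write one JSON message per row to `out` and return how many were written."""
    count = 0
    for row in rows:
        message = {
            "type": "RECORD",  # assumed message type, for illustration
            "record": {
                "stream": stream,
                "data": row,
                "emitted_at": int(time.time() * 1000),  # milliseconds since epoch
            },
        }
        out.write(json.dumps(message) + "\n")
        count += 1
    out.flush()
    return count
```

Any language that can print JSON lines could play the same role inside its own container image, which is why the implementation language does not matter to the host.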

Also, we will never ask you to maintain your connector. The goal is for the Airbyte team and the community to help maintain it. Let's call it crowdsourced maintenance!

Community support

For general help using Airbyte, please refer to the official Airbyte documentation. For additional help, you can use one of these channels to ask a question:

  • Slack (For live discussion with the Community and Airbyte team)
  • GitHub (Bug reports, Contributions)
  • Twitter (Get the news fast)
  • Weekly office hours (Live informal 30-minute video call sessions with the Airbyte team)

Roadmap

Check out our roadmap to see what we are currently working on and what we have in mind for the coming weeks, months and years.

License

See the LICENSE file for licensing information, and our FAQ for any questions you may have on that topic.
