173TECH / Sayn

Licence: apache-2.0
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

Programming Languages

python

Projects that are alternatives to or similar to Sayn

Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+59.49%)
Mutual labels:  data-science, etl, data-engineering
Data Science Best Resources
Carefully curated resource links for data science in one place
Stars: ✭ 1,104 (+1297.47%)
Mutual labels:  data-science, sql, analytics
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+6126.58%)
Mutual labels:  data-science, etl, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+0%)
Mutual labels:  data-science, etl, data-engineering
Awesome Business Intelligence
Actively curated list of awesome BI tools. PRs welcome!
Stars: ✭ 1,157 (+1364.56%)
Mutual labels:  data-science, sql, etl
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+2918.99%)
Mutual labels:  data-science, etl, data-engineering
Ether sql
A python library to push ethereum blockchain data into an sql database.
Stars: ✭ 41 (-48.1%)
Mutual labels:  sql, analytics, etl
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+53867.09%)
Mutual labels:  data-science, analytics, data-engineering
Dagster
An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+5088.61%)
Mutual labels:  data-science, analytics, etl
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+5698.73%)
Mutual labels:  data-science, sql, analytics
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+126.58%)
Mutual labels:  analytics, etl, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+701.27%)
Mutual labels:  data-science, etl, data-engineering
Web Database Analytics
Web scraping and related analytics using Python tools
Stars: ✭ 175 (+121.52%)
Mutual labels:  data-science, sql, analytics
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (-17.72%)
Mutual labels:  etl, analytics, data-engineering
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+332.91%)
Mutual labels:  analytics, etl, data-engineering
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+9970.89%)
Mutual labels:  automation, data-science, data-engineering
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+993.67%)
Mutual labels:  data-science, data-engineering
Walkoff
A flexible, easy to use, automation framework allowing users to integrate their capabilities and devices to cut through the repetitive, tedious tasks slowing them down. #nsacyber
Stars: ✭ 855 (+982.28%)
Mutual labels:  automation, analytics
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS based solution using AWS CloudWatch and AWS Lambda using a Python script that is using Boto3 to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-73.42%)
Mutual labels:  automation, etl
Ethereum Etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 956 (+1110.13%)
Mutual labels:  sql, etl

SAYN is a modern data processing and modelling framework. Users define tasks (including Python scripts, automated SQL transformations and more) and the relationships between them; SAYN takes care of the rest. It is designed for simplicity, flexibility and centralisation, with the aim of making the data engineering workflow significantly more efficient.
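To illustrate what "tasks and their relationships" means in practice, here is a minimal sketch of how a set of task dependencies can be resolved into a run order. It is not SAYN's implementation or API: the task names are hypothetical, and the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks, each mapped to the set of tasks it depends on.
tasks = {
    "extract_logs": set(),
    "load_logs": {"extract_logs"},
    "model_sessions": {"load_logs"},
    "model_campaign_roi": {"load_logs"},
    "train_churn_model": {"model_sessions", "model_campaign_roi"},
}


def execution_order(dependencies):
    """Return one valid run order for the given dependency mapping.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why the graph has to be acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(tasks)
print(order)
```

Every task appears after all of the tasks it depends on. The two modelling tasks have no dependency on each other, so their relative order is not fixed.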

Use Cases

SAYN can be used for multiple purposes across the data engineering and analytics workflows:

  • Data extraction: complement tools such as Fivetran or Stitch with customised extraction processes.
  • Data modelling: transform raw data in your data warehouse (e.g. aggregate activity or sessions, or calculate marketing campaign ROI).
  • Data science: integrate and execute data science models.

Key Features

SAYN has the following key features:

  • YAML-based DAG (Directed Acyclic Graph) creation. This means all analysts, including those who are not proficient in Python, can easily add tasks to ETL processes with SAYN.
  • Automated SQL transformations: write your SELECT statement and SAYN turns it into a table or view and manages it for you.
  • Jinja parameters: use Jinja templating to switch easily between development and production environments, among other tricks.
  • Python tasks: use Python scripts to complement your extraction and loading layer and build data science models.
  • Multiple databases supported.
  • and much more... See the Documentation.
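The automated SQL transformation and parameter features listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not SAYN's code: `materialise` is a hypothetical helper, SQLite stands in for a data warehouse, and the standard library's `string.Template` stands in for Jinja so the example has no third-party dependencies. The `prefix` parameter and the table names are made up for the example.

```python
import sqlite3
from string import Template


def materialise(conn, name, select_sql, params, kind="table"):
    """Fill in parameters, then wrap a SELECT in CREATE TABLE/VIEW ... AS.

    Hypothetical helper: any existing object of the same kind and name
    is dropped first so the task can be re-run safely.
    """
    if kind not in ("table", "view"):
        raise ValueError(f"unsupported kind: {kind}")
    target = Template(name).substitute(params)
    rendered = Template(select_sql).substitute(params)
    conn.execute(f"DROP {kind.upper()} IF EXISTS {target}")
    conn.execute(f"CREATE {kind.upper()} {target} AS {rendered}")
    return target


# Stand-in "warehouse" holding some raw data in a development table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dev_events (user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO dev_events VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5)],
)

# The analyst only writes the SELECT; ${prefix} selects the environment.
select_sql = (
    "SELECT user_id, SUM(amount) AS total "
    "FROM ${prefix}events GROUP BY user_id"
)

target = materialise(conn, "${prefix}user_totals", select_sql, {"prefix": "dev_"})
rows = conn.execute(
    f"SELECT user_id, total FROM {target} ORDER BY user_id"
).fetchall()
print(target, rows)
```

Changing the `prefix` value is all it takes to point the same SELECT at a different set of tables, which is the idea behind switching between development and production with template parameters.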

Design Principles

SAYN aims to empower data engineers and analysts through its three core design principles:

  • Simplicity: data processes should be easy to create, scale and maintain, so your team can focus on data transformation instead of writing processes. SAYN orchestrates all your tasks systematically and provides many automation features.
  • Flexibility: the power of data is unlimited, and your tooling should be too. SAYN supports both SQL and Python so your analysts can choose the most suitable solution for each process.
  • Centralisation: all analytics code should live in one place, making your life easier and allowing dependencies throughout the whole analytics process.

Quick Start

$ pip install sayn
$ sayn init test_sayn
$ cd test_sayn
$ sayn run

That's it! You have completed your first SAYN run on the example project. Continue with the Tutorial: Part 1, which gives a good overview of what SAYN can do.

Release Updates

If you want to receive update emails about SAYN releases, you can sign up here.

Support

If you need any help with SAYN, or simply want to know more, please contact the team at [email protected].

License

SAYN is open source under the Apache 2.0 license.


Made with ❤️ by 173tech.