saturncloud / prefect-saturn

License: BSD-3-Clause
Python client for using Prefect Cloud with Saturn Cloud

Programming Languages

python
Makefile

Projects that are alternatives of or similar to prefect-saturn

Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+52940%)
Mutual labels:  workflow-engine, data-engineering, prefect
funsies
funsies is a lightweight workflow engine 🔧
Stars: ✭ 37 (+146.67%)
Mutual labels:  workflow-engine, data-engineering
soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
Stars: ✭ 58 (+286.67%)
Mutual labels:  data-engineering
daskperiment
Reproducibility for Humans: A lightweight tool to perform reproducible machine learning experiment.
Stars: ✭ 25 (+66.67%)
Mutual labels:  dask
Everything-Tech
A collection of online resources to help you on your Tech journey.
Stars: ✭ 396 (+2540%)
Mutual labels:  data-engineering
lrmr
Less-Resilient MapReduce framework for Go
Stars: ✭ 32 (+113.33%)
Mutual labels:  data-engineering
contessa
Easy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (+13.33%)
Mutual labels:  data-engineering
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+2820%)
Mutual labels:  data-engineering
deordie-meetups
DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Stars: ✭ 48 (+220%)
Mutual labels:  data-engineering
papilo
DEPRECATED: Stream data processing micro-framework
Stars: ✭ 24 (+60%)
Mutual labels:  data-engineering
micronaut-camunda-bpm
Integration between Micronaut and Camunda (Workflow Engine). We configure Camunda with sensible defaults, so that you can get started with minimum configuration: simply add a dependency in your Micronaut project to embed the workflow engine!
Stars: ✭ 73 (+386.67%)
Mutual labels:  workflow-engine
xarray-beam
Distributed Xarray with Apache Beam
Stars: ✭ 83 (+453.33%)
Mutual labels:  dask
Quickflow
Workflow engine in C# .NET Core
Stars: ✭ 22 (+46.67%)
Mutual labels:  workflow-engine
get smarties
Dummy variable generation with fit/transform capabilities
Stars: ✭ 23 (+53.33%)
Mutual labels:  data-engineering
qhub
🪴 Nebari - your open source data science platform
Stars: ✭ 175 (+1066.67%)
Mutual labels:  dask
postier
Postier is a Laravel API automation platform to transfer data and to sync apps. You can build workflows with data and actions of multiple apps and apply logics to the data!
Stars: ✭ 55 (+266.67%)
Mutual labels:  workflow-engine
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+1760%)
Mutual labels:  data-engineering
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
Stars: ✭ 26 (+73.33%)
Mutual labels:  data-engineering
dask-sql
Distributed SQL Engine in Python using Dask
Stars: ✭ 271 (+1706.67%)
Mutual labels:  dask
awesome-bigquery-views
Useful SQL queries for Blockchain ETL datasets in BigQuery.
Stars: ✭ 325 (+2066.67%)
Mutual labels:  data-engineering

prefect-saturn

prefect-saturn is a Python package that makes it easy to run Prefect Cloud flows on a Dask cluster with Saturn Cloud. For a detailed tutorial, see "Fault-Tolerant Data Pipelines with Prefect Cloud".

Installation

prefect-saturn is available on PyPI:

pip install prefect-saturn

prefect-saturn can also be installed directly from GitHub:

pip install git+https://github.com/saturncloud/prefect-saturn.git@main
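
To confirm which version ended up in the active environment after either install command, a short check with the standard library's importlib.metadata (Python 3.8+) can help. This is a minimal sketch; the helper name installed_version is illustrative and not part of prefect-saturn.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Illustrative helper, not part of prefect-saturn.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("prefect-saturn")
if version is None:
    print("prefect-saturn is not installed in this environment")
else:
    print(f"prefect-saturn {version} is installed")
```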

Getting Started

prefect-saturn is intended for use inside a Saturn Cloud environment, such as a Jupyter notebook.

import prefect
from prefect import Flow, task
from prefect_saturn import PrefectCloudIntegration


@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("hello prefect-saturn")


flow = Flow("sample-flow", tasks=[hello_task])

project_name = "sample-project"
integration = PrefectCloudIntegration(
    prefect_cloud_project_name=project_name
)
flow = integration.register_flow_with_saturn(flow)

flow.register(
    project_name=project_name,
    labels=["saturn-cloud"]
)

Customize Dask

You can customize the size and behavior of the Dask cluster used to run Prefect flows. prefect_saturn.PrefectCloudIntegration.register_flow_with_saturn() accepts arguments to accomplish this, such as dask_cluster_kwargs.

For example, the code below tells Saturn that this flow should run on a Dask cluster with 3 xlarge workers, and that Prefect should shut down the cluster once the flow run has finished.

flow = integration.register_flow_with_saturn(
    flow=flow,
    dask_cluster_kwargs={
        "n_workers": 3,
        "worker_size": "xlarge",
        "autoclose": True
    }
)

flow.register(
    project_name=project_name,
    labels=["saturn-cloud"]
)
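
If the cluster settings come from configuration or user input, it may be worth validating them before building the dictionary. The sketch below assumes only the three keys used in the example above (n_workers, worker_size, autoclose). The helper build_dask_cluster_kwargs is hypothetical and not part of prefect-saturn, and it does not check worker_size against the sizes Saturn Cloud actually offers.

```python
from typing import Any, Dict


def build_dask_cluster_kwargs(
    n_workers: int, worker_size: str, autoclose: bool = True
) -> Dict[str, Any]:
    """Build a dask_cluster_kwargs dictionary after basic sanity checks.

    Hypothetical helper, not part of prefect-saturn. It only knows about
    the three keys used in the example above.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(n_workers, bool) or not isinstance(n_workers, int):
        raise ValueError("n_workers must be an integer")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    if not isinstance(worker_size, str) or not worker_size:
        raise ValueError("worker_size must be a non-empty string")
    return {
        "n_workers": n_workers,
        "worker_size": worker_size,
        "autoclose": bool(autoclose),
    }


cluster_kwargs = build_dask_cluster_kwargs(n_workers=3, worker_size="xlarge")
print(cluster_kwargs)
```

The resulting dictionary could then be passed as dask_cluster_kwargs=cluster_kwargs in the register_flow_with_saturn() call shown above.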

Contributing

See CONTRIBUTING.md for documentation on how to test and contribute to prefect-saturn.