
geopython / Stetl

Licence: gpl-3.0
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.

Projects that are alternatives of or similar to Stetl

Metl
mito ETL tool
Stars: ✭ 153 (+139.06%)
Mutual labels:  pipeline, etl, etl-framework
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (+325%)
Mutual labels:  pipeline, etl, transformations
etl
M-Lab ingestion pipeline
Stars: ✭ 15 (-76.56%)
Mutual labels:  pipeline, etl
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-60.94%)
Mutual labels:  pipeline, etl
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+464.06%)
Mutual labels:  etl, etl-framework
redis-connect-dist
Real-Time Event Streaming & Change Data Capture
Stars: ✭ 21 (-67.19%)
Mutual labels:  etl, etl-framework
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-73.44%)
Mutual labels:  pipeline, etl
qwery
A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-56.25%)
Mutual labels:  etl, etl-framework
DaFlow
Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Stars: ✭ 24 (-62.5%)
Mutual labels:  etl, etl-framework
Koop
🔮 Transform, query, and download geospatial data on the web.
Stars: ✭ 505 (+689.06%)
Mutual labels:  etl, gis
Etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
Stars: ✭ 460 (+618.75%)
Mutual labels:  etl, etl-framework
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (+860.94%)
Mutual labels:  pipeline, etl
mydataharbor
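🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware that aims to move data from any source to any destination. It helps users reliably, quickly, and stably synchronize massive data sets, either incrementally in near real time or in full on a schedule. It is primarily positioned to serve real-time transaction systems and can also be used for big-data synchronization (the ETL domain).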
Stars: ✭ 28 (-56.25%)
Mutual labels:  pipeline, etl
DataBridge.NET
Configurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-75%)
Mutual labels:  etl, etl-framework
lineage
Generate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (-75%)
Mutual labels:  pipeline, etl
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (-67.19%)
Mutual labels:  etl, etl-framework
Phila Airflow
Stars: ✭ 16 (-75%)
Mutual labels:  pipeline, etl
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+856.25%)
Mutual labels:  etl, etl-framework
etlflow
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.
Stars: ✭ 38 (-40.62%)
Mutual labels:  etl, etl-framework
Choetl
ETL Framework for .NET / c# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: ✭ 372 (+481.25%)
Mutual labels:  etl, etl-framework

Stetl - Streaming ETL

Stetl, streaming ETL, pronounced "staedl", is a lightweight ETL-framework for geospatial data conversion.

Notice: the Stetl GitHub repository now lives in the GeoPython GitHub organization.

License

Stetl is released under a GNU GPL v3 license (see LICENSE.txt).

Documentation

The Stetl website and documentation can be found via http://stetl.org. For a quick overview, read the 5-minute Stetl introduction or the more detailed presentation. Stetl has been presented at several events, including FOSS4G 2013 in Nottingham and GeoPython 2016.

Concepts

Stetl glues together existing parsing and transformation tools such as GDAL/OGR, Jinja2 and XSLT with custom Python code. It gains its speed by relying on native libraries such as libxml2 and libxslt (via Python lxml).

A configuration file in Python's .ini config format specifies a chained sequence of transformation steps: typically an Input connected to one or more Filters, and finally to an Output. At runtime, this sequence is instantiated and run as a linked series of Python objects. These objects are specified symbolically (by their module/class name) and parameterized in the config file. The transformation is executed with the stetl -c <config file> command.
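The idea of a config-driven chain can be sketched in a few lines of plain Python. This is a minimal illustration, not Stetl's implementation: the section names, the chains key, the | separator, the component classes (ListInput, UpperFilter, CollectOutput) and the REGISTRY lookup are all assumptions made up for this example. Stetl itself resolves components by module/class name; a dictionary is used here only to keep the sketch self-contained.

```python
import configparser

# Hypothetical config: an [etl] section names the chain, and each further
# section describes one component (its class plus parameters).
CONFIG_TEXT = """
[etl]
chains = source|upper|sink

[source]
class = ListInput
items = amsterdam,utrecht

[upper]
class = UpperFilter

[sink]
class = CollectOutput
"""


class ListInput:
    """Illustrative Input: emits the comma-separated 'items' parameter."""

    def __init__(self, params):
        self.items = params["items"].split(",")

    def process(self, stream):
        # An Input starts the chain, so the incoming stream is ignored.
        yield from self.items


class UpperFilter:
    """Illustrative Filter: upper-cases every item passing through."""

    def __init__(self, params):
        pass

    def process(self, stream):
        for item in stream:
            yield item.upper()


class CollectOutput:
    """Illustrative Output: stores every item it receives."""

    def __init__(self, params):
        self.collected = []

    def process(self, stream):
        for item in stream:
            self.collected.append(item)
            yield item


# Stand-in for a module/class lookup (an assumption of this sketch).
REGISTRY = {
    "ListInput": ListInput,
    "UpperFilter": UpperFilter,
    "CollectOutput": CollectOutput,
}


def build_chain(config_text):
    """Instantiate one component per name listed in the chain."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    components = []
    for name in parser["etl"]["chains"].split("|"):
        params = dict(parser[name])
        component_class = REGISTRY[params.pop("class")]
        components.append(component_class(params))
    return components


def run_chain(components):
    """Link the components so each consumes the previous one's stream."""
    stream = None
    for component in components:
        stream = component.process(stream)
    return list(stream)


components = build_chain(CONFIG_TEXT)
result = run_chain(components)
print(result)
```

Because every component is a generator, items flow through the whole chain one at a time rather than being materialized between steps.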

Stetl has been proven to handle tens of millions of GML objects without memory issues. This is achieved through a technique called "streaming and splitting". For example, the OgrPostgisInput module can generate a GML stream from the database. A component called the GmlSplitter can then split this stream into manageable chunks (for example, 20000 features each) and feed them into the rest of the ETL chain.
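The general "streaming and splitting" pattern can be illustrated with a generator that cuts any stream into fixed-size chunks. This is a sketch of the technique only, not the GmlSplitter's actual code: split_stream and feature_source are hypothetical names, and the string "features" stand in for real GML elements read from a database.

```python
from itertools import islice


def split_stream(features, chunk_size):
    """Yield successive lists of at most chunk_size items from an iterable."""
    iterator = iter(features)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def feature_source(count):
    """Hypothetical stand-in for a database-backed feature stream."""
    for index in range(count):
        yield f"<feature id='{index}'/>"


# Only one chunk is held in memory at a time, regardless of the total count.
sizes = [len(chunk) for chunk in split_stream(feature_source(50_000), 20_000)]
print(sizes)
```

Memory use is bounded by the chunk size rather than by the size of the dataset, which is what makes very large inputs tractable.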

Use Cases

Stetl has been found particularly useful for complex GML-related ETL cases, such as those found within EU INSPIRE Data Harmonization and the transformation of GML/XML-based national geo-datasets to, for example, PostGIS.

Most of the data conversions within the Dutch NLExtract Project apply Stetl.

Stetl also proved to be very effective in IoT-related transformations involving the SensorWeb/SOS.

Examples

Browse all examples under the examples dir. It is best to start with the basic examples.

Installation

Stetl can be installed from PyPI with pip install stetl and, more recently, as a Stetl Docker image. More on installation can be found in the documentation.

Contributing

Anyone and everyone is welcome to contribute. Please take a moment to review the guidelines for contributing.

Origins

Stetl originated in the INSPIRE-FOSS project (2009-2013, now archived). Since then, Stetl has evolved toward wider use, such as transforming Dutch GML-based Open Datasets like IMGEO/BGT (Large Scale Topography) and IMKAD/BRK (Cadastral Data), as well as Sensor Data Transformation and Calibration.

Finally

The word "stetl" is also an alternative spelling of "shtetl": http://en.wikipedia.org/wiki/Stetl : "...Material things were neither disdained nor extremely praised in the shtetl. Learning and education were the ultimate measures of worth in the eyes of the community, while money was secondary to status..."
