Panther: Detect threats with log data and improve cloud security posture
Stars: ✭ 885 (+1635.29%)
openrefine-batch: Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a Python client that communicates with the OpenRefine API.
Stars: ✭ 76 (+49.02%)
Jupyter Renderers: Renderers and renderer extensions for JupyterLab
Stars: ✭ 395 (+674.51%)
pg-bifrost: PostgreSQL logical replication tool that streams into Kinesis, S3 and RabbitMQ
Stars: ✭ 31 (-39.22%)
Polybooljs: Boolean operations on polygons (union, intersection, difference, xor)
Stars: ✭ 333 (+552.94%)
Transformalize: Configurable Extract, Transform, and Load
Stars: ✭ 125 (+145.1%)
Addax: Addax is an open source universal ETL tool that supports most RDBMS and NoSQL databases, helping you transfer data from any one place to another.
Stars: ✭ 615 (+1105.88%)
Geodata Br: GeoJSON files with the perimeters of Brazilian municipalities, by state (Brasil / Brazil)
Stars: ✭ 307 (+501.96%)
terraform-aws-s3-bucket: Terraform module that creates an S3 bucket with an optional IAM user for external CI/CD systems
Stars: ✭ 138 (+170.59%)
Alltheplaces: A set of spiders and scrapers to extract location information from places that post their location on the internet.
Stars: ✭ 277 (+443.14%)
openrefine-docker: OpenRefine is a free, open source power tool for working with messy data and improving it. This repository contains Dockerbuild files for automated builds.
Stars: ✭ 19 (-62.75%)
turf dart: A turf.js-like geospatial analysis library working with GeoJSON, written in pure Dart.
Stars: ✭ 14 (-72.55%)
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+4576.47%)
etl: M-Lab ingestion pipeline
Stars: ✭ 15 (-70.59%)
FlowMaster: ETL flow framework based on YAML configs in Python
Stars: ✭ 19 (-62.75%)
geojson-bbox: Calculates the extent/bbox of a given valid GeoJSON object.
Stars: ✭ 25 (-50.98%)
graphchain: ⚡️ An efficient cache for the execution of dask graphs.
Stars: ✭ 63 (+23.53%)
Mongo Es: A MongoDB to Elasticsearch connector
Stars: ✭ 185 (+262.75%)
Dswarm Backoffice Web: The backoffice web application of d:swarm (https://github.com/dswarm/dswarm-documentation/wiki)
Stars: ✭ 11 (-78.43%)
iex-stocks: ETL for the IEX Stocks API
Stars: ✭ 19 (-62.75%)
countriesNowAPI: CountriesNow is an open source API for retrieving geo-information for countries, including their states, cities, population, etc. 🌎
Stars: ✭ 78 (+52.94%)
TEAM: The Taxonomy for ETL Automation Metadata (TEAM) is a metadata management tool for data warehouse automation. It is part of the ecosystem for data warehouse automation, alongside the Virtual Data Warehouse pattern manager and the generic schema for Data Warehouse Automation.
Stars: ✭ 27 (-47.06%)
xyr: Query any data source using SQL; works with the local filesystem, S3, and more. It is intended as a very tiny and lightweight alternative to AWS Athena, Presto, etc.
Stars: ✭ 58 (+13.73%)
TomboloDigitalConnector: The Tombolo Digital Connector enables users to combine different sources of data in a transparent and reproducible way.
Stars: ✭ 56 (+9.8%)
koza: Data transformation framework for LinkML data models
Stars: ✭ 21 (-58.82%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, and SQL Server
Stars: ✭ 116 (+127.45%)
dddplus-archetype-demo: ♨️ Use dddplus-archetype to build a WMS (warehouse management middle platform) in 5 minutes.
Stars: ✭ 56 (+9.8%)
awesome-integration: A curated list of awesome system integration software and resources.
Stars: ✭ 117 (+129.41%)
Tuna: 🐟 A streaming ETL for fish
Stars: ✭ 11 (-78.43%)
neo4j-jdbc: JDBC driver for Neo4j
Stars: ✭ 110 (+115.69%)
lineage: Generate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (-68.63%)
Aws Ecs Airflow: Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (+109.8%)
link-move: A model-driven, dynamically configurable framework to acquire data from external sources and save it to your database.
Stars: ✭ 32 (-37.25%)
sparklanes: A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-66.67%)
s3-concat: Concatenate multiple files in S3
Stars: ✭ 35 (-31.37%)
Metl: Metl is a simple, web-based integration platform that allows for several different styles of data integration including messaging, file-based Extract/Transform/Load (ETL), and remote procedure invocation via Web Services. Read more at www.jumpmind.com/products/metl/overview
Stars: ✭ 185 (+262.75%)
Bandar Log: Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 19 (-62.75%)
dtd2mysql: MySQL / MariaDB import for DTD feeds (fares, timetable and routeing)
Stars: ✭ 25 (-50.98%)
openrouteservice-docs: 📝 This repository stores the swagger specifications of the openrouteservice API. Browse to swagger for a detailed overview.
Stars: ✭ 59 (+15.69%)
Csv2db: The CSV to database command line loader
Stars: ✭ 102 (+100%)
GpsPrune: GpsPrune is a map-based application for viewing, editing and converting coordinate data from GPS systems.
Stars: ✭ 46 (-9.8%)
open-geo-data-education: Open Geospatial Datasets for GIS Education: This is a repository of open geospatial datasets to be used in an educational context. I created these files over years of teaching Geographic Data Science and GIS. All original datasets are freely available online with open data licenses (see the dataset attribution for details). All the datasets in t…
Stars: ✭ 52 (+1.96%)
thain: Thain is a distributed flow scheduling platform.
Stars: ✭ 81 (+58.82%)
wikirepo: Python-based Wikidata framework for easy dataframe extraction
Stars: ✭ 33 (-35.29%)
etl: [READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+447.06%)
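DataX-src: DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, ODPS, and others.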
Stars: ✭ 21 (-58.82%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-60.78%)
cogj-spec: Cloud Optimized GeoJSON spec
Stars: ✭ 36 (-29.41%)