
CartoDB / carto-spatial-extension

Licence: other
A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift and Postgres with Spatial Analytics capabilities

Programming Languages

  • JavaScript
  • Python
  • PL/pgSQL
  • Makefile
  • Shell

Projects that are alternatives of or similar to carto-spatial-extension

dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-77.1%)
Mutual labels:  bigquery, snowflake, redshift
dbt-ml-preprocessing
A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros.
Stars: ✭ 128 (-2.29%)
Mutual labels:  bigquery, snowflake, redshift
Tbls
tbls is a CI-Friendly tool for documenting a database, written in Go.
Stars: ✭ 940 (+617.56%)
Mutual labels:  bigquery, snowflake, redshift
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+1687.79%)
Mutual labels:  bigquery, snowflake, redshift
tellery
Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
Stars: ✭ 219 (+67.18%)
Mutual labels:  bigquery, snowflake, redshift
Sql Runner
Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake
Stars: ✭ 68 (-48.09%)
Mutual labels:  bigquery, snowflake, redshift
starlake
Starlake is a Spark Based On Premise and Cloud ELT/ETL Framework for Batch & Stream Processing
Stars: ✭ 16 (-87.79%)
Mutual labels:  bigquery, snowflake, redshift
geoscript-py
A Python GeoScript Implementation
Stars: ✭ 52 (-60.31%)
Mutual labels:  geospatial, gis
python-for-gis-progression-path
Progression path for a GIS analyst who wants to become proficient in using Python for GIS: from apprentice to guru
Stars: ✭ 98 (-25.19%)
Mutual labels:  geospatial, gis
kart
Distributed version-control for geospatial and tabular data
Stars: ✭ 253 (+93.13%)
Mutual labels:  geospatial, gis
tidyUSDA
An interface to USDA Quick Stats data with mapping capabilities.
Stars: ✭ 36 (-72.52%)
Mutual labels:  geospatial, gis
turf-go
A Go language port of Turf.js
Stars: ✭ 41 (-68.7%)
Mutual labels:  geospatial, gis
go-kml
Package kml provides convenience methods for creating and writing KML documents.
Stars: ✭ 67 (-48.85%)
Mutual labels:  geospatial, gis
whiteboxgui
An interactive GUI for WhiteboxTools in a Jupyter-based environment
Stars: ✭ 94 (-28.24%)
Mutual labels:  geospatial, gis
eodag
Earth Observation Data Access Gateway
Stars: ✭ 183 (+39.69%)
Mutual labels:  geospatial, gis
pyturf
A modular geospatial engine written in python
Stars: ✭ 15 (-88.55%)
Mutual labels:  geospatial, gis
leafmap
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
Stars: ✭ 1,299 (+891.6%)
Mutual labels:  geospatial, gis
georaster-layer-for-leaflet
Display GeoTIFFs and soon other types of raster on your Leaflet Map
Stars: ✭ 168 (+28.24%)
Mutual labels:  geospatial, gis
earthengine-py-examples
A collection of 300+ examples for using Earth Engine and the geemap Python package
Stars: ✭ 76 (-41.98%)
Mutual labels:  geospatial, gis
mapmint
Fast and easy webmapping.
Stars: ✭ 51 (-61.07%)
Mutual labels:  geospatial, gis

CARTO Analytics Toolbox Core

The CARTO Analytics Toolbox is a set of UDFs and Stored Procedures to unlock Spatial Analytics. It is organized into modules based on the functionality they offer. This toolbox is cloud-native, which means it is available for different data warehouses: BigQuery, Snowflake, and Redshift. It is built on top of the data warehouse's GIS features, extending and complementing this functionality.

This repo contains the core open-source modules of the toolbox. CARTO offers a set of premium modules that are available for CARTO users.

Documentation

Cloud Documentation
BigQuery https://docs.carto.com/analytics-toolbox-bigquery
Snowflake https://docs.carto.com/analytics-toolbox-snowflake
Redshift https://docs.carto.com/analytics-toolbox-redshift

Development

The repo contains the implementation of the toolbox for all the clouds. The functions are organized in modules. Each module has the following structure:

  • doc: contains the SQL reference of the functions
  • lib: contains the library code (JavaScript/Python)
  • sql: contains the function's code (SQL)
  • test: contains both the unit and integration tests

Inside a module, you can run the following commands. These commands are available for any {module}/{cloud} combination. For example, you can enter a module with cd modules/accessors/bigquery and then run any of the following:

  • make help: shows the commands available in the Makefile.
  • make lint: runs a linter (using eslint or flake8).
  • make lint-fix: runs a linter (using eslint or flake8) and fixes the trivial issues.
  • make build: builds the bundles (using rollup or zip).
  • make deploy: builds the bundles and deploys the SQL functions to the data warehouse using the env variables.
  • make test-unit: builds the bundles and runs the unit tests for these bundles (using jest or pytest).
  • make test-integration: runs just the integration tests (using jest or pytest). It performs requests to real data warehouses.
  • make test-integration-full: runs the full-path integration tests (using jest or pytest). It is equivalent to deploy + test-integration + clean-deploy.
  • make clean: removes the installed dependencies and generated files.
  • make clean-deploy: removes all the assets, functions, procedures, and tables uploaded during the deploy.

These commands can be used with all the modules at once from the root folder. For example, make deploy CLOUD=bigquery.
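Taken together, a typical development loop for a single module might look like the following (the module path is just an example; the commands are those listed above):

```
cd modules/accessors/bigquery
make lint          # check the code style
make test-unit     # build the bundles and run the unit tests
make deploy        # upload the bundles and SQL functions using the .env variables
make clean-deploy  # remove everything uploaded by the deploy
```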

Additionally, a tool has been developed to generate code templates for new modules and functions.

BigQuery

The Analytics Toolbox for BigQuery contains SQL functions and JavaScript libraries. The functions are deployed in a dataset called carto inside a specific project. In BigQuery, datasets are associated with a region, so the functions can only be used with tables stored in datasets in the same region. The JavaScript libraries are deployed in a Google Cloud Storage bucket and referenced by the functions.
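Once deployed, a function is called by qualifying it with the project and the carto dataset. A sketch, using a placeholder project name (check the SQL reference for the actual function names):

```sql
SELECT `your-bigquery-project.carto`.VERSION_CORE();
```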

Tools

Make sure you have installed the following tools:

Environment variables

The .env file contains the variables required to deploy and run the toolbox.

# BigQuery
BQ_PROJECT=your-bigquery-project
BQ_BUCKET=gs://your-gcs-bucket/
BQ_REGION=your-region
BQ_DATASET_PREFIX=
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service/account/or/adc.json

Note: you may need to run gcloud auth application-default login to generate the adc.json (application default credentials) file.
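The .env format above is plain KEY=VALUE lines. As an illustration of what a deploy script has to check before running, here is a minimal, hypothetical validator; the variable names come from the listing above, but the parsing code is not part of the toolbox:

```python
# Required variables for a BigQuery deploy, per the .env listing above.
REQUIRED_BQ_VARS = [
    "BQ_PROJECT",
    "BQ_BUCKET",
    "BQ_REGION",
    "GOOGLE_APPLICATION_CREDENTIALS",
]

def parse_env(text):
    """Parse KEY=VALUE lines, skipping comments and blank lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_vars(env, required=REQUIRED_BQ_VARS):
    """Return the required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]

sample = """
# BigQuery
BQ_PROJECT=your-bigquery-project
BQ_BUCKET=gs://your-gcs-bucket/
BQ_REGION=your-region
BQ_DATASET_PREFIX=
"""

env = parse_env(sample)
print(missing_vars(env))  # → ['GOOGLE_APPLICATION_CREDENTIALS']
```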

Snowflake

The Analytics Toolbox for Snowflake contains SQL functions and JavaScript libraries. The functions are deployed in a schema called carto inside a specific database. The JavaScript libraries are deployed inside the SQL functions. In Snowflake, the functions can be used with tables of any database in the same account.
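A sketch of a call in Snowflake, qualified with a placeholder database and the carto schema (again, see the SQL reference for the actual function names):

```sql
SELECT your_database.carto.VERSION_CORE();
```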

Tools

Make sure you have installed the following tools:

Environment variables

The .env file contains the variables required to deploy and run the toolbox.

# Snowflake
SF_ACCOUNT=your-snowflake-account
SF_DATABASE=your-snowflake-database
SF_SCHEMA_PREFIX=
SF_USER=your-snowflake-user
SF_PASSWORD=your-snowflake-password
SF_SHARE_PREFIX=
SF_SHARE_ENABLED=0

Redshift

The Analytics Toolbox for Redshift contains SQL functions and Python libraries. The functions are deployed in a schema called carto inside a specific database. The Python libraries are installed in the Redshift cluster, so they can be used by all the databases in the cluster. An S3 bucket is required as intermediate storage to create the libraries. In Redshift, the functions can be used with tables of the same database, but different schemas.

Note: Redshift UDFs only support Python 2, but the Python Redshift connector is only available for Python 3. Therefore, both Python versions are required to develop the toolbox.
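Because UDF bodies must run on Python 2 while local tooling runs on Python 3, library code is in practice written in the common subset of both. A hypothetical example of what such a function body can look like (the function itself is illustrative, not taken from the toolbox):

```python
# Written to run unchanged on both Python 2 and Python 3, as a
# Redshift UDF body would need to; the __future__ imports make
# division and print behave the same under both interpreters.
from __future__ import division, print_function

def clamp_longitude(lng):
    """Wrap a longitude into the [-180, 180] range."""
    return ((lng + 180.0) % 360.0) - 180.0

print(clamp_longitude(190.0))  # → -170.0
```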

Tools

Make sure you have installed the following tools:

Environment variables

The .env file contains the variables required to deploy and run the toolbox.

# Redshift
RS_HOST=your-redshift-host-url.redshift.amazonaws.com
RS_CLUSTER_ID=your-redshift-cluster
RS_REGION=your-region
RS_BUCKET=s3://your-s3-bucket/
RS_DATABASE=your-redshift-database
RS_SCHEMA_PREFIX=
RS_USER=your-redshift-user
RS_PASSWORD=your-redshift-password
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-access-key

Contribute

This project is public. We are more than happy to receive feedback and contributions. Feel free to open an issue to report a bug, ask a question, or start a discussion, or open a pull request with a fix or a new feature.
