Chicago / Open Data Etl Utility Kit

Licence: other
Use Pentaho's open source data integration tool (Kettle) to create Extract-Transform-Load (ETL) processes to update a Socrata open data portal. Documentation is available at http://open-data-etl-utility-kit.readthedocs.io/en/stable

ETL Utilities for an Open Data Program

This toolkit provides several utilities and a framework to help governments deploy automated ETLs using the open-source Pentaho Data Integration (Kettle) software.

Specifically, the toolkit:

  • Loads data from a database and uploads it to a Socrata data portal
  • Integrates with an SMTP server to send administrators e-mail alerts on the outcome of ETL scripts
  • Handles deployment issues that arise when multiple operating systems are used during development
  • Provides utilities that let administrators quickly analyze ETL log files for diagnostics

The ETL framework is organized so that each function is defined in a single file shared by all ETLs. This simplifies maintenance, upgrades, and modification across hundreds of ETLs.

Features

  • Open source at the core - this framework can be deployed using Kettle, an open-source ETL tool. With an annual support subscription, Pentaho also provides telephone support and training if desired.
  • Compatible with multiple data sources - this ETL framework can be used with a variety of data sources, including a range of databases (MySQL, PostgreSQL, Oracle, SQL Server, and a variety of NoSQL databases), APIs, text files, etc.
  • Compatible workflow for multiple operating systems - ETLs can be developed and deployed across multiple operating systems. For example, ETLs can be developed in a Windows environment and deployed on Linux.
  • Helpful utilities - includes several scripts to help users quickly analyze log files
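
To give a sense of what a log-analysis helper might look like, here is a minimal sketch. It is not one of the toolkit's included scripts. The function name `summarize_etl_logs`, the `*.log` file suffix, and the assumption that failed steps write lines containing `ERROR` are all illustrative; adjust them to match your own Kettle logging setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of the toolkit): print the number of
# lines containing "ERROR" for each log file in a directory.
# Assumptions: logs end in ".log" and errors are marked with "ERROR".
summarize_etl_logs() {
    log_dir="$1"
    for log_file in "$log_dir"/*.log; do
        # Skip the unexpanded pattern when the directory holds no logs.
        [ -e "$log_file" ] || continue
        # grep -c prints the count of matching lines; "|| true" keeps a
        # zero-match result from being treated as a failure.
        error_count=$(grep -c 'ERROR' "$log_file" || true)
        # One tab-separated line per file: name, then error count.
        printf '%s\t%s\n' "$(basename "$log_file")" "$error_count"
    done
}
```

Calling `summarize_etl_logs /path/to/etl/logs` (a placeholder path) prints one line per log file, so ETLs with a non-zero count stand out at a glance.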

Requirements

The recommended configuration requires the following software:

  • Kettle (or Pentaho) data integration - Note: This framework has only been tested with Kettle 4.3.0 and 4.4.0.
  • Java 1.6 or higher
  • DataSync (for use with Socrata) - Note: This framework is designed for the version of DataSync in the DataSync directory and will not necessarily work with earlier or later versions.
  • MacOS X, Linux, or Unix (only required for full automation with included scripts)
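
The Java requirement above can be checked before running any ETL. The sketch below is an illustration, not a script shipped with the toolkit: `java_version_ok` is a hypothetical helper that takes a version string and succeeds when it is 1.6 or higher. It assumes the two common version formats, the legacy `1.x` scheme (for example `1.6.0_45`) and the newer scheme where the first number is the release (for example `11.0.2`).

```shell
#!/bin/sh
# Hypothetical helper (not part of the toolkit): return success when the
# given Java version string is 1.6 or higher.
java_version_ok() {
    version="$1"
    # Leading component before the first dot, digits only ("17-ea" -> "17").
    major=${version%%.*}
    major=${major%%[!0-9]*}
    [ -n "$major" ] || return 1
    # Newer scheme: 9, 11.0.2, 17, ... are all above 1.6.
    if [ "$major" -gt 1 ]; then
        return 0
    fi
    [ "$major" -eq 1 ] || return 1
    # Legacy scheme: 1.<minor>...; require minor >= 6.
    rest=${version#*.}
    minor=${rest%%[!0-9]*}
    [ -n "$minor" ] && [ "$minor" -ge 6 ]
}

# Example usage, assuming `java -version` prints a line such as
#   java version "1.6.0_45"
# to standard error:
#   v=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
#   java_version_ok "$v" || echo "Java 1.6 or higher is required" >&2
```

Keeping the parsing separate from the call to `java` makes the check easy to try on machines where Java is not yet installed.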

Kettle Compatibility

This framework has only been tested with Kettle 4.3.0 and Kettle 4.4.0. It may be fully compatible with Kettle 5.x, but this has not been tested. If you would like to contribute, please see the issue page.

Errors / Bugs

Experiencing issues with the included files? Report them on our issue tracker.