Chicago / Food Inspections Evaluation

Licence: other

This repository contains the code to generate predictions of critical violations at food establishments in Chicago. It also contains the results of an evaluation of the effectiveness of those predictions.

Labels

html data-science open-data open-science

Food Inspections Evaluation

This is our model for predicting which food establishments are most at risk for the types of violations most likely to spread food-borne illness. Chicago Department of Public Health staff use these predictions to prioritize inspections. During a two-month pilot period, we found that using these predictions meant that inspectors found critical violations much faster.

You can help improve the health of our city by improving this model. This repository contains a training and test set, along with the data used in the current model.

Feel free to clone, fork, send pull requests and to file bugs. Please note that we will need you to agree to our Contributor License Agreement (CLA) in order to be able to use any pull requests.

Original Analysis and Reports

In an effort to reduce the public’s exposure to foodborne illness, the City of Chicago partnered with Allstate’s Quantitative Research & Analytics department to develop a predictive model to help prioritize the city's food inspection staff. This GitHub project is a complete working evaluation of the model, including the data that was used in the model, the code that was used to produce the statistical results, the evaluation of the validity of the results, and documentation of our methodology.

The model evaluation calculates individualized risk scores for more than ten thousand Chicagoland food establishments using publicly available data, most of which is updated nightly on Chicago’s data portal. The sole exception is information about the inspectors.

The evaluation compares two months of Chicago’s Department of Public Health inspections to an alternative data-driven approach based on the model. The two-month evaluation period is a completely out-of-sample evaluation based on a model created using test and training data sets from prior time periods.

The reports may be reproduced by compiling the knitr documents present in ./REPORTS.
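As a rough sketch of that step, the snippet below collects the knitr documents in the reports directory and compiles each one. The helper name `find_reports` and the `.Rmd` extension are assumptions made for illustration, not something this README specifies, and the working directory is assumed to be the repository root.

```r
# Hypothetical helper: list knitr source files in a reports directory.
# The ".Rmd" extension is an assumption; adjust the pattern if the
# documents use another extension such as ".Rnw".
find_reports <- function(dir = "REPORTS", pattern = "\\.Rmd$") {
  list.files(dir, pattern = pattern, full.names = TRUE)
}

reports <- find_reports()

# Compile each document, but only if knitr is installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  for (report in reports) {
    knitr::knit(report)
  }
}
```

If the directory does not exist or contains no matching files, `reports` is empty and nothing is compiled.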

REQUIREMENTS

All of the code in this project uses the open source statistical application, R. We advise that you use R version >= 3.1 for best results.
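A quick way to check the running R version against that recommendation might look like the following. The variable names are illustrative only.

```r
# Minimum version taken from the recommendation above (R >= 3.1).
min_version <- "3.1.0"

# getRversion() returns the running R version as a comparable object.
version_ok <- getRversion() >= min_version

if (!version_ok) {
  warning("R ", format(getRversion()),
          " is older than the recommended ", min_version)
}
```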

Ubuntu users may need to install libssl-dev, libcurl4-gnutls-dev, and libxml2-dev. This can be accomplished by typing the following command at the command line: `sudo apt-get install libssl-dev libcurl4-gnutls-dev libxml2-dev`

The code makes extensive use of the data.table package. If you are not familiar with the package, you might want to consult the data.table [FAQ available on CRAN](http://cran.r-project.org/web/packages/data.table/vignettes/datatable-faq.pdf).

FILE LAYOUT

The following directory structure is used:

| DIRECTORY | DESCRIPTION |
|---|---|
| `.` | Project files such as README and LICENSE |
| `./CODE/` | Sequential scripts used to develop the model |
| `./CODE/functions/` | General function definitions, which could be used in any script |
| `./DATA/` | Data files created by scripts in `./CODE/`, or static |
| `./REPORTS/` | Reports and other output |

We have included all of the steps used to develop the model, evaluate the results, and document the results in the above directory structure.

The scripts located in the ./CODE/ folder are organized sequentially, meaning that the numeric prefix indicates the order in which the script was / should be run in order to reproduce our results.

Although we include all the necessary steps to download and transform the data used in the model, we also have stored a snapshot of the data in the repository. So, to run the model as it stands, it is only necessary to download the repository, install the dependencies, and step through the code in CODE/30_glmnet_model.R. If you do not already have them, the dependencies can be installed using the startup script CODE/00_Startup.R.
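For readers who want to see which dependencies are missing before running anything, a minimal sketch follows. The helper `missing_packages` is hypothetical, and the package list is an assumption inferred from names mentioned in this README (data.table, knitr, and glmnet from the script name). CODE/00_Startup.R remains the authoritative source for the real dependencies.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Assumed package list, for illustration only; CODE/00_Startup.R is the
# authoritative source for the real dependencies.
needed <- c("data.table", "glmnet", "knitr")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install from CRAN:
  # install.packages(to_install)
}
```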

DATA

Data used to develop the model is stored in the ./DATA directory. Most of it comes from Chicago’s Open Data Portal. The following datasets were used to build the analysis-ready dataset.

Business Licenses
Food Inspections 
Crime
Garbage Cart Complaints
Sanitation Complaints
Weather
Sanitarian Information

The data sources are joined to create a tabular dataset that paints a statistical picture of a ‘business license’, the primary modelling unit / unit of observation in this project.

The data sources are joined (in a SQL-like manner) on appropriate composite keys. These keys include Inspection ID, Business License, and Geography expressed as a Latitude / Longitude combination, among others.
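To illustrate what joining on a composite key means here, the sketch below performs a left join on a latitude / longitude pair. It uses base R `merge()` to stay dependency-free, whereas the project code itself relies on data.table. All table names, column names and values are invented for illustration and do not come from the project data.

```r
# Toy data; all values and column names are invented for illustration.
inspections <- data.frame(
  inspection_id = c(101, 102, 103, 104),
  license       = c(1, 2, 2, 3),
  latitude      = c(41.88, 41.90, 41.90, 41.95),
  longitude     = c(-87.63, -87.65, -87.65, -87.70)
)
complaints <- data.frame(
  latitude   = c(41.88, 41.90),
  longitude  = c(-87.63, -87.65),
  complaints = c(4, 7)
)

# Left join on the composite key (latitude, longitude): every inspection
# is kept, and matching complaint counts are attached. Inspections with
# no matching location get NA in the complaints column.
joined <- merge(inspections, complaints,
                by = c("latitude", "longitude"), all.x = TRUE)
```

Using `all.x = TRUE` keeps inspections that have no matching complaint record, which mirrors a SQL `LEFT JOIN`.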

Acknowledgements

This research was conducted by the City of Chicago with support from the Civic Consulting Alliance and Allstate Insurance. The City would especially like to thank Stephen Collins, Gavin Smart, Ben Albright, and David Crippin for their efforts in developing the predictive model. We also appreciate the help of Kelsey Burr, Christian Hines, and Kiran Pookote in coordinating this research project. We owe special thanks to our volunteers from Allstate, who put in a tremendous effort to develop the predictive model, and to Allstate for allowing its team to volunteer for projects to change their city. This project was partially funded by an award from the Bloomberg Philanthropies' Mayors Challenge.
