jsfenfen / covid_hospitals_demographics

Licence: MIT License
COVID-19 relevant data on hospital location / capacity, nursing home location / capacity, county demographics

Hospital, population, nursing center data

This repository is a project to present and join datasets pertinent to the COVID-19 pandemic at the county level: hospital location and capacity, nursing home location and capacity, and county-level population estimates by age.

A simplified state-level view of this with only population breakouts for 65+ is available here. The source data for statewide hospital beds is here.

In general this repo tries to follow the datakit repo convention; it isn't actually a datakit repo, but it may become one at some point. Source data is in /data/source/ and processed results are in /data/processed/.

Data

The main output files are described below. In general the output files are in /data/processed/.

April 3 update: With help from CMS we've figured out how to get the names of odd-numbered medical units, so the 0X99 lines are now broken out into adult and infped (meaning infant or pediatric) beds. The determination of which bed units are which is visible at a hospital-by-hospital level in extra_line_units.csv.

Hospital-level bed data

CSV: hospital_data.csv. Shapefile (with a subset of columns): hosp_geo_final.zip.

Fast answer: COVID ready ICU beds?

What's a COVID-ready ICU bed? We can't really say. Hospitals are full of creative, brilliant folks who are hard at work ramping up capacity to care for patients with COVID. This data is to help understand the health system's prior baseline operations, as reported to CMS.

One way of thinking about this could be all available emergency beds, roughly arrived at by subtotal_acute_beds_1400 - acute_beds_0700. Again, we'd caution users not to assume too much, as hospitals have difficult decisions to make, room by room, about separating out COVID patients while maintaining the capability to treat non-COVID injuries. It may be better to think of a systemwide response rather than focusing too much on any particular hospital. The medical system is complex, and shortages of PPE, ventilators, or staff to operate them may all play a larger role than beds.
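As a rough illustration of that subtraction, here is a minimal Python sketch. The column names come from hospital_data.csv as described in this README; the helper names, the example values, and the choice to treat blank cells as zero are assumptions made for the example, not part of the repo.

```python
def to_int(value):
    """Parse a CSV cell; blank or missing cells count as zero (an assumption)."""
    return int(float(value)) if value not in (None, "") else 0


def rough_emergency_beds(row):
    """Acute-care subtotal minus general adult/pediatric acute beds."""
    return to_int(row["subtotal_acute_beds_1400"]) - to_int(row["acute_beds_0700"])


# Illustrative values only, not taken from any real hospital.
example_row = {"subtotal_acute_beds_1400": "120", "acute_beds_0700": "90"}
print(rough_emergency_beds(example_row))
```

The result is a baseline count of reported specialty beds, not a measure of what a hospital can actually staff or convert.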

Because the lines affected do not directly imply payment, they are possibly more prone to error. If you are concerned with a figure in a particular cost report, consider consulting reports from earlier periods. A file of all reports with extracted data from 2016 through 2019 is available here.

Background

The hospital bed counts data come from the raw CMS cost reports database here. They are upwards of 600MB unzipped, so they aren't included in this repo. A simple introduction to how they are structured is here.

They don't have header rows; if you want to process them, you have to add your own.

For the NMRC file I used

RPT_REC_NUM,WKSHT_CD,LINE_NUM,CLMN_NUM,ITM_VAL_NUM

For the RPT file I used

RPT_REC_NUM,PRVDR_CTRL_TYPE_CD,PRVDR_NUM,Unknown,RPT_STUS_CD,FY_BGN_DATE,FY_END_DATE,PROC_DT,INITL_RPT_SW,LAST_RPT_SW,TRNSMTL_NUM,FI_NUM,ADR_VNDR_CD,FI_CREAT_DT,UTIL_CD,NPR_DT,SPEC_IND,FI_RCPT_DT
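One way to attach these headers when reading the files is sketched below using Python's standard csv module. The function name is hypothetical, and the single sample row is built from the example values used later in this README (report 648741, line 01201) rather than read from the real file.

```python
import csv
import io

# Header used for the NMRC file, as listed above.
NMRC_HEADER = ["RPT_REC_NUM", "WKSHT_CD", "LINE_NUM", "CLMN_NUM", "ITM_VAL_NUM"]


def read_headerless_csv(handle, header):
    """Return an iterator of dicts keyed by the supplied column names."""
    return csv.DictReader(handle, fieldnames=header)


# In-memory stand-in for the real, much larger file.
sample = io.StringIO("648741,S300001,01201,00100,35.01\n")
rows = list(read_headerless_csv(sample, NMRC_HEADER))
print(rows[0]["LINE_NUM"], rows[0]["ITM_VAL_NUM"])
```

To read an actual file, pass an open file handle instead of the StringIO object. Values come back as strings, which keeps the leading zeros in LINE_NUM and CLMN_NUM intact.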

These files have basic hospital information and bed counts from the most recently filed hospital cost report received in 2017 or later. The source report number, fiscal year end date, and filing date are also included. These come from page 9 column 2 of this original form from 2017.

The documentation is a little hard to follow; see the instructions for completing this form on p. 62 here. It refers to 42 CFR 412.105(b), which may be relevant. It also cites 69 FR 49093-49098 (August 11, 2004). In general, more documentation for the cost reports is here.

There's an awesome Python notebook written by Erin Petenko that makes it a little clearer how to navigate this data.

Each of the hospital bed lines corresponds directly to a line in column 2 of Worksheet S-3 (except for the ICU total).

A numeric breakdown of the minor lines, which do not end in '00' and are summed in their respective ext_NN99 variables, is available here.

Major lines versus subscripted lines

CMS documentation describes major lines in Worksheet S-3, Part I; they end in '00'. For instance, ICU beds in theory are given by line '00800'. However, CMS appears to tolerate "subscripted" lines with values other than '00800'. There's no standard as to what these mean; some hospitals might use 801 to mean neonatal intensive care unit beds, whereas others might use it to mean pediatric intensive care unit beds. If you're interested in what each nonstandard unit means, these are listed in extra_line_units.csv, with the unit name given in the unit_name column. There's also a unit_type column that is our best guess of whether a unit is an infant/neonatal bed (listed as "NEO") or a pediatric bed ("PED").

To simplify summing beds, we've summed all additional units into 00N99 rows, broken out by adult units and pediatric or infant units. In other words, 00800 is listed as icu_beds_0800, but any other adult units matching 008\d\d will be summed into extra_beds_0899_adult and any infant or pediatric units will be summed into extra_beds_0899_infped.
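The roll-up described above can be sketched in Python roughly as follows. This is not the repo's actual processing code: the function names, the tuple layout, and the rule that an empty unit_type means an adult unit are assumptions made for illustration.

```python
import re
from collections import defaultdict


def summary_line(line_num):
    """Map a 5-digit line number to the row it is summed into.

    Major lines (ending in '00') keep their own number; subscripted
    lines such as '00801' roll up into the matching 'NNN99' row.
    """
    if not re.fullmatch(r"\d{5}", line_num):
        raise ValueError(f"expected a 5-digit line number, got {line_num!r}")
    return line_num if line_num.endswith("00") else line_num[:3] + "99"


def roll_up(units):
    """Sum beds per summary row, splitting subscripted lines by unit type.

    units: iterable of (line_num, beds, unit_type) tuples, where unit_type
    is 'NEO', 'PED', or '' for adult units (the '' convention is assumed).
    """
    totals = defaultdict(int)
    for line_num, beds, unit_type in units:
        target = summary_line(line_num)
        if target != line_num:
            target += "_infped" if unit_type in ("NEO", "PED") else "_adult"
        totals[target] += beds
    return dict(totals)


# Illustrative bed counts only.
example = [("00800", 20, ""), ("00801", 6, "NEO"), ("00802", 4, ""), ("00803", 5, "PED")]
print(roll_up(example))
```

Keeping line numbers as strings matters here, since the leading zeros are part of the line identifier.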

Bed utilization

CMS requires hospitals to report overall bed utilization in the form of days for the same lines as beds. The same format of summation is used: 00800 for ICU beds listed as 00800, and 00899 for the sum of 00801, 00802, 00803, etc.

Bed utilization is given for all_adult_icu_beds and subtotal_acute_beds. It's a percentage of days_in_period that the beds were full for each reported line.

Observation bed days are not used in utilization calculations.
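A minimal sketch of how such a percentage could be computed is shown below. The formula (bed days divided by beds times days in the reporting period) is our reading of the description above, and the function name and example numbers are hypothetical.

```python
def utilization_pct(bed_days, beds, days_in_period):
    """Percent of available bed-days that were occupied.

    Returns None when beds or days_in_period is zero or negative, since
    no rate can be computed (for example, for hospitals with no bed counts).
    """
    if beds <= 0 or days_in_period <= 0:
        return None
    return 100.0 * bed_days / (beds * days_in_period)


# Illustrative: 20 ICU beds over a 365-day period with 5,475 occupied bed-days.
print(utilization_pct(5475, 20, 365))
```

Returning None rather than zero for missing bed counts avoids confusing "no data" with "empty beds".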

CMS line numbers to column names

Here are the bed lines used; each entry begins with the variable name. In general, an appended _XXXX means that the variable appears on line XXXX in the original report, except for XX99 lines, which are summations of "other" values accepted for this bed type.

  • acute_beds_0700 All Adult/Pediatric Acute Care Beds

  • icu_beds_0800 Intensive Care Beds

  • extra_0899 All other intensive care beds, including lines '00801','00802','00803','00804','00805','00806','00807','00808','00810','00820','00830','00850'

  • coronary_beds_0900 Coronary Care Beds

  • extra_0999 All other Coronary Care Beds including '00901','00902','00903'

  • burn_beds_1000 Burn Intensive Care Units

  • extra_1099 Other Burn Intensive Care Units including '01001','01002','01003','01004'

  • surg_icu_beds_01100 Surgical ICU Beds

  • extra_1199 Other surgical ICU beds including '01101','01102','01103','01104','01105','01106','01107','01110'

  • oth_spec_beds_1200 Other Specialty Beds

  • extra_1299 Other specialty units including '01201','01202','01203','01204','01205','01206','01210'

The sum of the above lines is given by:

  • subtotal_acute_beds_1400 Subtotal of acute care beds 01400

To make working with ICU data easier there's also a column of all 08XX lines:

  • all_icu_beds A summation of icu_beds_0800 and extra_0899. This is used to calculate total ICU utilization. Be skeptical of ICU bed numbers that rely heavily on units described in extra_0899; often a fraction of these are intended for children or infants and won't be useful for adults.

Additional hospital bed types (not acute care beds)

  • subprovider_ipf_beds_1600 Subprovider Inpatient Psychiatric beds
  • subprovider_irf_beds_1700 Subprovider Inpatient Rehabilitation beds
  • subprovider_oth_beds_1800 Subprovider Inpatient Other beds
  • skilled_nursing_beds_1900 Skilled nursing beds
  • nursing_fac_beds_2000 Nursing Facility beds
  • oth_longterm_beds_2100 Other Longterm beds
  • hospice_beds_2400 Hospice beds

The sum of subtotal_acute_beds and all additional hospital bed types is given by

  • all_beds_2700 All Beds

  • labor_delivery_beds_3200 Labor and Delivery Beds. These are not included in 02700 "All Beds," per CMS rules.

Military hospitals with an id ending in F are missing bed counts but are included here anyway. Many children's hospitals (e.g. hospital_type = childrens) do not report bed counts. Psychiatric hospitals are not included. Recently opened facilities that have not yet filed CMS reports also show zero bed counts.

(The shapefile leaves out one hospital in Puerto Rico.)

The hospital's provider number should correspond to the provider number in the next file.

Bed days

For each of the bed types described above, there is also a corresponding _bed_days variable. For example, icu_bed_days_0800 corresponds to icu_beds_0800, and extra_days_0899 corresponds to the number of beds given in extra_0899.

Bed utilization

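Utilization rate, as a percent, is calculated for all_icu_beds and for subtotal_acute_beds. To estimate the average number of occupied ICU beds, you could use all_icu_beds * all_icu_utilization / 100; the spare ICU bed capacity is then roughly all_icu_beds * (100 - all_icu_utilization) / 100.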
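The arithmetic can be sketched in Python as follows. Multiplying beds by the utilization rate approximates the average number of occupied beds, and the remainder approximates spare capacity. The function name and the example figures are hypothetical.

```python
def occupied_and_spare(beds, utilization_pct):
    """Split a bed count into average occupied and average spare beds."""
    occupied = beds * utilization_pct / 100
    return occupied, beds - occupied


# Illustrative: 40 ICU beds (all_icu_beds) at 70 percent utilization.
occupied, spare = occupied_and_spare(40, 70)
print(occupied, spare)
```

These are averages over a reporting period, so they say nothing about peak demand on any given day.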

Employment

The total number of residents/interns and overall payroll employees, taken from line 27, is listed as well.

Additional data files

If you are curious as to what the other bed units used by each hospital are, you can look at the extra line units file. It gives the actual line number used by the hospitals in LINE_NUM. These line designations do not have a consistent meaning--one hospital may use 801 to refer to pediatric ICU beds while another may use it to refer to neonatal ICU beds.

Matching non-standard lines to column labels

The cost unit number associated with a collection of hospital beds is given in column number 100. Using the example of cost report 648741, which lists nonstandard beds in line 1201, we can determine the cost unit number with a query like this:

select "ITM_VAL_NUM" from cost_reports_nmrc where "WKSHT_CD" = 'S300001' and "RPT_REC_NUM" = 648741 and "LINE_NUM" = '01201' and "CLMN_NUM" = '00100' limit 100;

ITM_VAL_NUM
-------------
35.01

Converting this value to 03501, we can then look in "worksheet" A000000 of the "alphanumeric" public release file, using column number '00000' and the unit number from before as the line number.

 select "ITM_VAL_NUM" from cost_reports_alpha where "RPT_REC_NUM" = 648741 and "WKSHT_CD" = 'A000000' and "CLMN_NUM" = '00000' and "LINE_NUM" = '03501';
ITM_VAL_NUM
------------------------------------
02080PEDIATRIC INTENSIVE CARE UNIT 

Dropping the first five digits, we know these beds are associated with the hospital's "PEDIATRIC INTENSIVE CARE UNIT".
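The two string conversions in this walkthrough can be sketched in Python as below. The function names are hypothetical, and the padding rule (whole part zero-padded to three digits, followed by a two-digit subscript) is inferred from the single 35.01 to 03501 example, so treat it as an assumption.

```python
def cost_center_line(itm_val_num):
    """Turn a numeric cost unit value such as '35.01' into a line number '03501'."""
    whole, _, frac = str(itm_val_num).partition(".")
    # Pad the whole part to three digits and the subscript to two digits.
    return whole.zfill(3) + frac.ljust(2, "0")[:2]


def unit_name(alpha_value):
    """Drop the five-character prefix and any surrounding whitespace."""
    return alpha_value[5:].strip()


print(cost_center_line("35.01"))
print(unit_name("02080PEDIATRIC INTENSIVE CARE UNIT "))
```

Passing the value as a string, as it appears in the query output, avoids floating-point formatting surprises.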

Census data by county by age

Downloadable CSV file; shapefile

County-level population age data comes from the Annual Estimates of the Resident Population for Selected Age Groups by Sex for the United States, States: April 1, 2010 to July 1, 2018 from the 2018 Population Estimates.

Population estimates are given in 5-year age ranges, e.g. 70-74.

Geocoded nursing home locations

CSV File: nh_gen_info_geocoded_final.csv

Source: CMS's Nursing Home Compare data. It contains the number of certified beds and average daily occupancy, among other variables. Latitudes and longitudes that were missing were added using Google's geocoder; where this occurred, the column geocode_flag = 1. The geocode_accuracy field uses Google's terminology.

Full documentation of the source file is here.

Stories

If you're able to use this in your work, or have relevant data to add, please let us know.

Portland Tribune, 3/14 "As number of virus cases grows, Oregon has lowest hospital bed rate in U.S.".

Monterey Weekly, 3/25 "Long before coronavirus, local nursing homes were struggling with infection control rules"

Industry Dive, 3/30 "How hospital capacity varies dramatically across the country"

Suggested reading:

"COVID-19 story recipe: Analyzing nursing home data for infection-control problems", Source, Mike Stucka, 3/16/20

Contributors

Jacob Fenton, PublicAccountability.org; Erin Petenko, VTDigger; Justin Mayo, Big Local News. Additional resources are available on the Big Local News platform.

License and Attribution

This data is freely available for public use.
