All Projects → mggg → Maup

mggg / Maup

Licence: mit
The geospatial toolkit for redistricting data.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Maup

Centerline
Calculate the polygon's centerline
Stars: ✭ 94 (+168.57%)
Mutual labels:  gis, shapefile
Mapshaper
Tools for editing Shapefile, GeoJSON, TopoJSON and CSV files
Stars: ✭ 2,813 (+7937.14%)
Mutual labels:  gis, shapefile
Gis Dataset Brasil
Geographic Information Systems (GIS) Dataset Brasil - Coleção de shapefiles, GeoJSON e TopoJSON prontas para uso
Stars: ✭ 121 (+245.71%)
Mutual labels:  gis, shapefile
Shapefile.jl
Parsing .shp files in Julia
Stars: ✭ 40 (+14.29%)
Mutual labels:  gis, shapefile
covid hospitals demographics
COVID-19 relevant data on hospital location / capacity, nursing home location / capacity, county demographics
Stars: ✭ 21 (-40%)
Mutual labels:  gis, shapefile
Pyearth
🌐 A lightweight 3D visualization of the earth in 150 lines of Qt/OpenGL
Stars: ✭ 78 (+122.86%)
Mutual labels:  gis, shapefile
Agstoshapefile
Convert ArcGIS Server Dynamic Map Service to GeoJSON and Shapefile
Stars: ✭ 172 (+391.43%)
Mutual labels:  gis, shapefile
Geotools
Official GeoTools repository
Stars: ✭ 1,109 (+3068.57%)
Mutual labels:  gis, shapefile
shapefile-rs
Rust library to read & write shapefiles
Stars: ✭ 38 (+8.57%)
Mutual labels:  gis, shapefile
pyGISS
📡 A lightweight GIS Software in less than 100 lines of code
Stars: ✭ 114 (+225.71%)
Mutual labels:  gis, shapefile
Aerialbot
A simple yet highly configurable bot that tweets geotagged aerial imagery of a random location in the world.
Stars: ✭ 157 (+348.57%)
Mutual labels:  gis, shapefile
Fmm
Fast map matching, an open source framework in C++
Stars: ✭ 359 (+925.71%)
Mutual labels:  gis, shapefile
gis-snippets
Some code snippets for GIS tasks
Stars: ✭ 45 (+28.57%)
Mutual labels:  gis, shapefile
GeoConvert
Converting between Geojson and GIS file formats
Stars: ✭ 32 (-8.57%)
Mutual labels:  gis, shapefile
Blendergis
Blender addons to make the bridge between Blender and geographic data
Stars: ✭ 4,642 (+13162.86%)
Mutual labels:  gis, shapefile
Django Rest Framework Gis
Geographic add-ons for Django REST Framework. Maintained by the OpenWISP Project.
Stars: ✭ 830 (+2271.43%)
Mutual labels:  gis
Spimedb
EXPLORE & EDIT REALITY
Stars: ✭ 14 (-60%)
Mutual labels:  gis
Agentmaps
Make social simulations on interactive maps with Javascript! Agent-based modeling for the web.
Stars: ✭ 822 (+2248.57%)
Mutual labels:  gis
Earthengine Py Notebooks
A collection of 360+ Jupyter Python notebook examples for using Google Earth Engine with interactive mapping
Stars: ✭ 807 (+2205.71%)
Mutual labels:  gis
Wildfirepy
WildfirePy, a Python library for Wildfire GIS data analysis.
Stars: ✭ 21 (-40%)
Mutual labels:  gis

maup

Build Status codecov PyPI conda-forge Package

maup is the geospatial toolkit for redistricting data. The package streamlines the basic workflows that arise when working with blocks, precincts, and districts, such as

The project's priorities are to be efficient by using spatial indices whenever possible and to integrate well with the existing ecosystem around pandas, geopandas and shapely. The package is distributed under the MIT License.

Installation

We recommend installing maup from conda-forge using conda:

conda install -c conda-forge maup

You can get conda by installing Miniconda, a free Python distribution made especially for data science and scientific computing. You might also consider Anaconda, which includes many data science packages that you might find useful.

To install maup from PyPI, run pip install maup from your terminal.

Examples

Here are some basic situations where you might find maup helpful. For these examples, we use test data from Providence, Rhode Island, which you can find in our Rhode Island shapefiles repo, or in the examples folder of this repo.

>>> import geopandas
>>> import pandas
>>>
>>> blocks = geopandas.read_file("zip://./examples/blocks.zip")
>>> precincts = geopandas.read_file("zip://./examples/precincts.zip")
>>> districts = geopandas.read_file("zip://./examples/districts.zip")

Assigning precincts to districts

The assign function in maup takes two sets of geometries called sources and targets and returns a pandas Series. The Series maps each geometry in sources to the geometry in targets that covers it. (Here, geometry A covers geometry B if every point of A and its boundary lies in B or its boundary.) If a source geometry is not covered by one single target geometry, it is assigned to the target geometry that covers the largest portion of its area.

>>> import maup
>>>
>>> assignment = maup.assign(precincts, districts)
>>> # Add the assigned districts as a column of the `precincts` GeoDataFrame:
>>> precincts["DISTRICT"] = assignment
>>> assignment.head()
0     7
1     5
2    13
3     6
4     1
dtype: int64

As an aside, you can use that assignment object to create a gerrychain Partition representing this districting plan.

Aggregating block data to precincts

Precinct shapefiles usually come with election data, but not demographic data. In order to study their demographics, we need to aggregate demographic data from census blocks up to the precinct level. We can do this by assigning blocks to precincts and then aggregating the data with a Pandas groupby operation:

>>> variables = ["TOTPOP", "NH_BLACK", "NH_WHITE"]
>>>
>>> assignment = maup.assign(blocks, precincts)
>>> precincts[variables] = blocks[variables].groupby(assignment).sum()
>>> precincts[variables].head()
   TOTPOP  NH_BLACK  NH_WHITE
0    5907       886       380
1    5636       924      1301
2    6549       584      4699
3    6009       435      1053
4    4962       156      3713

If you want to move data from one set of geometries to another but your source and target geometries do not nest neatly (i.e. have overlaps), see Prorating data when units do not nest neatly.

Disaggregating data from precincts down to blocks

It's common to have data at a coarser scale that you want to attach to finer-scaled geometries. Usually this happens when vote totals for a certain election are only reported at the county level, and we want to attach that data to precinct geometries.

Let's say we want to prorate the vote totals in the columns "PRES16D", "PRES16R" from our precincts GeoDataFrame down to our blocks GeoDataFrame. The first crucial step is to decide how we want to distribute a precinct's data to the blocks within it. Since we're prorating election data, it makes sense to use a block's total population or voting-age population. Here's how we might prorate by population ("TOTPOP"):

>>> election_columns = ["PRES16D", "PRES16R"]
>>> assignment = maup.assign(blocks, precincts)
>>>
>>> # We prorate the vote totals according to each block's share of the overall
>>> # precinct population:
>>> weights = blocks.TOTPOP / assignment.map(precincts.TOTPOP)
>>> prorated = maup.prorate(assignment, precincts[election_columns], weights)
>>>
>>> # Add the prorated vote totals as columns on the `blocks` GeoDataFrame:
>>> blocks[election_columns] = prorated
>>> # We'll call .round(2) to round the values for display purposes.
>>> blocks[election_columns].round(2).head()
   PRES16D  PRES16R
0     0.00     0.00
1    12.26     1.70
2    15.20     2.62
3    15.50     2.67
4     3.28     0.45

Warning about areal interpolation

We strongly urge you not to prorate by area! The area of a census block is not a good predictor of its population. In fact, the correlation goes in the other direction: larger census blocks are less populous than smaller ones.

Prorating data when units do not nest neatly

Suppose you have a shapefile of precincts with some election results data and you want to join that data onto a different, more recent precincts shapefile. The two sets of precincts will have overlaps, and will not nest neatly like the blocks and precincts did in the above examples. (Not that blocks and precincts always nest neatly...)

We can use maup.intersections to break the two sets of precincts into pieces that nest neatly into both sets. Then we can disaggregate from the old precincts onto these pieces, and aggregate up from the pieces to the new precincts. This move is a bit complicated, so maup provides a function called prorate that does just that.

We'll use our same blocks GeoDataFrame to estimate the populations of the pieces for the purposes of proration.

For our "new precincts" shapefile, we'll use the VTD shapefile for Rhode Island that the U.S. Census Bureau produced as part of their 2018 test run of for the 2020 Census.

>>> old_precincts = precincts
>>> new_precincts = geopandas.read_file("zip://./examples/new_precincts.zip")
>>>
>>> columns = ["SEN18D", "SEN18R"]
>>>
>>> # Include area_cutoff=0 to ignore any intersections with no area,
>>> # like boundary intersections, which we do not want to include in
>>> # our proration.
>>> pieces = maup.intersections(old_precincts, new_precincts, area_cutoff=0)
>>>
>>> # Weight by prorated population from blocks
>>> weights = blocks["TOTPOP"].groupby(maup.assign(blocks, pieces)).sum()
>>> # Normalize the weights so that votes are allocated according to their
>>> # share of population in the old_precincts
>>> weights = maup.normalize(weights, level=0)
>>>
>>> # Use blocks to estimate population of each piece
>>> new_precincts[columns] = maup.prorate(
...     pieces,
...     old_precincts[columns],
...     weights=weights
... )
>>> new_precincts[columns].head()
   SEN18D  SEN18R
0   752.0    51.0
1   370.0    21.0
2    97.0    17.0
3   585.0    74.0
4   246.0    20.0

Progress bars

For long-running operations, the user might want to see a progress bar to estimate how much longer a task will take (and whether to abandon it altogether).

maup provides an optional progress bar for this purpose. To temporarily activate a progress bar for a certain operation, use with maup.progress()::

>>> with maup.progress():
...     assignment = maup.assign(precincts, districts)
...

To turn on progress bars for all applicable operations (e.g. for an entire script), set maup.progress.enabled = True:

>>> maup.progress.enabled = True
>>> # Now a progress bar will display while this function runs:
>>> assignment = maup.assign(precincts, districts)
>>> # And this one too:
>>> pieces = maup.intersections(old_precincts, new_precincts, area_cutoff=0)

Modifiable areal unit problem

The name of this package comes from the modifiable areal unit problem (MAUP): the same spatial data will look different depending on how you divide up the space. Since maup is all about changing the way your data is aggregated and partitioned, we have named it after the MAUP to encourage users to use the toolkit thoughtfully and responsibly.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].