jreades / urb-studies-predicting-gentrification

Licence: MIT License
This repo is intended to support replication and exploration of the analysis undertaken for our Urban Studies article "Understanding urban gentrification through Machine Learning: Predicting neighbourhood change in London".

Predicting Neighbourhood Change with Machine Learning

Important Note

I've now updated the setup.yml file to resolve some issues with setting up the virtual environment, and I've added a missing file to the Geodata download notebook. These changes have been tagged v1_1, which is also the current HEAD.

About this Repository

This repo is intended to support replication and exploration of the analysis undertaken for our Urban Studies article "Understanding urban gentrification through Machine Learning: Predicting neighbourhood change in London". Please reference the published version in any publications; however, if you do not have institutional access to Urban Studies then the pre-review draft of our paper can be accessed in our Institutional Repository.

Although we are not in a position to provide individualised support for installation or configuration of the IPython environment, we have attempted to make it as painless as possible for you to get up and running without hosing your existing Python environment. Please note that final visualisation of the results was undertaken in QGIS and R/RStudio, both of which are available for free for Mac, PC, and Linux (non-commercial use only for RStudio).

Re-Use & Citation

All code is available under the MIT License so you are free to modify it as you like; however, we would ask that, if you go on to use this work in a substantive way as a basis for further publications, you also cite the Urban Studies article and acknowledge our contribution in an Acknowledgements section.

Installation & Start-Up

You will need to install the Anaconda Python environment in order to run the setup script. It should not matter whether you install the full version or Miniconda, so long as you have the conda toolset available to you. For some people the changes that Anaconda makes to the .bashrc/.bash_profile file will cause problems elsewhere, so you are advised to check what effect (if any) the addition of the following line has on any other tools upon which you rely:

export PATH="/anaconda3/bin:$PATH"

I, personally, use the following alias so that Anaconda is only available when I ask for it by typing conda-start:

alias conda-start='export PATH="/anaconda3/bin:$PATH"'

In particular I find this relevant for running QGIS 2.x on a Mac using the resources made available by KyngChaos. There seem to be more substantive 'issues' with QGIS 3 that mean you need to specify an environment variable in QGIS directly to get the right Python distribution, so this may no longer be a problem.

The recent update to PySAL means that it may now be easier to work with the environment.yml file dumped from conda than to install everything 'fresh' using the setup.yml script below. I've now added environment.yml to the repository. To install from it:

conda env create -f environment.yml

Installation instructions for the clean environment (which you will then need to debug in terms of library compatibilities) are also contained in the head of the YAML script, but are reproduced here for clarity:

conda env create -f setup.yml
source activate mlgent
python -m ipykernel install --name mlgent --display-name "ML Gentrification"
jupyter lab

Obviously, if you see warning or error messages at any point you should stop and attempt to debug rather than mindlessly proceeding.

At this point you should have a new 'kernel' available in Jupyter called 'ML Gentrification' (or just mlgent if you skipped the naming command). The notebooks have all been set up to expect a kernel with this name and will prompt you to select a different kernel if you've opted to skip the environment creation step above.

Replication

Running the Notebooks

The notebooks have been named so that it is easy to work out the sequence of 'scripts' that need to be run: start with 01 and finish with 08. Notebook 00 is only needed when you first clone the repo, to ensure that the Geoconvert class is working. You'll notice that there are two versions of 08 (Neighbourhood Prediction). This is because I had timeout issues running 08 as a notebook: although the analysis would often complete at some point, I had no way of knowing when, or whether errors had arisen after the timeout occurred. Consequently, you might wish to run the 08 Python script instead, as it writes its output directly to the terminal rather than indirectly via the notebook server.
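The naming convention above can be turned into a run order mechanically. The following is a minimal sketch, not part of the repository: the function name and the example file names are hypothetical, and it assumes only that each notebook or script starts with a two-digit step number.

```python
import re


def notebook_order(filenames, first_run=False):
    """Return the numbered files in the order they should be run.

    Files without a two-digit prefix are ignored. Step 00 is included
    only when first_run is True, mirroring the advice that it is needed
    just once after cloning.
    """
    numbered = []
    for name in filenames:
        match = re.match(r"(\d{2})", name)
        if match is None:
            continue  # not part of the numbered sequence
        step = int(match.group(1))
        if step == 0 and not first_run:
            continue
        numbered.append((step, name))
    return [name for _, name in sorted(numbered)]


# Hypothetical file names, for illustration only.
files = [
    "08-Prediction.py",
    "README.md",
    "02-Census.ipynb",
    "00-Setup.ipynb",
    "01-Geodata.ipynb",
]
print(notebook_order(files))
print(notebook_order(files, first_run=True))
```

Sorting on the numeric prefix rather than the raw file name keeps the order stable even if the separators or extensions differ between files.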

Geo-Convert

There are also several scripts to support testing of the class created to automate interaction with the UK Data Service's (originally MIMAS) Geo-Convert tool. This is essential for mapping Census data from the 2001 boundaries to the equivalent 2011 boundaries because of changes in Output Areas (OAs), Lower Super Output Areas (LSOAs), and to a lesser extent Middle Super Output Areas (MSOAs) between the two Censuses. If you are not working with UK Census data then this tool is not relevant (though boundary changes may still impact your results... you have been warned!). I should note that although it is possible, in principle, to update this class to perform any and all of the actions associated with the Geo-Convert web service I have only implemented the 2001 -> 2011 conversion of LSOAs as that is all that I needed.
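To illustrate why boundary changes matter, here is a minimal sketch of reallocating counts from one set of zones to another using a lookup table. It is not the Geoconvert class and does not call the Geo-Convert service; the function name, zone codes, and shares are all hypothetical, and the real tool may weight its conversions differently.

```python
from collections import defaultdict


def convert_counts(counts_old, lookup):
    """Reallocate counts from old zones to new zones.

    lookup maps each old zone code to a list of (new_code, share) pairs,
    where the shares for one old zone are assumed to sum to 1.
    """
    counts_new = defaultdict(float)
    for old_code, value in counts_old.items():
        for new_code, share in lookup[old_code]:
            counts_new[new_code] += value * share
    return dict(counts_new)


# Hypothetical example: zone A is unchanged, zone B is split in two,
# and zone C is merged into A.
counts_2001 = {"A": 100, "B": 50, "C": 30}
lookup = {
    "A": [("A", 1.0)],
    "B": [("X", 0.5), ("Y", 0.5)],
    "C": [("A", 1.0)],
}
print(convert_counts(counts_2001, lookup))
# → {'A': 130.0, 'X': 25.0, 'Y': 25.0}
```

The total is preserved (180 before and after), but a naive join on zone codes would have compared the 2001 value of 100 for A against a 2011 zone that also contains the old C.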

From time to time the UK Data Service may also update the Geo-Convert online forms (this has already happened to me once) and break the geoconvert class. Up to a point I will attempt to correct this quickly, but if the changes are substantial then this may not be something I'm able to address immediately.

Downloading Data

Where possible I have attempted either to automate the download of the required data, or to make it available directly from the repo as downloaded from the NOMIS web site via their 'Query' service. In theory the NOMIS API should enable the automated selection and downloading of the data used by notebooks 02 and 03, but at the time that I was doing my work the API was broken and the documentation rather poor. As a result you will find all of the source data that cannot be downloaded automatically in the data/src folder. In the interests of enabling a 'clean distribution' this is the only data folder stored in GitHub; all others will be re-created as needed when you run the notebooks.
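The 'download once, re-create folders as needed' pattern described above can be sketched as follows. This is an illustration using only the Python standard library, not the code used in the notebooks; the function name is hypothetical, and any URL and path passed to it are the caller's own.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Parent folders are created on demand. Returns the destination path
    and a flag saying whether a download actually took place.
    """
    dest = Path(dest)
    if dest.exists():
        return dest, False  # reuse the local copy
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(dest))
    return dest, True
```

Because existing files are never overwritten, re-running a notebook does not trigger a fresh download; delete the local file if you want to fetch it again.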

Random Seeds

Many of the algorithms used in this analysis rely on randomness. I have set the same seed everywhere that randomness might be used by Python, so your results should match mine. Be aware, however, that minor platform differences or other changes to the code could significantly alter the results (though we obviously hope not!).
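As a minimal illustration of what a fixed seed buys you, the sketch below uses only Python's standard random module. The seed value and function name are hypothetical and are not taken from the notebooks; the same principle applies when a library accepts a seed or random_state argument.

```python
import random

SEED = 42  # hypothetical value; the notebooks define their own seed


def seeded_sample(values, k, seed=SEED):
    """Draw k items without replacement using a private, seeded generator."""
    rng = random.Random(seed)  # isolated from the global random state
    return rng.sample(values, k)


population = list(range(100))
first = seeded_sample(population, 5)
second = seeded_sample(population, 5)
print(first == second)
# → True
```

Using a dedicated generator per call, rather than reseeding the global one, means the draw does not depend on how many random numbers were consumed elsewhere in the session.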
