All Projects → fitnr → Acris Download

fitnr / Acris Download

Licence: gpl-3.0
Download NYC real estate transaction data and drop it in a database

Labels

Projects that are alternatives of or similar to Acris Download

Device Sony Lilac
Stars: ✭ 34 (-10.53%)
Mutual labels:  makefile
Docker Bitcoin Regtest
A way to experiment with Bitcoin.
Stars: ✭ 35 (-7.89%)
Mutual labels:  makefile
Llvm Cookbook
llvm-cookbook samples
Stars: ✭ 37 (-2.63%)
Mutual labels:  makefile
Zh Google Styleguide
Google 开源项目风格指南 (中文版)
Stars: ✭ 8,315 (+21781.58%)
Mutual labels:  makefile
Espeak Korean
Korean data for eSpeak
Stars: ✭ 34 (-10.53%)
Mutual labels:  makefile
Coreos Nvidia
Yet another NVIDIA driver container for Container Linux (aka CoreOS)
Stars: ✭ 36 (-5.26%)
Mutual labels:  makefile
Golang Alpine Docker
Build golang binaries for alpine linux
Stars: ✭ 33 (-13.16%)
Mutual labels:  makefile
Ananas
This is an Arduino based program for step motor controller,Ananas.
Stars: ✭ 38 (+0%)
Mutual labels:  makefile
Android device xiaomi polaris
Stars: ✭ 35 (-7.89%)
Mutual labels:  makefile
I2c Slave Lib
I2C slave library for AVR 8 bit mircocontrollers
Stars: ✭ 36 (-5.26%)
Mutual labels:  makefile
Mal
mal - Make a Lisp
Stars: ✭ 8,287 (+21707.89%)
Mutual labels:  makefile
Rust Ffi Stringtools
A collection of examples how to use Rust libraries from other languages.
Stars: ✭ 34 (-10.53%)
Mutual labels:  makefile
Llvm Cheatsheet
LLVM, clang, ninja, dyld and others.
Stars: ✭ 36 (-5.26%)
Mutual labels:  makefile
Pantheios
The C/C++ Diagnostic Logging Sweetspot
Stars: ✭ 34 (-10.53%)
Mutual labels:  makefile
Syllabus
Syllabus for CompEcon Course
Stars: ✭ 37 (-2.63%)
Mutual labels:  makefile
Arduino Datalogging
Methods of Datalogging Sensor Data from an Arduino.
Stars: ✭ 33 (-13.16%)
Mutual labels:  makefile
Monero Gui Guide
Guide for the Monero GUI wallet
Stars: ✭ 36 (-5.26%)
Mutual labels:  makefile
Swift Graphql
GraphQL implementation written in Swift
Stars: ✭ 38 (+0%)
Mutual labels:  makefile
Docker Unix 1st Ed
A Docker image that drops you into 1st Edition Unix
Stars: ✭ 37 (-2.63%)
Mutual labels:  makefile
Avian Pack
Avian all-inclusive. Everything needed to build Avian with (or without) Android classpath.
Stars: ✭ 36 (-5.26%)
Mutual labels:  makefile

ACRIS Downloader

This Makefile downloads NYC property transfer data and optionally loads it into a database.

It's designed for people who know how to use databases, but don't necessarily want to slog through downloading huge files, manually setting up a schema and importing the those files.

Currently, SQLite, MySQL and PostGreSQL are supported. If you want to use other database software, you already probably know enough to customize the Makefile. It shouldn't be harder than changing a few flags.

the data

The ACRIS data set is big and complicated, see ACRIS Datasets below for some explanatory notes.

The Department of Finance supposedly updates the online records regularly, so you might use this Makefile, along with a cron job, to regularly update a mirror of their database.

Requirements

At least 10 GB of free disk space for the data and:

  • csvkit, a Python package
  • MySQL, SQLite or PostGreSQL

Installation

Docker

$ docker-compose run --rm setup

Go out for happy hour, this will take some time.

When it finishes, use either of the access methods below.

You can also set an env var ACRIS_DATASET, see docker-compose.yml for options, e.g.
$ ACRIS_DATASET=mysql_personal docker-compose run --rm setup

Direct DB access

$ docker-compose up db
$ docker-compose exec db mysql -uroot -ppass

PhpMyAdmin access

$ docker-compose up myadmin

In a browser got to http://localhost:8080 user: root, pass: pass

When finished with either access method, shut down:

$ docker-compose down

To reset the database, delete ./data/mysql/ To reset the downloads, delete ./data/downloads/

Local

Download (or git clone) this repository and open the folder in your terminal.

To install MySQL, start here.

To install csvkit, follow the instructions in the csvkit docs, or try one of these:

# If you have admin privileges
sudo make install

# If you don't have admin privileges. Might not work.
make install INSTALLFLAGS=--user

Downloading the data

Run the following command:

make

The data/ folder will slowly fill up with files. If you want to work directly with CSVs, you're done.

MySQL

Check that you have mysql up and running on your machine, and a user capable of creating databases. Don't use root!

make mysql USER=username PASS=mypass

(If you don't want to type your password in plaintext, you can leave off the PASS argument. You'll just have to enter your password many times.)

This will run the following tasks:

  • download the ACRIS real property datasets in CSV format (it will be slow)
  • dedupe the CSVs and reformat them slightly
  • generate schemas for the new MySQL tables
  • Create a new MySQL database (acris) and import the data into several tables
  • Add indices to sensible fields in each table. You may find it profitable to add more indices yourself.

If the downloads are interrupted, just run the command again. That's the power of make!

By default, only the real property datasets will be downloaded. To download and create tables for the personal property datasets:

make mysql_personal USER=myuser PASS=mypass

The ACRIS dataset also includes voluminous cross-reference and remarks files that aren't downloaded by default. To download them and load into MySQL:

make mysql_real_complete USER=mysqluser PASS=mysqlpass
make mysql_personal_complete USER=mysqluser PASS=mysqlpass

Using an existing database

If you want to add the data to tables in an existing database, run:

make DATABASE=mydb USER=myuser PASS=mypass

If you have other connection requirements:

make DATABASE=mydb USER=myuser PASS=mypass HOST=example.com MYSQLFLAGS="--port=123 --example-flag"

SQLite

This command will create acris.db, a database containing the real property datasets.

make sqlite

Download and load even more data into acris.db:

make sqlite_real_complete
make sqlite_personal_complete

PostGreSQL

make psql USER=username

Even more:

make psql_real_complete USER=username
make psql_personal_complete USER=username

Add custom connection paramaters:

make psql_real_complete USER=username PSQLFLAGS="--host=foo.com"

ACRIS Datasets

(The following is a reformatted version of a document published by NYC Department of Finance.)

ACRIS has two types of documents:

Real Property Records include documents in the Deeds and Other Conveyance, Mortgages & Instruments and other documents classes in ACRIS. These documents typically impact rights to real property and as such follow the real property rather than an individual.

Personal Property Records include documents in the UCC and Federal Liens class in ACRIS. These documents typically impact rights to personal property associated with real property and as such follow the individual party rather than the real property.

Each Real Property Record or Personal Property Record contains:

  • A master record
  • Zero or more lot(property) records
  • Zero or more party records
  • Zero or more cross-reference records
  • Zero or more remarks records

The Document ID in the master record is used to link all other record types to a master record. To find all of the lot (property) records associated with a master record, simply retrieve all records in the property dataset with the same “Document ID” as the selected master record. The same process should be repeated for Party, Cross Reference and Remark records.

Real property records

Personal property records

Code mappings

In ACRIS, documents are stored with codes representing longer descriptions that are displayed on images generated by ACRIS and in Document Search. The translation from these codes is done via the following code look up tables:

Example query

This example query selects all the transactions for a particular property in Brooklyn. Multiple joins are required to the real_property_parties table, as there are two (or more) parties per transaction.

SELECT
    streetnumber,
    streetname,
    documentid,
    c.description,
    m.recordtype,
    d.doctypedescription,
    docdate,
    docamount,
    d.party1type,
    p1.name party1name,
    d.party2type,
    p2.name party2name
FROM real_property_legals a
    LEFT JOIN real_property_master m USING (documentid)
    LEFT JOIN real_property_parties p1 USING (documentid)
    LEFT JOIN real_property_parties p2 USING (documentid)
    LEFT JOIN property_type_codes c USING (propertytype)
    LEFT JOIN document_control_codes d USING (doctype)
WHERE a.lot = 65
    AND a.borough = 3
    AND a.block = 429
    and p1.partytype = 1
    AND p2.partytype = 2;

Known issues

There's a bug in how csvkit <=0.9.1 handles fields that contain only the letter 'A' - they're converted into dates. This will break the recordtype column in certain tables.

License

General Public License version 3

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].