aaowens / Psid.jl

License: MIT
Quickly assemble data from the Panel Study of Income Dynamics (PSID)

PSID.jl

The Panel Study of Income Dynamics (PSID) is a longitudinal public dataset which has been following a collection of families and their descendants since 1968. It provides a breadth of information about labor supply and life-cycle dynamics. More information is available at https://psidonline.isr.umich.edu/.

This package produces a labeled panel of individuals with a consistent individual ID across time. You provide a JSON file describing the variables you want; an example input file can be found at examples/user_input.json. Currently only variables in the family files can be added, but in the future it should be possible to support variables in the individual files or the supplements.

Example

An example workflow can be found on my blog post here.

Instructions

To add this package, use

(@v1.4) pkg> add PSID

Next, download the PSID data files yourself. The package can't automatically fetch them because the PSID requires you to register for a free account before using the data.

The following data files must be in the current directory:

  1. The PSID codebook in XML format. You can download this from me here https://drive.google.com/open?id=1nz1UaVGcj0ur2Bp3ev7a8agJbj0A5JTF . In the future there will be a way to download this from the PSID directly.
  2. The zipped PSID family files and cross-year individual file, which can be downloaded here https://simba.isr.umich.edu/Zips/ZipMain.aspx. Do not extract the files--leave them zipped. You need to download every family file from 1968 to 2017, and you also need to download the cross-year individual file.
  3. The XLSX cross-year index for the variables, which can be downloaded here https://psidonline.isr.umich.edu/help/xyr/psid.xlsx.

After acquiring the data, run

julia> using PSID
julia> makePSID("user_input.json")
# to not code missings, makePSID("user_input.json", codemissings = false)

makePSID first verifies that the required files exist and then constructs the data. If it succeeds, it prints "Finished constructing individual data, saved to output/allinds.csv" after about 5 minutes.

The input JSON file

The file passed to makePSID describes the variables you want.

{
    "name_user": "hours",
    "varID": "V465",
    "unit": "head"
}

Each entry has three fields: name_user, varID, and unit. name_user is a name chosen by you. varID is one of the codes assigned by the PSID to this variable. These can be looked up in the PSID cross-year index. For example, hours above can be found in the crosswalk at Family Public Data Index 01>WORK 02>Hours and Weeks 03>annual in prior year 04>head 05>total:. Clicking on the variable info shows the list of years in which the variable is available, along with the ID used in each year. Any one of these IDs can be used for varID; it does not matter which, because PSID.jl looks up all available years for that variable in the crosswalk. You must also indicate the unit, which can be head, spouse, or family. This makes sure the variable is assigned to the correct individual.
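As a rough illustration of the format described above, the sketch below builds an input file from the one entry shown. The helpers variable_entry and input_json are hypothetical and not part of PSID.jl, and the sketch assumes the input file is a JSON array of such entries. It uses plain string building and does no escaping, so it is only suitable for simple names.

```julia
# Hypothetical helpers, not part of PSID.jl.
# The three unit values are the ones named in the text above.
const VALID_UNITS = ("head", "spouse", "family")

# Build one entry of the input file as a JSON object string.
function variable_entry(name_user::AbstractString, varID::AbstractString, unit::AbstractString)
    if !(unit in VALID_UNITS)
        allowed = join(VALID_UNITS, ", ")
        throw(ArgumentError("unit must be one of: " * allowed))
    end
    return """{"name_user": "$name_user", "varID": "$varID", "unit": "$unit"}"""
end

# Assumption: the input file is a JSON array of entries.
function input_json(entries::Vector{String})
    return "[\n  " * join(entries, ",\n  ") * "\n]\n"
end

entries = [variable_entry("hours", "V465", "head")]
json_text = input_json(entries)
print(json_text)
# To save it for makePSID: write("user_input.json", json_text)
```

Rejecting an unknown unit before the file is written catches a typo early instead of several minutes into a run.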

Features

This package provides the following features:

  1. Automatically labels missing values by searching the value labels from the codebook for strings like "NA", "Inap.", or "Missing".
  2. Tries to produce consistent value labels across years for categorical variables. This is difficult because the labels in the PSID sometimes change between years. This package uses an algorithm to try to harmonize the labels when possible by removing common subsets. For example, in one year race is labeled as "Asian" but in the next year it is "Asian, Pacific Islander". The first is a subset of the second, so the final label will be "Asian, Pacific Islander". When this is not possible, the final label will be "A or B or C" for however many incomparable labels were found.
  3. Matches the individuals across time to produce a panel with consistent (ID, year) keys and their associated variables.
  4. Produces consistent individual or spouse variables for individuals. In the input JSON file, you must indicate whether a variable is family level, household head level, or household spouse level. The final output will have variables of the form VAR_family, VAR_ind, or VAR_spouse. When the individual is a household head, VAR_ind will come from the household head version of that variable, and VAR_spouse will come from the household spouse version. If the individual is a household spouse, it is the reverse. Both individuals will get all family level variables.
  5. It's easiest to track individuals, but this package also produces a consistent family ID by treating a family as a combination of head and spouse (if a spouse exists). If you keep only household heads and drop years before 1970, (famid, year) should uniquely identify each row.
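The label harmonization in item 2 can be sketched roughly as follows. This is not the algorithm PSID.jl actually uses; it is a minimal illustration that assumes a label is "a subset of" another when all of its comma-separated parts appear in the other label. The function names label_parts and harmonize are hypothetical.

```julia
# Hypothetical sketch, not the PSID.jl implementation.

# Split a label into its comma-separated parts, e.g.
# "Asian, Pacific Islander" -> Set(["Asian", "Pacific Islander"]).
label_parts(label::AbstractString) = Set(String.(strip.(split(label, ","))))

# True when every part of `a` appears in `b` and `b` has at least one more part.
function strictly_contained(a::AbstractString, b::AbstractString)
    pa, pb = label_parts(a), label_parts(b)
    return issubset(pa, pb) && pa != pb
end

# Drop every label that is strictly contained in another label,
# then join whatever is left with " or ".
function harmonize(labels::Vector{String})
    uniq = unique(labels)
    keep = [a for a in uniq if !any(b -> strictly_contained(a, b), uniq)]
    return join(keep, " or ")
end

println(harmonize(["Asian", "Asian, Pacific Islander"]))
println(harmonize(["A", "B", "C"]))
```

With the race example from item 2, the shorter label is absorbed by the longer one; labels that cannot be compared are all kept and joined with "or".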

Notable Omissions

Certain variables are not in the family files. For example, the wealth data are in separate files, and there is some unique information in the individual file directly. In the future I plan to add support for these data, but you can manually add them by constructing the unique individual ID yourself as (ER30001 * 1000) + ER30002, and then joining your data on that ID with the dataset produced by PSID.jl.
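The ID construction above can be sketched as below. The function name individual_id and the sample values are illustrative only; the Dict stands in for whatever extra data you want to join, keyed by the (ER30001, ER30002) pair.

```julia
# Unique individual ID, using the formula from the text: (ER30001 * 1000) + ER30002.
individual_id(er30001::Integer, er30002::Integer) = er30001 * 1000 + er30002

# Made-up example data keyed by (ER30001, ER30002); the values mean nothing.
extra = Dict((4, 1) => 10.0, (4, 2) => 12.5)

# Re-key the data by the combined ID so it can be joined on that ID.
extra_by_id = Dict(individual_id(k...) => v for (k, v) in extra)

println(individual_id(4, 1))

# Assuming ER30002 is always below 1000, the two parts can be recovered again.
println(divrem(individual_id(4, 1), 1000))
```

Once your own data carries this ID, it can be joined on that column with the dataset produced by PSID.jl using whatever table library you prefer.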

Please file issues if you find a bug.

Donate your input JSON

If you've made an input JSON file containing variables useful for some topic, feel free to file an issue or make a PR to add your file to the examples.
