dmsul / econtools

License: BSD-3-Clause
Econometrics and data manipulation functions.

econtools

econtools is a Python package of econometric functions and convenient shortcuts for data work with pandas and numpy. Full documentation here.

Installation

You can install directly from PyPI:

$ pip install econtools

Or you can clone the repository from GitHub and install from source:

$ git clone http://github.com/dmsul/econtools
$ cd econtools
$ python setup.py install

Econometrics

  • OLS, 2SLS, LIML
  • Option to absorb any variable via within-transformation (a la areg in Stata)
  • Robust standard errors
    • Heteroskedasticity-robust (robust/hc1, hc2, hc3)
    • Clustered standard errors
    • Spatial HAC (SHAC, aka Conley standard errors) with uniform and triangle kernels
  • F-tests by variable name or R matrix.
  • Local linear regression.
  • WARNING [31 Oct 2019]: Predicted values (yhat and residuals) may not be as expected in transformed regressions (when using fixed effects or weights). That is, the current behavior differs from Stata's. I am looking into this and will post either a fix or a justification of the current behavior in the near future.

import econtools
import econtools.metrics as mt

# Read Stata DTA file
df = econtools.read('my_data.dta')

# Estimate OLS regression with fixed-effects and clustered s.e.'s
result = mt.reg(df,                     # DataFrame to use
                'y',                    # Outcome
                ['x1', 'x2'],           # Indep. Variables
                fe_name='person_id',    # Fixed-effects using variable 'person_id'
                cluster='state'         # Cluster by state
)

# Results
print(result.summary)                                # Print regression results
beta_x1 = result.beta['x1']                          # Get coefficient by variable name
r_squared = result.r2a                               # Get adjusted R-squared
joint_F = result.Ftest(['x1', 'x2'])                 # Test for joint significance
equality_F = result.Ftest(['x1', 'x2'], equal=True)  # Test for coeff. equality
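
To illustrate what "absorbing" a variable via the within-transformation means, here is a minimal sketch that uses only pandas and numpy. It is not econtools' implementation; the helper names `within_transform` and `ols` and the simulated data are assumptions made for this example. It demeans the outcome and regressor within each group and checks that the resulting slope matches the one from a regression that includes a dummy for every group.

```python
import numpy as np
import pandas as pd


def within_transform(df, cols, group):
    """Subtract each group's mean from the given columns (hypothetical helper)."""
    return df[cols] - df.groupby(group)[cols].transform('mean')


def ols(X, y):
    """Least-squares coefficients via numpy (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Simulated panel: 5 people, 20 observations each, true slope of 2
rng = np.random.default_rng(0)
n_groups, n_per = 5, 20
df = pd.DataFrame({
    'person_id': np.repeat(np.arange(n_groups), n_per),
    'x1': rng.normal(size=n_groups * n_per),
})
alpha = rng.normal(size=n_groups)  # person-specific intercepts
noise = rng.normal(scale=0.1, size=len(df))
df['y'] = 2.0 * df['x1'] + alpha[df['person_id'].to_numpy()] + noise

# Slope from the within-transformed (demeaned) data
demeaned = within_transform(df, ['x1', 'y'], 'person_id')
beta_within = ols(demeaned[['x1']], demeaned['y'])[0]

# Slope from a regression with one dummy per person
dummies = pd.get_dummies(df['person_id'], dtype=float)
X_dummy = pd.concat([df[['x1']], dummies], axis=1)
beta_dummy = ols(X_dummy, df['y'])[0]

print(beta_within, beta_dummy)
```

The two slopes agree up to floating-point error, which is why absorbing a high-cardinality identifier is usually preferred over creating one dummy column per group.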

Regression and Summary Stat Tables

  • outreg takes regression results and creates a LaTeX-formatted tabular fragment.
  • table_statrow can be used to add arbitrary statistics, notes, etc. to a table. Can also be used to create a table of summary statistics.
  • write_notes makes it easy to save table notes that depend on your data.
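
For readers unfamiliar with the term, a "tabular fragment" is just the body rows of a LaTeX table, without the surrounding `tabular` environment. The sketch below is not `outreg` and does not use its signature; `tabular_fragment` is a hypothetical helper, and the coefficients are made-up numbers. It only shows the general shape of such output: one row of coefficients per variable, followed by a row of standard errors in parentheses.

```python
def tabular_fragment(var_names, betas, ses, digits=3):
    """Format coefficient and standard-error rows for a LaTeX tabular body.

    `betas` and `ses` are lists with one dict per regression (column),
    mapping variable name to value. A variable that is absent from a
    regression gets an empty cell. Hypothetical helper for illustration.
    """
    rows = []
    for var in var_names:
        coef_cells = [f"{b[var]:.{digits}f}" if var in b else "" for b in betas]
        se_cells = [f"({s[var]:.{digits}f})" if var in s else "" for s in ses]
        rows.append(" & ".join([var] + coef_cells) + r" \\")
        rows.append(" & ".join([""] + se_cells) + r" \\")
    return "\n".join(rows)


# Made-up estimates from two regressions; the second omits x2
betas = [{'x1': 1.25, 'x2': -0.5}, {'x1': 1.1}]
ses = [{'x1': 0.1, 'x2': 0.25}, {'x1': 0.12}]

fragment = tabular_fragment(['x1', 'x2'], betas, ses)
print(fragment)
```

Note that this sketch does not escape LaTeX special characters, so variable names containing underscores would need to be relabeled first.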

Misc. Data Manipulation Tools

  • stata_merge wraps pandas.merge and adds many of the niceties of Stata's merge, such as a '_m' flag for successfully merged observations.
  • group_id generates an ID based on the variables passed (compare Stata's egen group).
  • Crosswalks of commonly used U.S. state labels.
    • State abbreviation to state name (and reverse).
    • State FIPS code to state name (and reverse).
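
As a rough picture of what these helpers build on, the sketch below uses plain pandas only and does not call stata_merge or group_id, whose exact arguments and output may differ. It assumes made-up example data, recodes the pandas merge indicator to the numeric codes Stata uses for `_merge` (1 = left only, 2 = right only, 3 = both), and numbers the unique combinations of two variables the way `egen group` does, except that pandas starts counting at 0.

```python
import pandas as pd

left = pd.DataFrame({'state': ['CA', 'NY', 'TX'], 'pop': [39, 19, 29]})
right = pd.DataFrame({'state': ['CA', 'TX', 'WA'], 'area': [164, 269, 71]})

# pandas can report where each row came from via `indicator`
merged = pd.merge(left, right, on='state', how='outer', indicator='_m')

# Recode the pandas labels to Stata-style numeric codes
stata_codes = {'left_only': 1, 'right_only': 2, 'both': 3}
merged['_m'] = merged['_m'].astype(str).map(stata_codes)
print(merged)

# Group ID: one integer per unique (state, year) combination
panel = pd.DataFrame({
    'state': ['CA', 'CA', 'NY', 'CA'],
    'year': [2000, 2001, 2000, 2000],
})
panel['gid'] = panel.groupby(['state', 'year']).ngroup()
print(panel)
```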

Data I/O

  • read and write: Use the passed file path's extension to determine which pandas I/O method to use. Useful for writing functions that programmatically read DataFrames from disk which are saved in different formats. See examples above and below.

  • load_or_build: A function decorator that caches datasets to disk. This function builds the requested dataset and saves it to disk if it doesn't already exist on disk. If the dataset is already saved, it simply loads it, saving computational time and allowing the use of a single function to both load and build data.

    from econtools import load_or_build, read

    @load_or_build('my_data_file.dta')
    def build_my_data_file():
        """
        Cleans raw data from CSV format and saves as Stata DTA.
        """
        df = read('raw_data.csv')
        # Clean the DataFrame
        return df

    File type is automatically detected from the passed filename. In this case, Stata DTA from my_data_file.dta.

  • save_cli: Simple wrapper for argparse that lets you use a --save flag on the command line. This lets you run a regression without overwriting the previous results and without modifying the code in any way (i.e., commenting out the "save" lines).

    In your regression script:

    from econtools import save_cli

    def regression_table(save=False):
        """Run a regression and save output if `save == True`."""
        # Regression guts


    if __name__ == '__main__':
        save = save_cli()
        regression_table(save=save)

    In the command line/bash script:

    python run_regression.py          # Runs regression without saving output
    python run_regression.py --save   # Runs regression and saves output
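
The extension-based I/O and the build-or-load caching described above can be sketched in a few lines of pandas. This is an illustration of the pattern, not econtools' source: `read_by_ext`, `write_by_ext`, `cache_to_disk`, and the extension tables are hypothetical names, and the real read, write, and load_or_build may support more formats and options.

```python
import functools
import os
import tempfile

import pandas as pd

# Hypothetical extension-to-method tables; econtools' own mapping may differ
READERS = {'.csv': pd.read_csv, '.dta': pd.read_stata, '.pkl': pd.read_pickle}
WRITERS = {
    '.csv': lambda df, path: df.to_csv(path, index=False),
    '.dta': lambda df, path: df.to_stata(path, write_index=False),
    '.pkl': lambda df, path: df.to_pickle(path),
}


def read_by_ext(path):
    """Pick the pandas reader from the file extension."""
    ext = os.path.splitext(path)[1].lower()
    return READERS[ext](path)


def write_by_ext(df, path):
    """Pick the pandas writer from the file extension."""
    ext = os.path.splitext(path)[1].lower()
    WRITERS[ext](df, path)


def cache_to_disk(path):
    """Decorator: load `path` if it exists; otherwise build, save, and return."""
    def decorator(build_func):
        @functools.wraps(build_func)
        def wrapper(*args, **kwargs):
            if os.path.exists(path):
                return read_by_ext(path)
            df = build_func(*args, **kwargs)
            write_by_ext(df, path)
            return df
        return wrapper
    return decorator


calls = []  # records how often the build function actually runs
cache_path = os.path.join(tempfile.mkdtemp(), 'my_data_file.csv')


@cache_to_disk(cache_path)
def build_my_data_file():
    calls.append(1)
    return pd.DataFrame({'x': [1, 2, 3]})


first = build_my_data_file()   # builds the data and writes the CSV
second = build_my_data_file()  # loads the CSV; the build function is not re-run
```

One consequence of this design is that a stale cache file is never rebuilt automatically; in a sketch like this one, deleting the file is the way to force a rebuild.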

Requirements

  • Python 3.6+
  • pandas and its dependencies (NumPy, etc.)
  • SciPy and its dependencies
  • PyTables (optional, if you use HDF5 files)
  • pytest (optional, if you want to run the tests)