dswah / Pygam

Licence: apache-2.0
[HELP REQUESTED] Generalized Additive Models in Python

pyGAM

Generalized Additive Models in Python.

Documentation

Installation

pip install pygam

scikit-sparse

To speed up optimization on large models with constraints, it helps to have scikit-sparse installed because it contains a slightly faster, sparse version of Cholesky factorization. The import from scikit-sparse references nose, so you'll need that too.

The easiest way is to use Conda:
conda install -c conda-forge scikit-sparse nose

scikit-sparse docs
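
After installing, one quick way to confirm that the optional dependency is visible to your interpreter is sketched below. It assumes that scikit-sparse is imported under the name sksparse; the helper has_sksparse is a hypothetical name used only for this illustration and is not part of pyGAM.

```python
import importlib.util


def has_sksparse():
    """Return True if the scikit-sparse package (import name: sksparse) can be found."""
    # find_spec on a top-level name only locates the package; it does not import it,
    # so this check will not fail if nose happens to be missing.
    return importlib.util.find_spec("sksparse") is not None


print("scikit-sparse available:", has_sksparse())
```

If this prints False, pyGAM should still work; only the faster sparse Cholesky path described above would be unavailable.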

Contributing - HELP REQUESTED

Contributions are most welcome!

You can help pyGAM in many ways including:

  • Working on a known bug.
  • Trying it out and reporting bugs or what was difficult.
  • Helping improve the documentation.
  • Writing new distributions and link functions.
  • If you need some ideas, please take a look at the issues.

To start:

  • Fork the project and cut a new branch.
  • Install the testing dependencies:
conda install pytest numpy pandas scipy pytest-cov cython
pip install --upgrade pip
pip install -r requirements.txt

It helps to add a symlink of the forked project to your Python path. To do this, install flit:

  • pip install flit
  • Then, from the main project folder (i.e. .../pyGAM), run: flit install -s

Make some changes and write a test...

  • Test your contribution (e.g. from .../pyGAM): py.test -s
  • When you are happy with your changes, make a pull request into the master branch of the main project.

About

Generalized Additive Models (GAMs) are smooth semi-parametric models of the form:

    g(E[y|X]) = β_0 + f_1(X_1) + f_2(X_2) + ... + f_p(X_p)

where X.T = [X_1, X_2, ..., X_p] are independent variables, y is the dependent variable, and g() is the link function that relates our predictor variables to the expected value of the dependent variable.

The feature functions f_i() are built using penalized B-splines, which allow us to automatically model non-linear relationships without having to manually try out many different transformations on each variable.
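
To make the idea of a penalized B-spline concrete, here is a minimal pure-Python sketch in the style of Eilers & Marx (reference 4). It is not pyGAM's implementation: the function names, the uniform knot layout, and the second-order difference penalty are assumptions chosen for illustration. Each f_i is a weighted sum of local basis functions, and the penalty discourages neighbouring weights from wiggling.

```python
def uniform_knots(lo, hi, n_intervals, degree):
    """Equally spaced knots on [lo, hi], extended by `degree` knots on each side."""
    h = (hi - lo) / n_intervals
    return [lo + (i - degree) * h for i in range(n_intervals + 2 * degree + 1)]


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x using the Cox-de Boor recursion."""
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        raised = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            raised.append(left + right)
        basis = raised
    return basis


def spline_value(x, coefs, knots, degree):
    """Feature function f(x) = sum_j coefs[j] * B_j(x)."""
    return sum(c * b for c, b in zip(coefs, bspline_basis(x, knots, degree)))


def difference_penalty(coefs, lam):
    """Smoothing penalty: lam times the sum of squared second differences of coefs."""
    return lam * sum((coefs[i + 2] - 2 * coefs[i + 1] + coefs[i]) ** 2
                     for i in range(len(coefs) - 2))


if __name__ == "__main__":
    knots = uniform_knots(0.0, 1.0, n_intervals=5, degree=3)
    # Inside [0, 1) the basis functions sum to one, so constant weights
    # reproduce a constant function.
    print(sum(bspline_basis(0.37, knots, 3)))
    # Linearly increasing weights are not penalized; oscillating weights are.
    print(difference_penalty([0.0, 1.0, 2.0, 3.0], lam=1.0))
    print(difference_penalty([0.0, 1.0, 0.0, 1.0], lam=1.0))
```

In a fit, the weights would be chosen to minimize the model's loss plus this penalty, so a larger lam yields a smoother f_i.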

GAMs extend generalized linear models by allowing non-linear functions of features while maintaining additivity. Since the model is additive, it is easy to examine the effect of each X_i on Y individually while holding all other predictors constant.
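
The additive structure can be illustrated with a small sketch that does not use pyGAM at all. The feature functions, intercept, and helper names below are hypothetical stand-ins for fitted quantities. The point is that, on the scale of the link function, changing X_1 shifts the prediction by the same amount whatever value X_2 takes.

```python
import math


def identity_inverse(eta):
    """Inverse of the identity link: the expected value equals the linear predictor."""
    return eta


def logit_inverse(eta):
    """Inverse of the logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def gam_predict(x, intercept, feature_functions, inverse_link):
    """Compute E[y|X] = g^{-1}(intercept + sum_i f_i(x_i)) for one observation x."""
    eta = intercept + sum(f(x_i) for f, x_i in zip(feature_functions, x))
    return inverse_link(eta)


# Hypothetical feature functions standing in for fitted splines.
feature_functions = [math.sin, lambda v: 0.5 * v ** 2]
intercept = 0.3

if __name__ == "__main__":
    # Effect of moving X_1 from 0 to 1, at two different fixed values of X_2.
    for x2 in (0.0, 2.0):
        low = gam_predict([0.0, x2], intercept, feature_functions, identity_inverse)
        high = gam_predict([1.0, x2], intercept, feature_functions, identity_inverse)
        print("X_2 =", x2, "effect of X_1:", high - low)
    # With a non-identity link the same additivity holds for g(E[y|X]),
    # while the prediction itself is mapped into (0, 1).
    print(gam_predict([1.0, 2.0], intercept, feature_functions, logit_inverse))
```

Both printed effects equal f_1(1) - f_1(0) up to floating-point rounding, which is why each f_i can be plotted and interpreted on its own.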

The result is a very flexible model, where it is easy to incorporate prior knowledge and control overfitting.

Citing pyGAM

Please consider citing pyGAM if it has helped you in your research or work:

Daniel Servén, & Charlie Brummitt. (2018, March 27). pyGAM: Generalized Additive Models in Python. Zenodo. DOI: 10.5281/zenodo.1208723

BibTeX:

@misc{daniel_serven_2018_1208723,
  author       = {Daniel Servén and
                  Charlie Brummitt},
  title        = {pyGAM: Generalized Additive Models in Python},
  month        = mar,
  year         = 2018,
  doi          = {10.5281/zenodo.1208723},
  url          = {https://doi.org/10.5281/zenodo.1208723}
}

References

  1. Simon N. Wood, 2006
    Generalized Additive Models: an introduction with R

  2. Hastie, Tibshirani, Friedman
    The Elements of Statistical Learning
    http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf

  3. James, Witten, Hastie and Tibshirani
    An Introduction to Statistical Learning
    http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Sixth%20Printing.pdf

  4. Paul Eilers & Brian Marx, 1996
    Flexible Smoothing with B-splines and Penalties
    http://www.stat.washington.edu/courses/stat527/s13/readings/EilersMarx_StatSci_1996.pdf

  5. Kim Larsen, 2015
    GAM: The Predictive Modeling Silver Bullet
    http://multithreaded.stitchfix.com/assets/files/gam.pdf

  6. Deva Ramanan, 2008
    UCI Machine Learning: Notes on IRLS
    http://www.ics.uci.edu/~dramanan/teaching/ics273a_winter08/homework/irls_notes.pdf

  7. Paul Eilers & Brian Marx, 2015
    International Biometric Society: A Crash Course on P-splines
    http://www.ibschannel2015.nl/project/userfiles/Crash_course_handout.pdf

  8. Keiding, Niels, 1991
    Age-specific incidence and prevalence: a statistical perspective
