
yromano / cqr

License: MIT
Conformalized Quantile Regression

Programming Languages

  • Jupyter Notebook
  • Python
  • R

Projects that are alternatives of or similar to cqr

Github-Stars-Predictor
A GitHub repo star predictor that tries to predict the star count of any GitHub repository with more than 100 stars.
Stars: ✭ 34 (-77.63%)
Mutual labels:  random-forest, prediction
STOCK-RETURN-PREDICTION-USING-KNN-SVM-GUASSIAN-PROCESS-ADABOOST-TREE-REGRESSION-AND-QDA
Forecast stock prices using a machine learning approach based on time series analysis. Employs predictive modeling to forecast stock returns, an approach used by hedge funds to select tradeable stocks.
Stars: ✭ 94 (-38.16%)
Mutual labels:  random-forest, prediction
Topics-In-Modern-Statistical-Learning
Materials for STAT 991: Topics In Modern Statistical Learning (UPenn, 2022 Spring) - uncertainty quantification, conformal prediction, calibration, etc
Stars: ✭ 74 (-51.32%)
Mutual labels:  prediction, conformal-prediction
forestError
A Unified Framework for Random Forest Prediction Error Estimation
Stars: ✭ 23 (-84.87%)
Mutual labels:  random-forest, prediction
python-neuron
Neuron class provides LNU, QNU, RBF, MLP, MLP-ELM neurons
Stars: ✭ 38 (-75%)
Mutual labels:  prediction
Chefboost
A lightweight decision tree framework for Python supporting regular algorithms (ID3, C4.5, CART, CHAID and regression trees) and some advanced techniques (gradient boosting: GBDT, GBRT, GBM; random forest; AdaBoost), with categorical feature support.
Stars: ✭ 176 (+15.79%)
Mutual labels:  random-forest
Machine Learning Is All You Need
🔥🌟《Machine Learning 格物志》: ML + DL + RL basic codes and notes by sklearn, PyTorch, TensorFlow, Keras & the most important, from scratch!💪 This repository is ALL You Need!
Stars: ✭ 173 (+13.82%)
Mutual labels:  random-forest
Emlearn
Machine Learning inference engine for Microcontrollers and Embedded devices
Stars: ✭ 154 (+1.32%)
Mutual labels:  random-forest
Market-Trend-Prediction
This is a project from a knowledge graph course. It leverages historical stock prices and integrates social media listening from customers to predict the market trend of the Dow Jones Industrial Average (DJIA).
Stars: ✭ 57 (-62.5%)
Mutual labels:  prediction
Data-Science
Using Kaggle Data and Real World Data for Data Science and prediction in Python, R, Excel, Power BI, and Tableau.
Stars: ✭ 15 (-90.13%)
Mutual labels:  prediction
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+1973.68%)
Mutual labels:  random-forest
Tensorflow Ml Nlp
Natural language processing with TensorFlow and machine learning (from logistic regression to Transformer chatbots).
Stars: ✭ 176 (+15.79%)
Mutual labels:  random-forest
decision-trees-for-ml
Building Decision Trees From Scratch In Python
Stars: ✭ 61 (-59.87%)
Mutual labels:  random-forest
Randomforestexplainer
A set of tools to understand what is happening inside a Random Forest
Stars: ✭ 175 (+15.13%)
Mutual labels:  random-forest
doc2vec pymongo
Machine learning prediction of movie genres using Gensim's Doc2Vec and PyMongo (Python, MongoDB).
Stars: ✭ 36 (-76.32%)
Mutual labels:  prediction
Machine Learning Models
Decision Trees, Random Forest, Dynamic Time Warping, Naive Bayes, KNN, Linear Regression, Logistic Regression, Mixture Of Gaussian, Neural Network, PCA, SVD, Gaussian Naive Bayes, Fitting Data to Gaussian, K-Means
Stars: ✭ 160 (+5.26%)
Mutual labels:  random-forest
Quickml
A fast and easy-to-use decision tree learner in Java
Stars: ✭ 230 (+51.32%)
Mutual labels:  random-forest
RVM-MATLAB
MATLAB code for Relevance Vector Machine using SB2_Release_200.
Stars: ✭ 38 (-75%)
Mutual labels:  prediction
Shifu
An end-to-end machine learning and data mining framework on Hadoop
Stars: ✭ 207 (+36.18%)
Mutual labels:  random-forest
Textclassification
several methods for text classification
Stars: ✭ 180 (+18.42%)
Mutual labels:  random-forest

Reliable Predictive Inference

An important factor in guaranteeing the responsible use of data-driven recommendation systems is the ability to communicate their uncertainty to decision makers. This can be accomplished by constructing prediction intervals, which provide an intuitive measure of the limits of predictive performance.

This package contains a Python implementation of the conformalized quantile regression (CQR) [1] methodology for constructing marginal, distribution-free prediction intervals. It also implements the equalized coverage framework [2], which builds valid group-conditional prediction intervals.

Conformalized Quantile Regression [1]

CQR is a technique for constructing prediction intervals that attain valid coverage in finite samples, without making distributional assumptions. It combines the statistical efficiency of quantile regression with the distribution-free coverage guarantee of conformal prediction. On one hand, CQR is flexible in that it can wrap around any algorithm for quantile regression, including random forests and deep neural networks. On the other hand, a key strength of CQR is its rigorous control of the miscoverage rate, independent of the underlying regression algorithm.

[1] Yaniv Romano, Evan Patterson, and Emmanuel J. Candès, “Conformalized quantile regression.” 2019.
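
Below is a minimal sketch of the split-CQR recipe, using scikit-learn's quantile gradient boosting as the underlying quantile regressor (the package itself wraps other learners, such as quantile random forests and deep networks); the variable names are illustrative and not this package's API.

# Split CQR: fit lower/upper quantile regressors on a training set,
# then conformalize the resulting interval on a held-out calibration set.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

alpha = 0.1  # target miscoverage rate (90% intervals)

# toy heteroscedastic data
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, size=(2000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3 * X[:, 0])

# split into a proper training set and a calibration set
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

# fit lower and upper conditional quantile estimators on the training set
lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_train, y_train)
hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_train, y_train)

# conformity scores: how far each calibration point falls outside (positive)
# or inside (negative) the estimated quantile band
scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))

# finite-sample-corrected empirical quantile of the scores
n = len(y_cal)
k = int(np.ceil((n + 1) * (1 - alpha)))
q = np.sort(scores)[min(k, n) - 1]

# conformalized interval for new points: [lo(x) - q, hi(x) + q]
X_test = np.linspace(0, 5, 10).reshape(-1, 1)
lower, upper = lo.predict(X_test) - q, hi.predict(X_test) + q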

Equalized Coverage [2]

To support equitable treatment, the equalized coverage methodology forces the construction of the prediction intervals to be unbiased in the sense that their coverage must be equal across all protected groups of interest. Similar to CQR and conformal inference, equalized coverage offers rigorous distribution-free guarantees that hold in finite samples. This methodology can also be viewed as a wrapper around any predictive algorithm.

[2] Y. Romano, R. F. Barber, C. Sabatti, and E. J. Candès, “With malice towards none: Assessing uncertainty via equalized coverage.” 2019.
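
As an illustration of the group-conditional calibration idea (a sketch only, not this package's API), the conformal correction can be computed separately within each protected group, so that every group attains its own 1 - alpha coverage:

# Group-conditional conformal corrections from calibration conformity scores.
import numpy as np

def group_conditional_corrections(scores, groups, alpha=0.1):
    # scores: conformity scores on the calibration set (as in the CQR sketch above)
    # groups: protected-group label of each calibration point
    corrections = {}
    for g in np.unique(groups):
        s = np.sort(scores[groups == g])
        n = len(s)
        k = int(np.ceil((n + 1) * (1 - alpha)))  # per-group finite-sample correction
        corrections[g] = s[min(k, n) - 1]
    return corrections

# toy usage: each test point's quantile-regression interval is widened by the
# correction of its own group, i.e. [lo(x) - q_g, hi(x) + q_g]
rng = np.random.RandomState(0)
scores = rng.normal(size=200)
groups = rng.randint(0, 2, size=200)
print(group_conditional_corrections(scores, groups, alpha=0.1))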

Getting Started

This package is self-contained and implemented in Python.

Part of the code is taken from the nonconformist package, available at https://github.com/donlnz/nonconformist. Refer to the nonconformist repository for other applications of conformal prediction.

Prerequisites

  • python
  • numpy
  • scipy
  • scikit-learn
  • scikit-garden
  • pytorch
  • pandas
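
The prerequisites listed above can typically be installed with pip (the PyPI package names below are assumptions; in particular, PyTorch is published on PyPI as torch, and its installation may differ by platform):

pip install numpy scipy scikit-learn scikit-garden torch pandas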

Installing

The development version is available on GitHub:

git clone https://github.com/yromano/cqr.git

Usage

CQR

Please refer to cqr_real_data_example.ipynb for basic usage. Comparisons to competitive methods and additional usage examples of this package can be found in cqr_synthetic_data_example_1.ipynb and cqr_synthetic_data_example_2.ipynb.

Equalized Coverage

The notebook detect_prediction_bias_example.ipynb performs a simple analysis of the MEPS 21 data set and detects bias in the predictions. The notebook equalized_coverage_example.ipynb illustrates how to run the methods proposed in [2] and construct prediction intervals with equal coverage across groups.

Reproducible Research

The code available under /reproducible_experiments/ in the repository replicates the experimental results in [1] and [2].

Publicly Available Datasets

  • Blog: BlogFeedback data set.

  • Bio: Physicochemical properties of protein tertiary structure data set.

  • Bike: Bike sharing data set.

  • Community: Communities and crime data set.

  • STAR: C.M. Achilles, Helen Pate Bain, Fred Bellott, Jayne Boyd-Zaharias, Jeremy Finn, John Folger, John Johnston, and Elizabeth Word. Tennessee’s Student Teacher Achievement Ratio (STAR) project, 2008.

  • Concrete: Concrete compressive strength data set.

  • Facebook Variant 1 and Variant 2: Facebook comment volume data set.

Data subject to copyright/usage rules

The Medical Expenditure Panel Survey (MEPS) data can be downloaded using the code in the folder /get_meps_data/ under this repository. The download code is based on the explanation and code provided by IBM's AIF360.

  • MEPS_19: Medical expenditure panel survey, panel 19.

  • MEPS_20: Medical expenditure panel survey, panel 20.

  • MEPS_21: Medical expenditure panel survey, panel 21.

License

This project is licensed under the MIT License - see the LICENSE file for details.
