valeria-io / bias-in-credit-models

Licence: other
Examples of unfairness detection for a classification-based credit model

Programming Languages

Jupyter Notebook
Python

Projects that are alternatives of or similar to bias-in-credit-models

responsible-ai-toolbox
This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
Stars: ✭ 615 (+3316.67%)
Mutual labels:  fairness, fairness-ai
wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!
Stars: ✭ 164 (+811.11%)
Mutual labels:  bias-detection, fairness-ai
LFM1b-analyses
Python scripts for studying bias in recommender systems
Stars: ✭ 18 (+0%)
Mutual labels:  fairness
deep-explanation-penalization
Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" https://arxiv.org/abs/1909.13584
Stars: ✭ 110 (+511.11%)
Mutual labels:  fairness
interpretable-ml
Techniques & resources for training interpretable ML models, explaining ML models, and debugging ML models.
Stars: ✭ 17 (-5.56%)
Mutual labels:  fairness
coursera-gan-specialization
Programming assignments and quizzes from all courses within the GANs specialization offered by deeplearning.ai
Stars: ✭ 277 (+1438.89%)
Mutual labels:  bias-detection
ClimateTools.jl
Climate science package for Julia
Stars: ✭ 108 (+500%)
Mutual labels:  bias-correction
algorithm-ethics
A collection of resources and tools designed to provide guidelines for ethical modeling.
Stars: ✭ 57 (+216.67%)
Mutual labels:  ethical-data-science
cqr
Conformalized Quantile Regression
Stars: ✭ 152 (+744.44%)
Mutual labels:  fairness
Awesome Machine Learning Interpretability
A curated list of awesome machine learning interpretability resources.
Stars: ✭ 2,404 (+13255.56%)
Mutual labels:  fairness
fairlens
Identify bias and measure fairness of your data
Stars: ✭ 51 (+183.33%)
Mutual labels:  fairness
aequitas
Fairness regulator and rate limiter
Stars: ✭ 49 (+172.22%)
Mutual labels:  fairness
vania
A module which fairly distributes a list of arbitrary objects among a set of targets, considering weights.
Stars: ✭ 75 (+316.67%)
Mutual labels:  fairness
easyFL
An experimental platform to quickly realize and compare with popular centralized federated learning algorithms. A realization of federated learning algorithm on fairness (FedFV, Federated Learning with Fair Averaging, https://fanxlxmu.github.io/publication/ijcai2021/) was accepted by IJCAI-21 (https://www.ijcai.org/proceedings/2021/223).
Stars: ✭ 104 (+477.78%)
Mutual labels:  fairness
themis-ml
A library that implements fairness-aware machine learning algorithms
Stars: ✭ 93 (+416.67%)
Mutual labels:  fairness
FairAI
This is a collection of papers and other resources related to fairness.
Stars: ✭ 55 (+205.56%)
Mutual labels:  fairness
Open-Sentencing
To help public defenders better serve their clients, Open Sentencing shows racial bias in data such as demographics providing insights for each case
Stars: ✭ 69 (+283.33%)
Mutual labels:  bias-detection
ml-fairness-framework
FairPut - Machine Learning Fairness Framework with LightGBM — Explainability, Robustness, Fairness (by @firmai)
Stars: ✭ 59 (+227.78%)
Mutual labels:  fairness-ai

bias-in-credit-models

Machine learning is increasingly deployed for large-scale decision making that can strongly affect the lives of individuals. If we fail to consider and analyse such scenarios, we may end up building models that treat groups in society unequally and even infringe anti-discrimination laws.

There are several algorithmic interventions for identifying unfair treatment, each based on a particular notion of what is considered fair. This project shows how these interventions can be applied in a case study using a classification-based credit model.

Case Study Outline

I made use of a public loan book from Bondora, a P2P lending platform based in Estonia, and looked into two protected attributes: gender and age.

Bondora lends to less credit-worthy customers, so default rates are much higher than those seen at traditional banks and the interest collected is significantly higher. On average, loans in this dataset were for around €2,100, with a payment duration of 38 months and an interest rate of 26.30%.

For traditional banks, the cost of a false positive (misclassifying a defaulting loan) is many times greater than the reward of a true positive (correctly classifying a non-defaulting loan). Given the higher interest rates collected by Bondora, I will assume for illustration purposes that the cost of a false positive is only twice the reward of a true positive, i.e. a reward-to-cost ratio of 1:2. This ratio is used to find the thresholds that maximise profit while meeting the requirements of each algorithmic intervention.
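As a minimal sketch of that threshold search (assuming the true labels and predicted default probabilities are NumPy arrays named defaulted and p_default; these names and the candidate grid are illustrative, not taken from the repository):

```python
import numpy as np

# Illustrative payoffs under the assumed 1:2 reward-to-cost ratio
REWARD_TP = 1.0  # reward for correctly granting a loan that is repaid
COST_FP = 2.0    # cost of granting a loan that defaults

def expected_profit(defaulted, p_default, threshold):
    """Profit if every loan with predicted default probability below the threshold is granted."""
    granted = p_default < threshold
    repaid = defaulted == 0
    true_positives = np.sum(granted & repaid)
    false_positives = np.sum(granted & ~repaid)
    return REWARD_TP * true_positives - COST_FP * false_positives

def best_threshold(defaulted, p_default, candidates=np.linspace(0, 1, 101)):
    """Sweep candidate thresholds and keep the most profitable one."""
    profits = [expected_profit(defaulted, p_default, t) for t in candidates]
    return candidates[int(np.argmax(profits))]
```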

I then developed a classification model, using Gradient Boosted Decision Trees, that predicts whether a loan is likely to be paid back. With the model's predictions, I analysed the following scenarios (a sketch of the constrained threshold search follows the list):

  • Maximise profit uses a different classification threshold for each group and aims only at maximising profit.
  • Fairness through unawareness uses the same classification threshold for all groups while maximising profit.
  • Demographic parity applies different classification thresholds for each group, while keeping the same fraction of positives in each group.
  • Equal opportunity uses different classification thresholds for each group, while keeping the same true positive rate in each group.
  • Equalised odds applies different classification thresholds for each group, while keeping the same true positive rate and false positive rate in each group.
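A hedged sketch of how the constrained variants can be implemented, shown here for equal opportunity (the DataFrame columns group, defaulted, and p_default are assumptions, not the actual names in the repository): for each group, pick the threshold whose true positive rate is closest to a common target.

```python
import numpy as np

def true_positive_rate(defaulted, p_default, threshold):
    """Fraction of actually repaid loans that are granted at this threshold."""
    granted = p_default < threshold
    repaid = defaulted == 0
    return granted[repaid].mean() if repaid.any() else 0.0

def equal_opportunity_thresholds(df, target_tpr, candidates=np.linspace(0, 1, 101)):
    """Per group, pick the threshold whose TPR is closest to the common target TPR.
    df is a DataFrame with columns 'group', 'defaulted' and 'p_default'."""
    thresholds = {}
    for group, sub in df.groupby("group"):
        tprs = np.array([true_positive_rate(sub["defaulted"].values,
                                            sub["p_default"].values, t)
                         for t in candidates])
        thresholds[group] = candidates[int(np.argmin(np.abs(tprs - target_tpr)))]
    return thresholds
```

Sweeping target_tpr over a grid and keeping the set of thresholds with the highest total profit (using the payoff function above) gives the equal opportunity solution; demographic parity and equalised odds follow the same pattern with their respective constraints.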

Project Structure

I) Data Cleaning

pre_process.py restructures the data, setting it in the right format and renaming fields as needed for visualisation. The file fill_missing_values.py also restructures the data and fills in missing values that will later be used in the modelling phase.
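For illustration only (the real logic lives in fill_missing_values.py; the column handling below is an assumption, not taken from the repository), missing values are typically filled per column type before modelling:

```python
import pandas as pd

def fill_missing_values(df: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric columns with the median and categorical columns with a placeholder."""
    df = df.copy()
    numeric_cols = df.select_dtypes(include="number").columns
    categorical_cols = df.select_dtypes(exclude="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
    df[categorical_cols] = df[categorical_cols].fillna("Unknown")
    return df
```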

II) Data Exploration

Both notebooks take the processed and restructured data and plot the distributions, correlations, and missing data.
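A minimal sketch of that kind of exploration with pandas and matplotlib (not the notebooks' actual code):

```python
import matplotlib.pyplot as plt
import pandas as pd

def explore(df: pd.DataFrame) -> None:
    df.hist(figsize=(12, 8))                                # distributions of numeric features
    plt.matshow(df.select_dtypes(include="number").corr())  # correlation matrix
    print(df.isna().mean().sort_values(ascending=False))    # share of missing values per column
    plt.show()
```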

III) Credit Model

This step performs a grid search to find the best Gradient Boosted Decision Trees model. After finding the best model, it saves the predictions and the original data as CSV.
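A hedged sketch of such a grid search with scikit-learn (the actual parameter grid, feature set, and gradient-boosting library used in the notebook may differ; X and y are assumed to be the processed features and the binary default label):

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# X: processed feature matrix, y: binary default label (assumed to exist already)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

param_grid = {
    "n_estimators": [100, 300],
    "learning_rate": [0.05, 0.1],
    "max_depth": [3, 5],
}
search = GridSearchCV(GradientBoostingClassifier(), param_grid, scoring="roc_auc", cv=5)
search.fit(X_train, y_train)

# Save the predicted default probabilities alongside the held-out data
results = X_test.copy()
results["p_default"] = search.predict_proba(X_test)[:, 1]
results["defaulted"] = y_test.values
results.to_csv("model_predictions.csv", index=False)
```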

IV) Model Analysis and Unfairness Detection

  • model_performance.ipynb: Reviews the performance of the model using ROC curves and AUC for 'Gender' and 'Age Group' (a sketch follows this list).
  • unfairness_measures.py: Finds the best thresholds for each protected class by maximising profit while meeting each algorithmic intervention's requirements, then saves all results as CSV.
  • model_fairness_interventions.ipynb: Reviews the results from unfairness_measures.py.
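An illustrative per-group ROC/AUC check (the variable and column names are assumptions, not the notebook's actual identifiers):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve

# `predictions` is assumed to hold one row per loan with columns
# 'Gender', 'defaulted' (true label) and 'p_default' (predicted probability)
for group, sub in predictions.groupby("Gender"):
    fpr, tpr, _ = roc_curve(sub["defaulted"], sub["p_default"])
    auc = roc_auc_score(sub["defaulted"], sub["p_default"])
    plt.plot(fpr, tpr, label=f"{group} (AUC = {auc:.2f})")

plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```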

More Information

For more information on each algorithmic intervention and the interpretation of the case study results, see: https://medium.com/@ValeriaCortezVD/preventing-discriminatory-outcomes-in-credit-models-39e1c6540353

References

Data

Bondora’s loan book. Available at: https://www.bondora.com/en/public-reports [Accessed August 18, 2018].

Main Literature

Barocas, S., Hardt, M. & Narayanan, A., 2018. Fairness and machine learning. Available at: http://fairmlbook.org/ [Accessed August 29, 2018].

Dwork, C. et al., 2012. Fairness Through Awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. ITCS ’12. New York, NY, USA: ACM, pp. 214–226.

Hardt, M. et al., 2016. Equality of opportunity in supervised learning. In Advances in neural information processing systems. pp. 3315–3323.

Pedreshi, D., Ruggieri, S. & Turini, F., 2008. Discrimination-aware Data Mining. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’08. New York, NY, USA: ACM, pp. 560–568.
