synthesized-io / fairlens

Licence: BSD-3-Clause License
Identify bias and measure fairness of your data

FairLens

FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quickly identify bias, and provides multiple metrics to measure fairness across a range of sensitive and legally protected characteristics such as age, race and sex.

Bias in my data?

A few lines of code are enough to start understanding any biases that may be present in your data.

import pandas as pd
import fairlens as fl

# Load in the data
df = pd.read_csv("datasets/compas.csv")

# Automatically generate a report
fscorer = fl.FairnessScorer(
    df,
    target_attribute="RawScore",
    sensitive_attributes=[
        "Sex",
        "Ethnicity",
        "MaritalStatus"
    ]
)
fscorer.demographic_report()

Sensitive Attributes: ['Ethnicity', 'MaritalStatus', 'Sex']

                         Group Distance  Proportion  Counts   P-Value
African-American, Single, Male    0.249    0.291011    5902 3.62e-251
      African-American, Single    0.202    0.369163    7487 1.30e-196
                       Married    0.301    0.134313    2724 7.37e-193
        African-American, Male    0.201    0.353138    7162 4.03e-188
                 Married, Male    0.281    0.108229    2195 9.69e-139
              African-American    0.156    0.444899    9023 3.25e-133
                      Divorced    0.321    0.063754    1293 7.51e-112
            Caucasian, Married    0.351    0.049504    1004 7.73e-106
                  Single, Male    0.121    0.582910   11822  3.30e-95
           Caucasian, Divorced    0.341    0.037473     760  1.28e-76

Weighted Mean Statistical Distance: 0.14081832462333957
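To make the report columns more concrete, here is a minimal pure-Python sketch of how a per-group distance, proportion, count and count-weighted mean could be computed. It is an illustration only, not FairLens's implementation: the two-sample Kolmogorov-Smirnov statistic is just one common statistical distance, comparing each group against the rest of the data is an assumption, weighting by group counts is an assumption, and the helper names and toy records are hypothetical.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def group_report(rows, sensitive_key, target_key):
    """Compare the target values of each subgroup against those of everyone else (assumption)."""
    report = []
    for value in sorted({row[sensitive_key] for row in rows}):
        inside = [r[target_key] for r in rows if r[sensitive_key] == value]
        outside = [r[target_key] for r in rows if r[sensitive_key] != value]
        if not outside:
            continue  # a single group has nothing to be compared against
        report.append({
            "group": value,
            "distance": ks_distance(inside, outside),
            "proportion": len(inside) / len(rows),
            "count": len(inside),
        })
    return report


def weighted_mean_distance(report):
    """Mean of the group distances, weighted by group size (assumed weighting)."""
    total = sum(entry["count"] for entry in report)
    if total == 0:
        raise ValueError("report is empty")
    return sum(entry["distance"] * entry["count"] for entry in report) / total


# Hypothetical toy records, not taken from the COMPAS dataset.
rows = [
    {"Sex": "Male", "RawScore": 3},
    {"Sex": "Male", "RawScore": 4},
    {"Sex": "Male", "RawScore": 5},
    {"Sex": "Female", "RawScore": 1},
    {"Sex": "Female", "RawScore": 2},
    {"Sex": "Female", "RawScore": 4},
]
report = group_report(rows, "Sex", "RawScore")
summary = weighted_mean_distance(report)
```

A distance near 0 means the target is distributed similarly inside and outside the group, while a distance near 1 means the two distributions barely overlap.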

Check out the documentation to get started, or try out FairLens now in Google Colab!

See some of our previous blog posts for our take on bias and fairness in ML.

Core Features

  • Bias Measurement - Metrics and tests that measure the extent and significance of bias in data using statistical distances. See the overview for more details.

  • Sensitive Attribute and Proxy Detection - Methods to identify legally protected features, and measure hidden correlations between these features and others.

  • Visualization Tools - Tools to visualize the distributions of different types of variables or columns in sensitive subgroups.

  • Fairness Assessment - A streamlined way of assessing the fairness of an arbitrary dataset, and generating reports highlighting biases and hidden correlations.
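As an illustration of the proxy detection idea in the list above, the following sketch ranks columns by how strongly they are associated with a sensitive column, using Cramér's V for categorical data. This is a hedged, self-contained example and not the FairLens API: the function names, the 0.8 threshold and the toy columns are all hypothetical, and FairLens may rely on different association measures.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two categorical sequences: 0 = no association, 1 = perfect."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    row_totals, col_totals = Counter(xs), Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column cannot be associated with anything
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def find_proxies(columns, sensitive_name, threshold=0.8):
    """Return (column, score) pairs whose association with the sensitive column meets the threshold."""
    sensitive = columns[sensitive_name]
    scores = [
        (name, cramers_v(sensitive, values))
        for name, values in columns.items()
        if name != sensitive_name
    ]
    return sorted(
        [(name, score) for name, score in scores if score >= threshold],
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical toy columns: "Title" mirrors "Sex" exactly, "Region" does not.
columns = {
    "Sex": ["M", "M", "F", "F"],
    "Title": ["Mr", "Mr", "Ms", "Ms"],
    "Region": ["North", "South", "North", "South"],
}
proxies = find_proxies(columns, "Sex")
```

Here `Title` would be flagged as a proxy for `Sex`, so dropping the `Sex` column alone would not remove that information from the data.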

The goal of FairLens is to give data scientists a deeper understanding of their data and to help ensure the fair and ethical use of data in analysis and machine learning tasks. The insights gained from FairLens can be harnessed by the Bias Mitigation feature of the Synthesized platform, which is able to automatically remove bias using the power of synthetic data.

Installation

FairLens can be installed using pip:

pip install fairlens

Contributing

FairLens is under active development, and we appreciate community contributions. See CONTRIBUTING.md for how to get started.

The repository's current roadmap is maintained as a GitHub project.

License

This project is licensed under the terms of the BSD 3-Clause license.
