Tutorial: fairness in regression

In this short tutorial, we show how to check whether a regression model discriminates against a particular subgroup using the dalex package.

This approach is experimental and we are grateful for all feedback. It was implemented according to Steinberg, D., et al. (2020).

This notebook aims to show how to detect bias in regression models. It does not cover fairness concepts or the interpretation of the plots in detail. Before starting, it is best to get familiar with our fairness in classification materials.

In [1]:
import pandas as pd 
import numpy as np

import plotly
plotly.offline.init_notebook_mode()

Data

We use the Communities and Crime data from the paper and aim to predict the ViolentCrimesPerPop variable (total number of violent crimes per 100K population).

The protected attribute is the racepctblack variable (the percentage of the population identifying as Black), which is the same one picked by the paper's authors.

In [2]:
from urllib.request import urlopen

# load the Communities and Crime data; '?' marks missing values
data = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.data", header=None, na_values=["?"])

# read the column names from the accompanying .names file
names = urlopen("http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.names")
columns = [line.split(b' ')[1].decode("utf-8") for line in names if line.startswith(b'@attribute')]
data.columns = columns

data = data.dropna(axis=1)  # drop columns with any missing values
data = data.iloc[:, 3:]     # drop the first three (non-predictive) columns
data.head()
Out[2]:
population householdsize racepctblack racePctWhite racePctAsian racePctHisp agePct12t21 agePct12t29 agePct16t24 agePct65up ... PctForeignBorn PctBornSameState PctSameHouse85 PctSameCity85 PctSameState85 LandArea PopDens PctUsePubTrans LemasPctOfficDrugUn ViolentCrimesPerPop
0 0.19 0.33 0.02 0.90 0.12 0.17 0.34 0.47 0.29 0.32 ... 0.12 0.42 0.50 0.51 0.64 0.12 0.26 0.20 0.32 0.20
1 0.00 0.16 0.12 0.74 0.45 0.07 0.26 0.59 0.35 0.27 ... 0.21 0.50 0.34 0.60 0.52 0.02 0.12 0.45 0.00 0.67
2 0.00 0.42 0.49 0.56 0.17 0.04 0.39 0.47 0.28 0.32 ... 0.14 0.49 0.54 0.67 0.56 0.01 0.21 0.02 0.00 0.43
3 0.04 0.77 1.00 0.08 0.12 0.10 0.51 0.50 0.34 0.21 ... 0.19 0.30 0.73 0.64 0.65 0.02 0.39 0.28 0.00 0.12
4 0.01 0.55 0.02 0.95 0.09 0.05 0.38 0.38 0.23 0.36 ... 0.11 0.72 0.64 0.61 0.53 0.04 0.09 0.02 0.00 0.03

5 rows × 100 columns

In [3]:
X = data.drop('ViolentCrimesPerPop', axis=1)
y = data.ViolentCrimesPerPop

Models

We fit two regression models: a simple and interpretable Decision Tree and a more complex and accurate Gradient Boosting model.

In [4]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor
In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
In [6]:
model = GradientBoostingRegressor()
model.fit(X_train, y_train)

model2 = DecisionTreeRegressor()
model2.fit(X_train, y_train)
Out[6]:
DecisionTreeRegressor()

Explainers

In the next step, we create the Explainer objects using dalex.

In [7]:
import dalex as dx
print(dx.__version__)
1.7.0
In [8]:
exp = dx.Explainer(model, X_test, y_test, verbose=False)
exp2 = dx.Explainer(model2, X_test, y_test, verbose=False)
In [9]:
pd.concat([exp2.model_performance().result, exp.model_performance().result])
Out[9]:
mse rmse r2 mae mad
DecisionTreeRegressor 0.033991 0.184366 0.300947 0.124188 0.070000
GradientBoostingRegressor 0.016992 0.130352 0.650551 0.088860 0.058268

Fairness

With the Explainers in place, we can assess the models' fairness. To check whether the models are fair, we will verify three (conditional) independence criteria. These are:

  • independence: R⊥A
  • separation: R⊥A ∣ Y
  • sufficiency: Y⊥A ∣ R

Where:

  • A - protected group
  • Y - target
  • R - model's prediction

In the approach described in Steinberg, D., et al. (2020), the authors propose a way of checking these independence criteria.

The method implemented in the dalex package is called Direct Density Ratio Estimation.
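
For intuition, the independence criterion can be linked to a density ratio via Bayes' rule: p(R | A=a) / p(R) = P(A=a | R) / P(A=a), so a probabilistic classifier predicting subgroup membership from the model's predictions estimates this ratio. Below is a minimal illustrative sketch of that idea, not the dalex implementation; the logistic regression classifier and the averaged ratio as a summary statistic are assumptions made for the example.

from sklearn.linear_model import LogisticRegression

# illustrative sketch only -- not the dalex implementation
R = model.predict(X_test).reshape(-1, 1)             # model predictions R
A = (X_test.racepctblack >= 0.5).astype(int).values  # protected-subgroup indicator A

clf = LogisticRegression().fit(R, A)                 # estimates P(A=1 | R)
ratio = clf.predict_proba(R)[:, 1] / A.mean()        # Bayes' rule: p(R | A=1) / p(R) = P(A=1 | R) / P(A=1)

# an average ratio far from 1 in the protected subgroup suggests R is not independent of A
print(ratio[A == 1].mean())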

In [10]:
protected = np.where(X_test.racepctblack >= 0.5, 'majority_black', "else")
privileged = 'else'
In [11]:
fobject = exp.model_fairness(protected, privileged)
fobject2 = exp2.model_fairness(protected, privileged)
In [12]:
fobject.fairness_check()
Bias detected in 2 metrics: independence, separation

Conclusion: your model is not fair because 2 or more criteria exceeded acceptable limits set by epsilon.

Ratios of metrics, based on 'else'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
                independence  separation  sufficiency
subgroup                                             
majority_black     10.554556    3.280994     1.080355
In [13]:
fobject2.fairness_check()
Bias detected in 3 metrics: independence, separation, sufficiency

Conclusion: your model is not fair because 2 or more criteria exceeded acceptable limits set by epsilon.

Ratios of metrics, based on 'else'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
                independence  separation  sufficiency
subgroup                                             
majority_black      2.899064    1.517669     1.636438
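
The acceptable range reported above comes from the epsilon parameter: a ratio is acceptable when it falls within (epsilon, 1/epsilon), so the default epsilon = 0.8 gives (0.8, 1.25). Here is a sketch of adjusting the threshold, assuming fairness_check accepts an epsilon argument as in the classification case:

fobject.fairness_check(epsilon=0.7)  # assumed parameter; widens the acceptable interval to (0.7, ~1.43)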

The models are biased!

The Decision Tree model violated all 3 criteria, while the Gradient Boosting model violated only 2. We can plot the fairness check in the same way as for classification.

In [14]:
fobject2.plot()

One can easily plot the models together.

In [15]:
fobject2.plot(fobject)

We can also plot the models' predictions using the density plot type.

In [16]:
fobject.plot(fobject2, type='density')

Moreover, the method can also confirm the absence of discrimination. To show this, let's pick another protected attribute; this time we choose racePctAsian.

In [17]:
protected = np.where(X_test.racePctAsian >= 0.5, 'majority_asian', "else")
privileged = 'else'

fobject = exp.model_fairness(protected, privileged)
fobject2 = exp2.model_fairness(protected, privileged)
In [18]:
fobject2.plot(fobject)

We can see that there is no discrimination towards the Asian community in these models (based on this data).

Summary

The new functionality allows the user to check the fairness of regression models. However, it should be noted that this is an experimental approach and the output of these methods should be treated as a suggestion rather than a definitive result.

Plots

This package uses plotly to render the plots.
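
A minimal sketch of customizing a plot, assuming the plot method accepts a show=False parameter and returns a plotly Figure, as dalex plot methods do elsewhere:

fig = fobject2.plot(fobject, show=False)  # assumed: returns the plotly Figure instead of displaying it
fig.update_layout(title="Fairness check for regression models")
fig.show()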

Resources - https://dalex.drwhy.ai/python