Bias Mitigation in Credit Scoring by Reweighting

Bias mitigation is the process of removing bias from a data set or a model in order to make it fair. Bias mitigation usually follows a bias detection step, in which a series of metrics are computed from a data set or from model predictions. Bias mitigation methods fall into three categories: pre-processing, in-processing, and post-processing. This example demonstrates a pre-processing method for mitigating bias in a credit scoring workflow. The example uses bias detection and bias mitigation functionality from Statistics and Machine Learning Toolbox™. For a detailed example of bias detection, see Explore Fairness Metrics for Credit Scoring Model.

The bias mitigation method in this example is reweighting, which adjusts the weights of the observations in a data set so that the subgroups of a sensitive attribute are treated fairly. After reweighting, the Statistical Parity Difference (SPD) of every subgroup is 0 and the Disparate Impact (DI) metric of every subgroup is 1 at the data level. This example demonstrates how reweighting works in a credit scoring workflow; a short sketch of the weight computation behind this result appears later in the example, after the reweighted training data.

Load Data

Load the CreditCardData data set and discretize the 'CustAge' predictor.

load CreditCardData.mat

AgeGroup = discretize(data.CustAge,[min(data.CustAge) 30 45 60 max(data.CustAge)], ...
    'categorical',{'Age < 30','30 <= Age < 45','45 <= Age < 60','Age >= 60'});

data = addvars(data,AgeGroup,'After','CustAge');
head(data)
    CustID    CustAge       AgeGroup       TmAtAddress    ResStatus     EmpStatus    CustIncome    TmWBank    OtherCC    AMBalance    UtilRate    status
    ______    _______    ______________    ___________    __________    _________    __________    _______    _______    _________    ________    ______

      1         53       45 <= Age < 60        62         Tenant        Unknown        50000         55         Yes       1055.9        0.22        0   
      2         61       Age >= 60             22         Home Owner    Employed       52000         25         Yes       1161.6        0.24        0   
      3         47       45 <= Age < 60        30         Tenant        Employed       37000         61         No        877.23        0.29        0   
      4         50       45 <= Age < 60        75         Home Owner    Employed       53000         20         Yes       157.37        0.08        0   
      5         68       Age >= 60             56         Home Owner    Employed       53000         14         Yes       561.84        0.11        0   
      6         65       Age >= 60             13         Home Owner    Employed       48000         59         Yes       968.18        0.15        0   
      7         34       30 <= Age < 45        32         Home Owner    Unknown        32000         26         Yes       717.82        0.02        1   
      8         50       45 <= Age < 60        57         Other         Employed       51000         33         No        3041.2        0.13        0   

Split the data set into training and testing data. Use the training data to fit the model and the testing data to predict from the model.

rng('default');
c = cvpartition(size(data,1),'HoldOut',0.3);
data_Train = data(c.training(),:);
data_Test = data(c.test(),:);

Compute Fairness Metrics at Predictor and Model Level

Compute the fairness metrics for the training data by creating a fairnessMetrics object and then generating a metrics report using report. Because you are working only with the data and there is no fitted model yet, the report contains only two bias metrics: StatisticalParityDifference and DisparateImpact. The two group metrics computed are GroupCount and GroupSizeRatio. The fairness metrics are computed for two sensitive attributes, age ('AgeGroup') and residential status ('ResStatus').

trainingDataMetrics = fairnessMetrics(data_Train, 'status', 'SensitiveAttributeNames',{'AgeGroup', 'ResStatus'});
tdmReport = report(trainingDataMetrics)
tdmReport=7×4 table
    SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact
    _______________________    ______________    ___________________________    _______________

           AgeGroup            Age < 30                   0.039827                   1.1357    
           AgeGroup            30 <= Age < 45             0.096324                   1.3282    
           AgeGroup            45 <= Age < 60                    0                        1    
           AgeGroup            Age >= 60                  -0.19181                  0.34648    
           ResStatus           Home Owner                        0                        1    
           ResStatus           Tenant                      0.01689                   1.0529    
           ResStatus           Other                      -0.02108                  0.93404    

figure
tiledlayout(2,1)
nexttile
plot(trainingDataMetrics,'spd')
nexttile
plot(trainingDataMetrics,'di')
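
For reference, you can reproduce the data-level values for a single subgroup directly from the group-wise rates of the response. The lines below are only an illustration of how StatisticalParityDifference and DisparateImpact are defined; they assume, as in the report above, that a status of 1 is the positive class and that '45 <= Age < 60' is the reference group.

pGroup = mean(data_Train.status(data_Train.AgeGroup == 'Age < 30'));       % P(status = 1 | Age < 30)
pRef   = mean(data_Train.status(data_Train.AgeGroup == '45 <= Age < 60')); % P(status = 1 | reference group)
spd_manual = pGroup - pRef   % compare with StatisticalParityDifference for 'Age < 30'
di_manual  = pGroup / pRef   % compare with DisparateImpact for 'Age < 30'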

Looking at the DisparateImpact bias metric for both AgeGroup and ResStatus, you can see that the values vary much more across the AgeGroup subgroups than across the ResStatus subgroups. This suggests that customers are treated more unfairly with respect to their age than with respect to their residential status. This example focuses on the AgeGroup predictor and attempts to reduce the bias among its subgroups.

To begin, fit a credit scoring model and compute the model-level bias metrics. This provides a baseline for comparison.

Because CustAge and AgeGroup are essentially the same predictor, and age is the sensitive attribute, exclude both from the model. Additionally, use 'status' as the response variable and 'CustID' as the ID variable. The exclusion list below also contains 'FairWeights', the column of fairness weights that you add to the training data later in this example, so that you can reuse PredictorVars when you fit the second model.

PredictorVars = setdiff(data_Train.Properties.VariableNames, ...
    {'CustAge','AgeGroup','CustID','FairWeights','status'});
sc1 = creditscorecard(data_Train,'IDVar','CustID', ...
    'PredictorVars',PredictorVars,'ResponseVar','status');
sc1 = autobinning(sc1);
sc1 = fitmodel(sc1,'VariableSelection','fullmodel');
Generalized linear regression model:
    logit(status) ~ 1 + TmAtAddress + ResStatus + EmpStatus + CustIncome + TmWBank + OtherCC + AMBalance + UtilRate
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE        tStat        pValue  
                   ________    ________    ________    __________

    (Intercept)     0.73924    0.077237      9.5711     1.058e-21
    TmAtAddress      1.2577     0.99118      1.2689       0.20448
    ResStatus         1.755       1.295      1.3552       0.17535
    EmpStatus       0.88652     0.32232      2.7504     0.0059516
    CustIncome      0.95991     0.19645      4.8862    1.0281e-06
    TmWBank           1.132      0.3157      3.5856    0.00033637
    OtherCC         0.85227      2.1198     0.40204       0.68765
    AMBalance        1.0773     0.31969      3.3698    0.00075232
    UtilRate       -0.19784     0.59565    -0.33214       0.73978


840 observations, 831 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 66.5, p-value = 2.44e-11
pointsinfo1 = displaypoints(sc1)
pointsinfo1=38×3 table
      Predictors              Bin            Points  
    _______________    _________________    _________

    {'TmAtAddress'}    {'[-Inf,9)'     }     -0.17538
    {'TmAtAddress'}    {'[9,16)'       }      0.05434
    {'TmAtAddress'}    {'[16,23)'      }     0.096897
    {'TmAtAddress'}    {'[23,Inf]'     }      0.13984
    {'TmAtAddress'}    {'<missing>'    }          NaN
    {'ResStatus'  }    {'Tenant'       }    -0.017688
    {'ResStatus'  }    {'Home Owner'   }      0.11681
    {'ResStatus'  }    {'Other'        }      0.29011
    {'ResStatus'  }    {'<missing>'    }          NaN
    {'EmpStatus'  }    {'Unknown'      }    -0.097582
    {'EmpStatus'  }    {'Employed'     }      0.33162
    {'EmpStatus'  }    {'<missing>'    }          NaN
    {'CustIncome' }    {'[-Inf,30000)' }     -0.61962
    {'CustIncome' }    {'[30000,36000)'}     -0.10695
    {'CustIncome' }    {'[36000,40000)'}    0.0010845
    {'CustIncome' }    {'[40000,42000)'}     0.065532
      ⋮

pd1 = probdefault(sc1,data_Test);

Set the threshold on the predicted probability of default that controls the allocation of "goods" and "bads."

threshold = 0.35;
predictions1 = double(pd1>threshold);

Create a fairnessMetrics object to compute fairness metrics at the model level and then generate a metrics report using report.

modelMetrics1 = fairnessMetrics(data_Test, 'status', 'Predictions', predictions1, 'SensitiveAttributeNames','AgeGroup'); 
mmReport1 = report(modelMetrics1)
mmReport1=4×7 table
    ModelNames    SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact    EqualOpportunityDifference    AverageAbsoluteOddsDifference
    __________    _______________________    ______________    ___________________________    _______________    __________________________    _____________________________

      Model1             AgeGroup            Age < 30                    0.54312                  2.6945                   0.47391                         0.5362           
      Model1             AgeGroup            30 <= Age < 45              0.19922                  1.6216                   0.35645                        0.22138           
      Model1             AgeGroup            45 <= Age < 60                    0                       1                         0                              0           
      Model1             AgeGroup            Age >= 60                  -0.15385                    0.52                  -0.18323                        0.16375           
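
The model-level report also includes EqualOpportunityDifference and AverageAbsoluteOddsDifference, which compare true positive and false positive rates between a subgroup and the reference group. The lines below show the standard definitions of these metrics, computed manually for the 'Age < 30' subgroup; this manual computation is for illustration only and may differ in details, such as the choice of positive class, from what fairnessMetrics computes.

grp = data_Test.AgeGroup == 'Age < 30';          % subgroup of interest
ref = data_Test.AgeGroup == '45 <= Age < 60';    % reference group
y    = data_Test.status;                         % observed labels
yhat = predictions1;                             % model predictions

tpr = @(m) mean(yhat(m & y == 1));               % true positive rate within a group
fpr = @(m) mean(yhat(m & y == 0));               % false positive rate within a group

eod_manual  = tpr(grp) - tpr(ref)                                     % EqualOpportunityDifference
aaod_manual = (abs(fpr(grp) - fpr(ref)) + abs(tpr(grp) - tpr(ref)))/2 % AverageAbsoluteOddsDifference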

Measure the accuracy of the model by using validatemodel.

validatemodel(sc1)
ans=4×2 table
            Measure              Value 
    ________________________    _______

    {'Accuracy Ratio'      }    0.33751
    {'Area under ROC curve'}    0.66876
    {'KS statistic'        }    0.26418
    {'KS score'            }     1.0403

figure
tiledlayout(2,1)
nexttile
plot(modelMetrics1,'spd')
nexttile
plot(modelMetrics1,'di')

Reweight Data at Predictor and Model Level

Use fairnessWeights to reweight the training data to remove bias for the sensitive attribute 'AgeGroup'.

fairWeights = fairnessWeights(data_Train, 'AgeGroup', 'status');
data_Train.FairWeights = fairWeights;
head(data_Train)
    CustID    CustAge       AgeGroup       TmAtAddress    ResStatus     EmpStatus    CustIncome    TmWBank    OtherCC    AMBalance    UtilRate    status    FairWeights
    ______    _______    ______________    ___________    __________    _________    __________    _______    _______    _________    ________    ______    ___________

       1        53       45 <= Age < 60        62         Tenant        Unknown        50000         55         Yes       1055.9        0.22        0         0.95879  
       2        61       Age >= 60             22         Home Owner    Employed       52000         25         Yes       1161.6        0.24        0         0.75407  
       3        47       45 <= Age < 60        30         Tenant        Employed       37000         61         No        877.23        0.29        0         0.95879  
       4        50       45 <= Age < 60        75         Home Owner    Employed       53000         20         Yes       157.37        0.08        0         0.95879  
       7        34       30 <= Age < 45        32         Home Owner    Unknown        32000         26         Yes       717.82        0.02        1         0.82759  
       8        50       45 <= Age < 60        57         Other         Employed       51000         33         No        3041.2        0.13        0         0.95879  
       9        50       45 <= Age < 60        10         Tenant        Unknown        52000         25         Yes       115.56        0.02        1          1.0992  
      10        49       45 <= Age < 60        30         Home Owner    Unknown        53000         23         Yes        718.5        0.17        1          1.0992  
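
One common reweighting scheme, reweighing as described by Kamiran and Calders, assigns each observation in age group a with response y the weight P(AgeGroup = a)*P(status = y)/P(AgeGroup = a, status = y), which makes the sensitive attribute statistically independent of the response in the weighted data. The following sketch recomputes weights this way so that you can compare them with the FairWeights column; it illustrates the idea and is not necessarily the exact computation that fairnessWeights performs.

A = data_Train.AgeGroup;                 % sensitive attribute
Y = data_Train.status;                   % binary response
w = zeros(height(data_Train),1);
for a = categories(A)'                   % loop over age groups
    for y = [0 1]                        % loop over response values
        inCell = (A == a{1}) & (Y == y);
        pA  = mean(A == a{1});           % P(AgeGroup = a)
        pY  = mean(Y == y);              % P(status = y)
        pAY = mean(inCell);              % P(AgeGroup = a, status = y)
        w(inCell) = pA*pY/pAY;           % expected over observed joint frequency
    end
end
max(abs(w - data_Train.FairWeights))     % compare with the weights from fairnessWeights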

Use fairnessMetrics to compute the fairness metrics for the training data after reweighting, and use report to generate a fairness metrics report.

trainingDataMetrics_AfterReweighting = fairnessMetrics(data_Train, 'status', 'SensitiveAttributeNames','AgeGroup','Weights','FairWeights');
tdmrReport = report(trainingDataMetrics_AfterReweighting)
tdmrReport=4×4 table
    SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact
    _______________________    ______________    ___________________________    _______________

           AgeGroup            Age < 30                  -2.9976e-15                   1       
           AgeGroup            30 <= Age < 45            -5.5511e-16                   1       
           AgeGroup            45 <= Age < 60                      0                   1       
           AgeGroup            Age >= 60                 -2.9421e-15                   1       

By applying the reweighting algorithm to the AgeGroup predictor, you completely remove the disparate impact for AgeGroup at the data level. You can then use this debiased data to fit a model whose predictions have an overall reduced disparate impact at the model level.

Use creditscorecard to fit a new credit scoring model with the new fair weights and compute model-level bias metrics.

sc2 = creditscorecard(data_Train,'IDVar','CustID', ...
     'PredictorVars',PredictorVars,'WeightsVar','FairWeights','ResponseVar','status');
sc2 = autobinning(sc2);
sc2 = fitmodel(sc2,'VariableSelection','fullmodel');
Generalized linear regression model:
    logit(status) ~ 1 + TmAtAddress + ResStatus + EmpStatus + CustIncome + TmWBank + OtherCC + AMBalance + UtilRate
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE        tStat        pValue  
                   ________    ________    ________    __________

    (Intercept)     0.74055    0.076222      9.7158    2.5817e-22
    TmAtAddress      1.3416      0.9108       1.473       0.14075
    ResStatus        2.0467      1.7669      1.1584       0.24672
    EmpStatus       0.91879     0.32197      2.8536     0.0043222
    CustIncome      0.91038     0.33216      2.7407       0.00613
    TmWBank          1.1067     0.30826      3.5901     0.0003305
    OtherCC         0.42264      3.5078     0.12049        0.9041
    AMBalance        1.1347      0.3447      3.2919    0.00099504
    UtilRate       -0.39861     0.77284    -0.51577       0.60601


840 observations, 831 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 46.6, p-value = 1.85e-07
pointsinfo2 = displaypoints(sc2)
pointsinfo2=34×3 table
      Predictors              Bin            Points 
    _______________    _________________    ________

    {'TmAtAddress'}    {'[-Inf,9)'     }    -0.21328
    {'TmAtAddress'}    {'[9,23)'       }     0.07168
    {'TmAtAddress'}    {'[23,Inf]'     }     0.14763
    {'TmAtAddress'}    {'<missing>'    }         NaN
    {'ResStatus'  }    {'Tenant'       }    0.016048
    {'ResStatus'  }    {'Home Owner'   }    0.091092
    {'ResStatus'  }    {'Other'        }     0.28326
    {'ResStatus'  }    {'<missing>'    }         NaN
    {'EmpStatus'  }    {'Unknown'      }    -0.10352
    {'EmpStatus'  }    {'Employed'     }     0.33653
    {'EmpStatus'  }    {'<missing>'    }         NaN
    {'CustIncome' }    {'[-Inf,30000)' }    -0.37618
    {'CustIncome' }    {'[30000,40000)'}    0.047483
    {'CustIncome' }    {'[40000,42000)'}     0.10244
    {'CustIncome' }    {'[42000,47000)'}     0.14652
    {'CustIncome' }    {'[47000,Inf]'  }     0.40015
      ⋮

pd2 = probdefault(sc2,data_Test);
predictions2 = double(pd2>threshold);

Use fairnessMetrics to compute fairness metrics at the model level and use report to generate a fairness metrics report.

modelMetrics2 = fairnessMetrics(data_Test, 'status', 'Predictions', predictions2, 'SensitiveAttributeNames','AgeGroup'); 
mmReport2 = report(modelMetrics2)
mmReport2=4×7 table
    ModelNames    SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact    EqualOpportunityDifference    AverageAbsoluteOddsDifference
    __________    _______________________    ______________    ___________________________    _______________    __________________________    _____________________________

      Model1             AgeGroup            Age < 30                    0.39394                  2.1818                   0.37391                        0.39377           
      Model1             AgeGroup            30 <= Age < 45             0.094298                  1.2829                   0.22947                        0.11509           
      Model1             AgeGroup            45 <= Age < 60                    0                       1                         0                              0           
      Model1             AgeGroup            Age >= 60                  -0.13333                     0.6                  -0.18323                         0.1511           

Measure the accuracy of the model by using validatemodel.

validatemodel(sc2)
ans=4×2 table
            Measure              Value 
    ________________________    _______

    {'Accuracy Ratio'      }    0.27735
    {'Area under ROC curve'}    0.63868
    {'KS statistic'        }    0.22702
    {'KS score'            }    0.90741

figure
tiledlayout(2,1)
nexttile
plot(modelMetrics2,'spd')
nexttile
plot(modelMetrics2,'di')

The process of reweighting removed all of the bias, as measured by statistical parity difference and disparate impact, from the training data. When you fit a model with the reweighted data, the overall bias in the model predictions is reduced compared to a model trained on the original, biased data. This reduction in bias comes with a drop in model accuracy, so you can choose whether to make this tradeoff to improve fairness.
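
To see the tradeoff in one place, you can tabulate the disparate impact from the two model-level reports next to the accuracy measures from validatemodel. The following snippet is one illustrative way to build this summary using the variables defined earlier in this example.

diComparison = table(mmReport1.Groups, mmReport1.DisparateImpact, mmReport2.DisparateImpact, ...
    'VariableNames',{'Groups','DI_Before','DI_After'})  % disparate impact before and after reweighting
stats1 = validatemodel(sc1);
stats2 = validatemodel(sc2);
accComparison = table(stats1.Measure, stats1.Value, stats2.Value, ...
    'VariableNames',{'Measure','Before','After'})       % accuracy measures before and after reweighting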

