Credit Scorecard Modeling Using Observation Weights
When creating a
creditscorecard object, the table used
for the input
data argument either defines or does not define
observational weights. If the
data does not use weights, then the
Odds are used by credit score card functions. However, if the
WeightsVar argument is specified when creating a
creditscorecard object, then the "counts" for
Odds are the
sum of weights.
For example, here is a snippet of an input table that does not define observational weights:
If you bin the customer age predictor data, with customers up to 45 years old in one bin, and 46 and up in another bin, you get these statistics:
Good means the total number of rows with a
value in the
status response variable.
1’s in the
Odds is the ratio of
Odds is reported for each bin. This means that there are 381
people in the sample who are 45 and under who paid their loans, 241 in the same age
range who defaulted, and therefore, the odds of being good for that age range is
Suppose that the modeler thinks that people 45 and younger are underrepresented in
this sample. The modeler wants to give all rows with ages up to 45 a higher weight.
Assume that the modeler thinks the up to 45 age group should have 50% more weight than
rows with ages 46 and up. The table data is expanded to include the observation weights.
Weight column is added to the table, where all rows with ages 45
and under have a weight of
1.5, and all other rows a weight of
1. There are other reasons to use weights, for example, recent
data points may be given higher weights than older data points.
If you bin the weighted data based on age (45 and under, versus 46 and up) the
expectation is that each row with age 45 and under must count as 1.5 observations, and
Bad “counts” are increased
The “counts” are now “weighted frequencies” and are no longer integer values. The
Odds do not change for the first bin. The particular weights
given in this example have the effect of scaling the total
Bad counts in the first bin by the same scaling factor, therefore
their ratio does not change. However, the
Odds value of the total
sample does change; the first bin now carries a higher weight, and because the odds in
that bin are lower, the total
Odds are now lower, too. Other credit
scorecard statistics not shown here, such as
Information Value are affected in a similar way.
In general, the effect of weights is not simply to scale frequencies in a particular bin, because members of that bin will have different weights. The goal of this example is to demonstrate the concept of switching from counts to the sum of weights.
- Bias Mitigation in Credit Scoring by Reweighting (Risk Management Toolbox)