Unitary results:
- Our best remediated EBM model achieved an AUC of 0.8097 after several post-processing steps, including outlier removal and sensitivity analysis under simulated economic recession conditions. This AUC was achieved while maintaining a minimum Adverse Impact Ratio (AIR) of 0.8.
- Best training/validation AUC (pre-remediation): 0.8247
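
As a rough illustration of the AIR threshold mentioned above, the sketch below computes an adverse impact ratio as the favorable-outcome rate of a protected group divided by that of a reference group, then checks it against 0.8. The function name, group labels, and toy decisions are hypothetical and are not taken from this model's data or pipeline.

```python
def adverse_impact_ratio(decisions, groups, protected, reference):
    """AIR = favorable-outcome rate of `protected` / rate of `reference`.

    `decisions` holds 1 for a favorable outcome and 0 otherwise;
    `groups` holds the group label for each row.
    """
    def favorable_rate(label):
        outcomes = [d for d, g in zip(decisions, groups) if g == label]
        if not outcomes:
            raise ValueError(f"no rows for group {label!r}")
        return sum(outcomes) / len(outcomes)

    reference_rate = favorable_rate(reference)
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return favorable_rate(protected) / reference_rate


# Hypothetical toy data: five reference rows, then five protected rows.
decisions = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
groups = ["ref"] * 5 + ["prot"] * 5

air = adverse_impact_ratio(decisions, groups, protected="prot", reference="ref")
passes_four_fifths = air >= 0.8  # 4/5ths rule threshold

print(f"AIR = {air:.3f}, passes 4/5ths rule: {passes_four_fifths}")
```

In this toy case the protected group's favorable rate is 0.6 against 0.8 for the reference group, so the ratio falls below the 0.8 threshold.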
Intersectional results:
- Among the models explored (EBM, Ensemble, GBM, MGBM, and GLM), the EBM model produced the greatest fidelity to the true outcomes while maintaining the highest standard of fairness. We did not rely on AUC alone: we also compared the models across several evaluation metrics, namely ACC, AUC, Log Loss, F1, and MSE. Having established that the EBM outperformed the alternatives, we selected it as the best model and proceeded to remediation.
AUC (pre-remediation) of the alternative models:
- Ensemble: 0.8195
- Gradient Boosting Machine (GBM): 0.8183
- Monotonic Gradient Boosting Machine (MGBM): 0.8021
- Penalized Generalized Linear Model (GLM): 0.7628
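
To make the metric comparison concrete, here is a minimal standard-library sketch of how ACC, AUC, Log Loss, F1, and MSE could be computed for candidate models from predicted probabilities. The model names, labels, and probabilities are hypothetical toy values, and the 0.5 decision threshold is an assumption. None of it reproduces the scores reported in this card.

```python
import math


def auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_prob, threshold=0.5):
    """Return the five metrics for one model (threshold is an assumed cutoff)."""
    n = len(y_true)
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    eps = 1e-15  # clip probabilities so log() is always defined
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    return {
        "ACC": sum(1 for y, yh in zip(y_true, y_pred) if y == yh) / n,
        "AUC": auc(y_true, y_prob),
        "Log Loss": -sum(
            y * math.log(p) + (1 - y) * math.log(1 - p)
            for y, p in zip(y_true, clipped)
        ) / n,
        "F1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
        "MSE": sum((y - p) ** 2 for y, p in zip(y_true, y_prob)) / n,
    }


# Hypothetical labels and predicted probabilities for two toy models.
y_true = [0, 0, 1, 1]
predictions = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}

scores = {name: evaluate(y_true, probs) for name, probs in predictions.items()}
for name, metrics in scores.items():
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

Higher is better for ACC, AUC, and F1, while lower is better for Log Loss and MSE, so a model should be judged on the whole row, not on a single column.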
Partial Dependence Plots:
The partial dependence plot (PDP or PD plot for short) shows the “marginal effect one or two features have on the predicted outcome of a machine learning model” (J. H. Friedman 2001).
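
A one-feature partial dependence curve can be sketched as follows: for each grid value, the feature is overwritten in every row and the model's predictions are averaged. The `toy_predict` function, the feature names, and the rows are hypothetical stand-ins, not the model or data behind the plots in this card.

```python
def partial_dependence(predict, rows, feature, grid):
    """Return (grid value, average prediction) pairs for one feature.

    For each grid value, `feature` is set to that value in every row
    while all other features keep their observed values.
    """
    curve = []
    for value in grid:
        modified = [{**row, feature: value} for row in rows]
        predictions = [predict(row) for row in modified]
        curve.append((value, sum(predictions) / len(predictions)))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted model's prediction function."""
    return 0.1 * row["x1"] + 0.5 * row["x2"]


# Hypothetical observations with two features.
rows = [{"x1": 1.0, "x2": 0.0}, {"x1": 3.0, "x2": 2.0}]

pd_curve = partial_dependence(toy_predict, rows, feature="x1", grid=[0.0, 10.0])
for value, average in pd_curve:
    print(f"x1 = {value}: average prediction = {average:.2f}")
```

Plotting the grid values against the averaged predictions gives the PD curve. The original rows are left unmodified because each grid step works on copies.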



Global Model Variable Importance:
Global variable importance values indicate the magnitude of each variable’s contribution to model predictions across the entire dataset.
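
One common way to obtain such a global value from an additive model is to average the absolute per-row contribution of each feature. The sketch below assumes that approach. Whether it matches the method used for the importance values in this card is an assumption, and the feature names and contribution values are hypothetical.

```python
def global_importance(contributions):
    """Mean absolute per-row contribution for each feature, largest first.

    `contributions` is a list of dicts, one per row, mapping a feature
    name to that feature's (signed) contribution to the row's prediction.
    """
    n_rows = len(contributions)
    features = list(contributions[0].keys())
    importance = {
        feature: sum(abs(row[feature]) for row in contributions) / n_rows
        for feature in features
    }
    return dict(sorted(importance.items(), key=lambda item: item[1], reverse=True))


# Hypothetical local contributions for two rows and two features.
local_contributions = [
    {"feat_a": 0.4, "feat_b": -0.1},
    {"feat_a": -0.6, "feat_b": 0.1},
]

importance = global_importance(local_contributions)
for feature, value in importance.items():
    print(f"{feature}: {value:.2f}")
```

Taking absolute values keeps positive and negative contributions from cancelling out, so a feature that pushes predictions strongly in both directions still ranks as important.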

Ethical Considerations
- Although we use the 4/5ths rule (a minimum AIR of 0.8), machine learning models should aim for full parity where possible (i.e., 1-to-1 parity in classification outcomes)
- Pre-processing remediation techniques should be scrutinized for potential legal issues (e.g., manipulating data based on racial class could constitute affirmative action)
- Failure to perform bias testing and remediation of machine learning models can lead to discrimination, which can become self-reinforcing over time
- Our best model underperformed markedly when exposed to economic conditions mimicking a recession, which demonstrates that even the most carefully scrutinized training data can be undermined by shifting real-world conditions
- This model card does not constitute legal or compliance advice
- Further exploration is warranted for our models, but we provide a baseline here
Additional Reading
All models are wrong, but some are useful – George E. P. Box