On (assessing) the fairness of risk score models
Publication: Contribution to book/anthology/report › Conference paper in proceedings › Research › peer-reviewed
Standard
On (assessing) the fairness of risk score models. / Petersen, Eike; Ganz, Melanie; Holm, Sune; Feragen, Aasa.
Proceedings of the 6th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023. Association for Computing Machinery, Inc., 2023. pp. 817-829 (ACM International Conference Proceeding Series).
Bibtex
@inproceedings{petersen2023fairness,
  title     = {On (assessing) the fairness of risk score models},
  author    = {Petersen, Eike and Ganz, Melanie and Holm, Sune and Feragen, Aasa},
  booktitle = {Proceedings of the 6th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023},
  series    = {ACM International Conference Proceeding Series},
  publisher = {Association for Computing Machinery, Inc.},
  year      = {2023},
  pages     = {817--829},
  doi       = {10.1145/3593013.3594045},
  note      = {Publisher Copyright: {\copyright} 2023 ACM},
}
RIS
TY - GEN
T1 - On (assessing) the fairness of risk score models
AU - Petersen, Eike
AU - Ganz, Melanie
AU - Holm, Sune
AU - Feragen, Aasa
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023
Y1 - 2023
N2 - Recent work on algorithmic fairness has largely focused on the fairness of discrete decisions, or classifications. While such decisions are often based on risk score models, the fairness of the risk models themselves has received considerably less attention. Risk models are of interest for a number of reasons, including the fact that they communicate uncertainty about the potential outcomes to users, thus representing a way to enable meaningful human oversight. Here, we address fairness desiderata for risk score models. We identify the provision of similar epistemic value to different groups as a key desideratum for risk score fairness, and we show how even fair risk scores can lead to unfair risk-based rankings. Further, we address how to assess the fairness of risk score models quantitatively, including a discussion of metric choices and meaningful statistical comparisons between groups. In this context, we also introduce a novel calibration error metric that is less sample-size-biased than previously proposed metrics, enabling meaningful comparisons between groups of different sizes. We illustrate our methodology - which is widely applicable in many other settings - in two case studies, one in recidivism risk prediction, and one in risk of major depressive disorder (MDD) prediction.
KW - Algorithmic fairness
KW - Calibration
KW - Ethics
KW - Major depressive disorder
KW - Ranking
KW - Recidivism
KW - Risk scores
U2 - 10.1145/3593013.3594045
DO - 10.1145/3593013.3594045
M3 - Article in proceedings
AN - SCOPUS:85163674977
T3 - ACM International Conference Proceeding Series
SP - 817
EP - 829
BT - Proceedings of the 6th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023
PB - Association for Computing Machinery, Inc.
T2 - 6th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023
Y2 - 12 June 2023 through 15 June 2023
ER -
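The abstract discusses comparing calibration error across groups of different sizes. The paper's own, less sample-size-biased metric is not reproduced here; the sketch below shows only the standard binned expected calibration error (ECE), computed per group, to illustrate the kind of groupwise comparison at issue. The naive binned estimator is known to be biased upward, and more so for small groups, which is the problem the paper addresses. All data and names are illustrative.

```python
import numpy as np

def binned_ece(scores, labels, n_bins=10):
    """Standard binned ECE: weighted mean |outcome rate - mean score| per bin.

    Note: this naive estimator is biased, and the bias grows as the
    sample shrinks, which makes raw comparisons between groups of
    very different sizes misleading.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each score to a bin (clip so score == 1.0 lands in the last bin).
    idx = np.clip(np.digitize(scores, bins) - 1, 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            gap = abs(labels[mask].mean() - scores[mask].mean())
            ece += mask.mean() * gap  # weight each bin by its share of samples
    return ece

# Illustrative comparison: both groups' scores are perfectly calibrated by
# construction, yet the small group tends to show a larger estimated ECE.
rng = np.random.default_rng(0)
for group, n in [("large group", 5000), ("small group", 200)]:
    p = rng.uniform(0, 1, n)   # predicted risk scores
    y = rng.binomial(1, p)     # outcomes drawn from those scores
    print(group, round(binned_ece(p, y), 4))
```

Because both groups are calibrated by construction, any nonzero ECE here is pure estimation error, which is why a debiased metric is needed before attributing a groupwise gap to unfairness.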
ID: 359976468