Modernising Receiver Operating Characteristic (ROC) Curves †
2023 (English) In: Algorithms, E-ISSN 1999-4893, Vol. 16, no 5, article id 253. Article in journal (Refereed) Published
Abstract [en]
The justification for making a measurement can be sought by asking what decisions are based on it, for example when assessing whether a quality characteristic of an entity complies with a specification limit, SL. The relative performance of testing devices and classification algorithms used in assessing compliance is often evaluated using the venerable and ever-popular receiver operating characteristic (ROC). However, the ROC tool potentially has all the limitations of classical test theory (CTT), such as non-linearity, the effects of ordinality, and the confounding of task difficulty with instrument ability. These limitations, inherent and often unacknowledged when using the ROC tool, are tackled here for the first time with a modernised approach combining measurement system analysis (MSA) and item response theory (IRT), using data from pregnancy testing as an example. The new method assesses device ability from separate Rasch IRT regressions for each axis of the ROC curves. It is found to perform significantly better: its correlation coefficients with traditional area-under-curve metrics are at least 0.92, exceeding those of linearised ROC plots such as Linacre’s, and it is recommended to replace other approaches for device assessment. The resulting improved measurement quality of each ROC curve achieved with this original approach should enable more reliable decision-making in conformity assessment in many scenarios, including machine learning, where the ROC has become almost indispensable as a metric for assessing classification algorithms.
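As a rough illustration of the quantities the abstract refers to, the sketch below computes an empirical ROC curve, its area under the curve (AUC), and the log-odds (logit) of each axis, the scale on which a Rasch model is linear. It is not the paper's MSA/IRT procedure; the data, function names and the clipping constant `eps` are hypothetical and chosen only for illustration.

```python
import math

def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per decision threshold, for binary labels."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; an item is called positive if score >= t.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def logit(p, eps=1e-6):
    """Log-odds of a rate, clipped away from 0 and 1 so the result stays finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

# Hypothetical device readings (scores) and true states (1 = condition present).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
print("AUC:", auc(points))

# Each axis is transformed separately onto a log-odds scale.
for fpr, tpr in points:
    print(f"FPR={fpr:.2f} TPR={tpr:.2f} "
          f"logit(FPR)={logit(fpr):+.2f} logit(TPR)={logit(tpr):+.2f}")
```

Plotting logit(TPR) against logit(FPR) gives one kind of linearised ROC plot; the abstract reports that fitting separate Rasch regressions to each axis agrees more closely with AUC than such plots do.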
Place, publisher, year, edition, pages
MDPI, 2023. Vol. 16, no 5, article id 253
Keywords [en]
decision risks, measurement system analysis, ordinality, rating ability, receiver operating characteristic, Machine learning, Risk assessment, Systems analysis, Classification algorithm, Decision risk, Item response theory, Measurement systems analysis, Quality characteristic, Rating abilities, Receiver operating characteristic curves, Receiver operating characteristics, Specification limit, Decision making
National Category
Civil Engineering
Identifiers
URN: urn:nbn:se:ri:diva-64945; DOI: 10.3390/a16050253; Scopus ID: 2-s2.0-85160212265; OAI: oai:DiVA.org:ri-64945; DiVA, id: diva2:1765883
Note Correspondence Address: Pendrill, L.R.; RISE Measurement Science and Technology, Sweden; email: leslie.pendrill@ri.se; Funding details: Horizon 2020 Framework Programme, H2020; Funding details: European Metrology Programme for Innovation and Research, EMPIR; Funding text 1: Part of the work reported has also been part of the 15HLT04 NeuroMET and 18HLT09 NeuroMET2 projects which received funding (2016–2022) from the EMPIR programme co-financed by the Participating States and from the European Union’s Horizon 2020 research and innovation programme. Hence, we would like to express our great appreciation to collaborators and partners for our valuable and constructive work together.
2023-06-12 2023-06-12 2023-06-12 Bibliographically approved