Performance analysis of out-of-distribution detection on trained neural networks
2020 (English). In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025, article id 106409. Article in journal (Refereed), Published
Abstract [en]
Context: Deep neural networks (DNNs) have shown great promise in various domains, for example to support pattern recognition in medical imagery. However, DNNs need to be tested for robustness before being deployed in safety-critical applications. One common challenge occurs when the model is exposed to data samples outside the training data domain, which can yield high-confidence outputs even though the model has no prior knowledge of the given input.
Objective: The aim of this paper is to investigate how the performance of detecting out-of-distribution (OOD) samples changes for outlier detection methods (e.g., supervisors) when DNNs become better on training samples.
Method: Supervisors are components that aim to detect out-of-distribution samples for a DNN. The experimental setup in this work compares the performance of supervisors using metrics and datasets that reflect the most common setups in related work. Four different DNNs with three different supervisors are compared at different stages of training, to detect at what point during training the performance of the supervisors begins to deteriorate.
Results: The outlier detection performance of the supervisors increased as the accuracy of the underlying DNN improved. However, all supervisors showed a large variation in performance, even for variations of network parameters that only marginally changed the model accuracy. The results showed that understanding the relationship between training results and supervisor performance is crucial to improving a model's robustness.
Conclusion: Analyzing DNNs for robustness is a challenging task. The results showed that variations in model parameters that cause only small variations in model predictions can have a large impact on out-of-distribution detection performance. This kind of behavior needs to be addressed when DNNs are part of a safety-critical application, and hence the necessary safety argumentation for such systems needs to be structured accordingly.
Place, publisher, year, edition, pages
Elsevier B.V., 2020. Article id 106409
Keywords [en]
Automotive perception, Deep neural networks, Out-of-distribution, Robustness, Safety-critical systems, Anomaly detection, Data handling, Pattern recognition, Safety engineering, Statistics, Supervisory personnel, Detection performance, Different stages, Model parameters, Network parameters, Performance analysis, Safety critical applications, Small variations, Trained neural networks, Neural networks
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-48930
DOI: 10.1016/j.infsof.2020.106409
Scopus ID: 2-s2.0-85090999955
OAI: oai:DiVA.org:ri-48930
DiVA, id: diva2:1475783
Note
Funding details: Fellowships Fund Incorporated, FFI; Funding details: VINNOVA; Funding details: 2017–03066; Funding details: Knut och Alice Wallenbergs Stiftelse; Funding text 1: This work was carried out within the SMILE II project financed by Vinnova, FFI, Fordonsstrategisk forskning och innovation, under grant number 2017–03066, and was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP), funded by the Knut and Alice Wallenberg Foundation.
2020-10-13; 2020-10-13; 2020-12-01. Bibliographically approved