Labelling of Annotated Condition Monitoring Data Through Technical Language Processing
2023 (English). Conference paper, Published paper (Refereed)
Abstract [en]
We propose a novel approach, technical language labelling, to facilitate supervised intelligent fault diagnosis on unlabelled but annotated industry datasets using technical language processing. Condition monitoring (CM) is vital for high safety and resource efficiency in the green transition and digital transformation of the process industry. Computerised maintenance systems are required to facilitate CM scalability, and learning-based Intelligent Fault Diagnosis (IFD) methods are required to automate maintenance decisions and improve support for human analysts. A major challenge is the lack of labelled datasets from industry and the difficulty of transferring features from labelled lab datasets to unlabelled industry datasets. In this study, we investigate how the fault description annotations and maintenance work orders present in many CM datasets can be understood and used for IFD through Technical Language Processing, based on insights from recent advances in Natural Language Supervision joint pre-training of images and captions. We identify two distinct pipelines, one based on pre-training on large datasets, and one based on a human-centric approach and unsupervised clustering methods to transform annotations into labels, aided by insights from dimensionality reduction and visualisation techniques. Finally, we showcase one example of the small-data fault classification implementation on a CM industry dataset with a Sentence BERT model and conventional signal processing methods. Sets of features are used to overcome data imbalance and label misalignment, and we show that our model can separate sets of cable and sensor fault recordings from sets of bearing-related fault recordings with an F1-score of 92.6%. To our knowledge, this is the first system to create labels for CM data through pre-trained language models without requiring pre-defined taxonomies.
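The step of turning free-text annotations into labels through unsupervised clustering can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words vector stands in for the Sentence BERT embedding, a greedy similarity threshold stands in for the clustering method, and the function names, the threshold value, and the example annotations are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Bag-of-words stand-in for a sentence embedding (assumption: NOT Sentence BERT)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_annotations(annotations, threshold=0.3):
    """Greedy single-pass clustering: each annotation joins the most similar
    existing cluster if the similarity reaches the (hypothetical) threshold,
    otherwise it starts a new cluster. The cluster index serves as the label."""
    centroids = []  # summed word counts per cluster
    labels = []
    for text in annotations:
        vec = embed(text)
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)  # fold the annotation into the cluster
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


# Hypothetical annotations, invented for illustration only.
annotations = [
    "cable fault on sensor",
    "sensor cable replaced",
    "bearing damage outer race",
    "outer race bearing fault",
]
labels = cluster_annotations(annotations)
print(labels)
```

In this toy input the two cable and sensor annotations share one cluster index and the two bearing annotations share another, so the indices can act as class labels for the associated recordings. A pre-trained sentence encoder would replace `embed` in a realistic setting.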
Place, publisher, year, edition, pages
Prognostics and Health Management Society, 2023. Vol. 15, no. 1
Keywords [en]
Accident prevention; Classification (of information); Failure analysis; Fault detection; Large dataset; Maintenance; Natural language processing systems; Signal processing; Condition-monitoring data; Fault recording; Green transitions; High safety; Intelligent fault diagnosis; Labelings; Language processing; Pre-training; Resource efficiencies; Technical languages; Condition monitoring
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-69278; Scopus ID: 2-s2.0-85178380051; OAI: oai:DiVA.org:ri-69278; DiVA, id: diva2:1826264
Conference
15th Annual Conference of the Prognostics and Health Management Society, PHM 2023. Salt Lake City, USA. 28 October 2023 through 2 November 2023
Note
This work is supported by the Strategic innovation program Process industrial IT and Automation (PiIA), a joint investment of Vinnova, Formas and the Swedish Energy Agency, reference number 2019-02533.
2024-01-11; 2024-01-11; 2025-09-23. Bibliographically approved