Labelling of Annotated Condition Monitoring Data Through Technical Language Processing
Luleå University of Technology, Sweden.
SKF, Netherlands.
SKF, Netherlands.
RISE Research Institutes of Sweden, Digital Systems, Data Science. ORCID iD: 0000-0002-7873-3971
2023 (English). Conference paper, Published paper (Refereed)
Abstract [en]

We propose a novel approach, technical language labelling, to facilitate supervised intelligent fault diagnosis on unlabelled but annotated industry datasets using technical language processing. Condition monitoring (CM) is vital for high safety and resource efficiency in the green transition and digital transformation of the process industry. Computerised maintenance systems are required to facilitate CM scalability, and learning-based Intelligent Fault Diagnosis (IFD) methods are required to automate maintenance decisions and improve support for human analysts. A major challenge is the lack of labelled datasets from industry and the difficulty of transferring features from labelled lab datasets to unlabelled industry datasets. In this study, we investigate how the fault description annotations and maintenance work orders present in many CM datasets can be understood and used for IFD through Technical Language Processing, based on insights from recent advances in Natural Language Supervision joint pre-training of images and captions. We identify two distinct pipelines, one based on pre-training on large datasets, and one based on a human-centric approach and unsupervised clustering methods to transform annotations into labels, aided by insights from dimensionality reduction and visualisation techniques. Finally, we showcase one example of the small-data fault classification implementation on a CM industry dataset with a Sentence BERT model and conventional signal processing methods. Sets of features are used to overcome data imbalance and label misalignment, and we show that our model can separate sets of cable and sensor fault recordings from sets of bearing-related fault recordings with an F1-score of 92.6%. To our knowledge, this is the first system to create labels for CM data through pre-trained language models without requiring pre-defined taxonomies. 
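
The annotation-to-label step of the second pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names a Sentence BERT model and unsupervised clustering, whereas this sketch substitutes a bag-of-words vector for the sentence embedding and a greedy similarity-threshold clustering so that it runs without external libraries or model downloads. The function names, the `threshold` value, and the example annotations are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: bag-of-words counts.

    Assumption: a real pipeline would use a sentence encoder such as
    Sentence BERT here instead of word counts.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_annotations(annotations, threshold=0.3):
    """Greedy clustering of free-text annotations into integer labels.

    Each annotation joins the most similar existing cluster if the
    similarity reaches `threshold`; otherwise it opens a new cluster.
    """
    centroids = []  # one summed count vector per cluster
    labels = []
    for text in annotations:
        vec = embed(text)
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best] = centroids[best] + vec  # update cluster centroid
            labels.append(best)
        else:
            centroids.append(vec)
            labels.append(len(centroids) - 1)
    return labels


# Hypothetical fault annotations, invented for illustration only.
notes = [
    "bearing outer race damage detected",
    "sensor cable fault replaced cable",
    "bearing damage on outer race",
    "cable fault sensor not responding",
]
labels = cluster_annotations(notes)
print(labels)  # -> [0, 1, 0, 1]
```

In this toy run the bearing-related notes and the cable and sensor notes fall into separate clusters, which an analyst could then name and use as class labels for the associated recordings.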

Place, publisher, year, edition, pages
Prognostics and Health Management Society, 2023. Vol. 15, no 1
Keywords [en]
Accident prevention; Classification (of information); Failure analysis; Fault detection; Large dataset; Maintenance; Natural language processing systems; Signal processing; Condition-monitoring data; Fault recording; Green transitions; High safety; Intelligent fault diagnosis; Labelings; Language processing; Pre-training; Resource efficiencies; Technical languages; Condition monitoring
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-69278
Scopus ID: 2-s2.0-85178380051
OAI: oai:DiVA.org:ri-69278
DiVA, id: diva2:1826264
Conference
15th Annual Conference of the Prognostics and Health Management Society, PHM 2023. Salt Lake City, USA. 28 October 2023 through 2 November 2023
Note

This work is supported by the Strategic innovation program Process industrial IT and Automation (PiIA), a joint investment of Vinnova, Formas and the Swedish Energy Agency, reference number 2019-02533. T

Available from: 2024-01-11. Created: 2024-01-11. Last updated: 2025-09-23. Bibliographically approved.

Open Access in DiVA

Full text (11246 kB), 147 downloads
File information
File name: FULLTEXT01.pdf
File size: 11246 kB
Checksum (SHA-512): 84b7e14fae11bc63ef1febe9cfc6ecb56d56a6059b4752dc0bf0790cb1e9d29c7b940c033e6a6cc419da41e4e2b8de315cc194e98399273830be6b3e4934ff69
Type: fulltext
Mimetype: application/pdf

Authority records

Nivre, Joakim
