In this paper, we present a novel maintenance concept based on condition monitoring and dynamic maintenance packaging, showing how to connect the information flow from low-level sensors to high-level operations and planning under uncertainty. Today, condition-based maintenance systems focus on data collection and custom-made rule-based systems for data analysis. In many cases, the emphasis is on measuring "everything" without considering how to use the measurements. In addition, the measurements are often noisy and the future is unpredictable, which introduces considerable uncertainty. As a consequence, maintenance is often planned in advance and not replanned when new condition data becomes available, which reduces the benefits of condition monitoring. The concept is based on combining robust, dynamically adapted maintenance optimization with statistical data analysis in which the uncertainty is explicitly considered. This approach ties together low-level data acquisition and high-level planning and optimization. The concept is illustrated in the context of rail vehicle maintenance, where measurements of brake pad and pantograph contact strip wear are used to predict the near-future condition and to plan the maintenance activities.
The problem of finding efficient maintenance and inspection schemes for components with stochastic lifetimes is studied, and a mixed integer programming solution is proposed. The problem is compared with the two simpler problems of which it is a generalisation: the opportunistic replacement problem, which assumes components with deterministic lifetimes, and the opportunistic replacement problem for components with stochastic lifetimes, which considers maintenance schemes without inspections.
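To make the setting concrete, the following is a minimal sketch of the simplest of these problems, the deterministic opportunistic replacement problem, written as a mixed integer program with the PuLP library. The lifetimes, costs, horizon, and variable names are illustrative assumptions, not the paper's exact formulation; the stochastic, inspection-aware generalisation would extend this model with inspection decisions and uncertain lifetimes.

```python
# A minimal sketch of the deterministic opportunistic replacement problem as a
# mixed integer program (illustrative data, assumed PuLP solver setup).
import pulp

T = {"A": 3, "B": 5}        # deterministic component lifetimes (time steps)
c = {"A": 10.0, "B": 25.0}  # replacement cost per component
d = 40.0                    # fixed cost for opening a maintenance occasion
H = 10                      # planning horizon

prob = pulp.LpProblem("opportunistic_replacement", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(i, t) for i in T for t in range(1, H + 1)], cat="Binary")
z = pulp.LpVariable.dicts("z", range(1, H + 1), cat="Binary")

# Objective: replacement costs plus set-up costs for each opened occasion.
prob += pulp.lpSum(c[i] * x[i, t] for i in T for t in range(1, H + 1)) \
      + pulp.lpSum(d * z[t] for t in range(1, H + 1))

for i in T:
    # A replacement of component i at time t requires an occasion at t.
    for t in range(1, H + 1):
        prob += x[i, t] <= z[t]
    # Components are new at t = 0, so every window of T[i] steps must
    # contain at least one replacement of component i.
    for s in range(0, H - T[i] + 1):
        prob += pulp.lpSum(x[i, t] for t in range(s + 1, s + T[i] + 1)) >= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
plan = {t: [i for i in T if pulp.value(x[i, t]) > 0.5] for t in range(1, H + 1)}
print({t: comps for t, comps in plan.items() if comps})
```

The coupling constraint x[i, t] <= z[t] is what makes the problem "opportunistic": once the set-up cost is paid at an occasion, replacing further components there is comparatively cheap.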
We have developed a method for statistical anomaly detection which has been deployed in a tool for condition monitoring of train fleets. The tool is currently used by several railway operators around the world to inspect and visualize the occurrence of "event messages" generated on the trains. The anomaly detection component helps the operators quickly find significant deviations from normal behavior and detect early indications of possible problems. The savings in maintenance costs come mainly from avoiding costly breakdowns, and have been estimated at several million euros per year for the tool. In the long run, maintenance costs are expected to be reduced by between 5 and 10% by using the tool.
Most existing work in information fusion focuses on combining information with well-defined meaning towards a concrete, pre-specified goal. In contradistinction, we instead aim for autonomous discovery of high-level knowledge from ubiquitous data streams. This paper introduces a method for recognition and tracking of hidden conceptual modes, which are essential to fully understand the operation of complex environments, and an important step towards building truly intelligent aware systems. We consider a scenario of analyzing usage of a fleet of city buses, where the objective is to automatically discover and track modes such as highway route, heavy traffic, or aggressive driver, based on available on-board signals. The method we propose is based on aggregating the data over time, since the high-level modes are only apparent in the longer perspective. We search through different features and subsets of the data, and identify those that lead to good clusterings, interpreting those clusters as initial, rough models of the prospective modes. We utilize Bayesian tracking in order to continuously improve the parameters of those models, based on the new data, while at the same time following how the modes evolve over time. Experiments with artificial data of varying degrees of complexity, as well as on real-world datasets, prove the effectiveness of the proposed method in accurately discovering the modes and in identifying which one best explains the current observations from multiple data streams.
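As a rough illustration of the tracking step, the sketch below maintains a Bayesian belief over a small set of hidden modes, each modelled as a Gaussian over a windowed feature. The mode names, parameters, and the sticky transition model are illustrative assumptions, not the paper's actual models.

```python
# A minimal sketch of Bayesian tracking over discovered modes: predict with a
# sticky transition model, then correct with the likelihood of the new window.
import numpy as np
from scipy import stats

# Rough mode models from an initial clustering step (mean, std of a feature).
modes = {"highway": (90.0, 5.0), "city": (30.0, 8.0), "heavy_traffic": (10.0, 4.0)}
names = list(modes)
belief = np.ones(len(names)) / len(names)       # uniform prior over modes
stay = 0.9                                      # probability the mode persists
trans = stay * np.eye(len(names)) \
      + (1 - stay) / (len(names) - 1) * (1 - np.eye(len(names)))

def update(belief, obs):
    """One tracking step: predict (transition), then correct (likelihood)."""
    pred = trans.T @ belief
    like = np.array([stats.norm(m, s).pdf(obs) for m, s in modes.values()])
    post = like * pred
    return post / post.sum()

for obs in [88, 91, 60, 32, 28, 12, 9]:         # windowed mean speed, km/h
    belief = update(belief, obs)
    print(f"obs={obs:3d} -> {names[int(np.argmax(belief))]:14s} {belief.round(2)}")
```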
This report concerns the "ISC-tool", a tool for classification of patterns and detection of anomalous patterns, where a pattern is a set of values. The tool has a graphical user interface, "the anomalo-meter", that shows the degree of anomaly of a pattern and how it is classified. The report describes the user interaction with the tool and the underlying statistical methods, which are essentially Bayesian inference for finding expected or "predictive" distributions for clusters of patterns, and the use of these distributions for classifying a new pattern and assigning it a degree of anomaly. The report also briefly discusses which methods are in general appropriate for clustering and anomaly detection. The project has been supported by SSF via the Butler2 programme.
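The following is a minimal sketch of the core mechanism under one common set of assumptions (one-dimensional Gaussian clusters with a vague conjugate prior, giving Student-t predictive distributions); the actual ISC-tool model may differ in its details.

```python
# A minimal sketch: fit a predictive distribution per cluster, classify a new
# pattern to the cluster with highest predictive density, and score its degree
# of anomaly from the tail mass under that cluster's predictive distribution.
import numpy as np
from scipy import stats

def predictive(cluster):
    """Posterior predictive (Student-t) for 1-D data under a vague prior."""
    n, m, s = len(cluster), cluster.mean(), cluster.std(ddof=1)
    return stats.t(df=n - 1, loc=m, scale=s * np.sqrt(1 + 1 / n))

clusters = [np.random.default_rng(i).normal(i * 5, 1, 50) for i in range(3)]
x = 7.2
dens = [predictive(c).pdf(x) for c in clusters]
best = int(np.argmax(dens))                 # classification
p = predictive(clusters[best])
tail = 2 * min(p.cdf(x), 1 - p.cdf(x))      # two-sided tail probability
print(f"classified to cluster {best}, anomaly degree {1 - tail:.3f}")
```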
TIME stands for Train Information Management Environment. TIME is a proposed overarching information system for the railway. Important aspects of TIME are the design of a platform for communication between the actors in the railway transport industry, and for information exchange between vehicles and fixed-location systems. TIME concerns all parts of an information system: how data is produced and processed, the infrastructure for information, principles for data storage and information exchange, and functions and services based on this information. TIME is intended, for example, to contribute to making cooperation between the actors of the railway transport industry work well, to making these actors' own operations efficient, and to giving railway customers and others who depend on the railway the right information.
As part of the project DUST financed by Vinnova, we have investigated whether event data generated on trains can be used for finding evidence of wear on train doors. We have compared the event data and maintenance reports relating to doors of Regina trains. Although some interesting relations were found, the overall result is that the information in event data about wear of doors is very limited.
Background: Understanding temporal patterns of organ dysfunction (OD) may aid early recognition of complications after trauma and assist timing and modality of treatment strategies. Our aim was to analyse and characterise temporal patterns of OD in intensive care unit-admitted trauma patients. Methods: We used group-based trajectory modelling to identify temporal trajectories of OD after trauma. Modelling was based on the joint development of all six subdomains comprising the sequential organ failure assessment score measured daily during the first two weeks post trauma. Further, the time for trajectories to stabilise and transition to final group assignments was evaluated. Results: Six hundred and sixty patients were included in the final model. Median age was 40 years, and median ISS was 26 (IQR 17–38). We identified five distinct trajectories of OD. Group 1, mild OD (n = 300), median ISS of 20 (IQR 14–27), had an early resolution of OD and a low mortality. Group 2, moderate OD (n = 135), and group 3, severe OD (n = 87), were fairly similar in admission characteristics and initial OD but differed in subsequent OD trajectories, the latter experiencing an extended course and higher mortality. In group 3, 56% of the patients developed sepsis as compared with 19% in group 2. Group 4, extreme OD (n = 40), received the most blood transfusions, had the highest proportion of shock at admission and a median ISS of 41 (IQR 29–50). They experienced significant and sustained OD affecting all organ systems and a 28-day mortality of 30%. Group 5, traumatic brain injury with OD (n = 98), had the highest mortality of 35% and the shortest time to death for non-survivors, median 3.5 (IQR 2.4–4.8) days. Groups 1 and 5 reached their final group assignment early, with > 80% of the patients assigned within 48 h. In contrast, groups 2 and 3 had a prolonged time to final group assignment. Conclusions: We identified five distinct trajectories of OD after severe trauma during the first two weeks post-trauma. Our findings underline the heterogeneous course after trauma and describe some potentially important clinical insights suggested by the groupings and temporal trajectories.
Linear potential flow (LPF) models remain the tools of the trade in marine and ocean engineering despite their well-known assumptions of small amplitude waves and motions. As of now, nonlinear simulation tools are still too computationally demanding to be used in the entire design loop, especially when it comes to the evaluation of numerous irregular sea states. In this paper we aim to enhance the performance of LPF models by introducing a hybrid LPF-ML (machine learning) approach, based on identification of nonlinear force corrections. The corrections are defined as the difference in hydrodynamic force (viscous and pressure-based) between high-fidelity CFD and LPF models. Using prescribed chirp motions with different amplitudes, we train a long short-term memory (LSTM) network to predict the corrections. The LSTM network is then linked to the MoodyMarine LPF model to provide the nonlinear correction force at every time step, based on the dynamic state of the body and the corresponding forces from the LPF model. The method is illustrated for the case of a heaving sphere in decay, regular, and irregular waves, including passive control. The hybrid LPF model is shown to give significant improvements compared to the baseline LPF model, even though the training is quite generic.
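As a rough sketch of the learning component, the following trains a small LSTM to map the body's dynamic state and a stand-in force signal to a nonlinear correction force. The network size, input signals, and the synthetic chirp-like data are illustrative assumptions; in the paper the targets are CFD-minus-LPF force differences obtained from prescribed chirp motions.

```python
# A minimal sketch of training an LSTM force corrector on a synthetic
# chirp-like motion with a toy cubic (drag-like) nonlinearity as target.
import torch
import torch.nn as nn

class ForceCorrector(nn.Module):
    def __init__(self, n_in=3, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_in, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):      # x: (batch, time, [position, velocity, F_lpf])
        h, _ = self.lstm(x)
        return self.head(h)    # predicted correction force per time step

t = torch.linspace(0, 10, 500)
state = torch.stack([torch.sin(t * t / 5), torch.cos(t * t / 5), torch.sin(t)], -1)
target = (0.1 * state[:, 0] ** 3).unsqueeze(-1)   # stand-in for CFD minus LPF

model = ForceCorrector()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(state.unsqueeze(0)), target.unsqueeze(0))
    loss.backward()
    opt.step()
print("final MSE:", float(loss))
```

At simulation time, the trained network would be evaluated once per time step on the current state and LPF force, and its output added to the LPF force before integrating the equations of motion.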
Numerical models used in the design of floating bodies routinely rely on linear hydrodynamics. Extensions for hydrodynamic nonlinearities can be approximated using e.g. Morison-type drag and nonlinear Froude-Krylov forces. This paper aims to improve the approximation of nonlinear forces acting on floating bodies by using machine learning (ML). Many ML models are general function approximators and are therefore suitable for representing such nonlinear correction terms. A hierarchical modelling approach is used to build mappings between higher-fidelity simulations and the linear method. The ML corrections are built up for FNPF, Euler, and RANS simulations. Results for decay tests of a sphere in model scale using recurrent neural networks (RNNs) are presented. The RNN algorithm is shown to satisfactorily predict the correction terms if the most nonlinear case is used as training data. No difference in the performance of the RNN model is seen across the different hydrodynamic models.
We present a hybrid linear potential flow - machine learning (LPF-ML) model for simulating weakly nonlinear wave-body interaction problems. In this paper we focus on using hierarchical modelling for generating training data to be used with recurrent neural networks (RNNs) in order to derive nonlinear correction forces. Three different approaches are investigated: (i) a baseline method where data from a Reynolds averaged Navier Stokes (RANS) model is directly linked to data from a LPF model to generate nonlinear corrections; (ii) an approach in which we start from high-fidelity RANS simulations and build the nonlinear corrections by stepping down in the fidelity hierarchy; and (iii) a method starting from low fidelity and successively moving up the fidelity staircase. The three approaches are evaluated for the simple test case of a heaving sphere. The results show that the baseline model performs best, as expected for this simple test case. Stepping up in the fidelity hierarchy easily introduces errors that propagate through the hierarchical modelling via the correction forces. The baseline method was found to accurately predict the motion of the heaving sphere. The hierarchical approaches struggled with the task, with the approach that steps down in fidelity performing the better of the two.
Dependency derivation and the creation of dependency graphs are critical tasks for increasing the understanding of an industrial process. However, the most commonly used correlation measures are often not appropriate for finding correlations between time series. We present a measure that addresses some of these problems.
This report describes the gmdl modeling and analysis environment. gmdl was designed to provide powerful data analysis, modeling, and visualization with simple, clear semantics and easy-to-use, well-defined syntactic conventions. It provides an extensive set of functions necessary for general data preparation, analysis, and modeling tasks.
We explore the possibility of replacing a first-principles process simulator with a learning system. In the presented test case, this is motivated by the need to speed up a simulator that is to be used in conjunction with an optimisation algorithm to find near-optimal process parameters. We discuss the potential problems and difficulties in this application and how to solve them, and present the results from a paper mill test case.
In many diagnosis situations it is desirable to perform classification in an iterative and interactive manner. Not all relevant information may be available initially; some must be acquired manually or at a cost. The matter is often complicated by very limited amounts of knowledge and examples when a new system to be diagnosed is first brought into use. Here, we describe how to create an incremental classification system based on a statistical model trained from empirical data, and show how the limited available background information can still be used initially to obtain a functioning diagnosis system.
We describe a novel incremental diagnostic system based on a statistical model that is trained from empirical data. The system guides the user by calculating which additional information would be most helpful for the diagnosis. We show that our diagnostic system can produce satisfactory classification rates using only small amounts of available background information, so that the need to collect vast quantities of initial training data is reduced. Further, we show that incorporating inconsistency-checking mechanisms in the diagnostic system reduces the number of incorrect diagnoses caused by erroneous input.
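The guidance step can be illustrated with a small value-of-information calculation: under an assumed naive Bayes model, the system asks for the feature whose answer is expected to reduce the diagnosis entropy the most. All probability tables below are illustrative, not taken from the papers.

```python
# A minimal sketch of "which question next": pick the unobserved binary feature
# with the largest expected reduction in diagnosis entropy (information gain).
import numpy as np

P_d = np.array([0.5, 0.3, 0.2])          # prior over 3 diagnoses
# P(feature f = 1 | diagnosis): rows = features, columns = diagnoses.
P_f = np.array([[0.9, 0.2, 0.5],
                [0.4, 0.8, 0.1],
                [0.5, 0.5, 0.5]])        # feature 2 is uninformative

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_entropy(f, prior):
    """Expected posterior entropy after observing binary feature f."""
    h = 0.0
    for v in (1, 0):
        like = P_f[f] if v == 1 else 1 - P_f[f]
        evidence = like @ prior
        post = like * prior / evidence
        h += evidence * entropy(post)
    return h

gains = [entropy(P_d) - expected_entropy(f, P_d) for f in range(len(P_f))]
print("information gain per question:", np.round(gains, 3))
print("ask feature", int(np.argmax(gains)), "next")
```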
As part of the Vinnova-funded project DUST, we have investigated how Bayesian statistical modelling and anomaly detection can be used to analyse wear on wheel profiles and brake pads on trains. We show how this analysis can be used to filter the data, detect abnormal wear, and predict when maintenance is due. The results show that the proposed methods work very well for analysing the kind of trended time series data involved here, and that quite a lot of information can be extracted even though the data are relatively scarce and noisy.
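A minimal sketch of the idea follows, assuming a conjugate Bayesian linear trend model with a known noise level; the report's actual models may differ.

```python
# A minimal sketch: fit a Bayesian linear trend to noisy wear measurements and
# predict when the wear crosses a maintenance limit (illustrative values).
import numpy as np

rng = np.random.default_rng(1)
days = np.arange(0, 120, 7.0)
wear = 0.05 * days + rng.normal(0, 0.4, days.size)    # noisy linear wear, mm

# Bayesian linear regression with a vague Gaussian prior on [intercept, slope].
X = np.c_[np.ones_like(days), days]
prior_prec, noise_var = 1e-4 * np.eye(2), 0.4 ** 2
S = np.linalg.inv(prior_prec + X.T @ X / noise_var)   # posterior covariance
m = S @ (X.T @ wear / noise_var)                      # posterior mean

limit = 8.0                                           # maintenance limit, mm
# Sample trend lines from the posterior to get a distribution of crossing times.
samples = rng.multivariate_normal(m, S, 2000)
crossings = (limit - samples[:, 0]) / samples[:, 1]
print(f"limit reached around day {np.median(crossings):.0f} "
      f"(90% interval {np.percentile(crossings, 5):.0f}"
      f"-{np.percentile(crossings, 95):.0f})")
```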
In this report we illustrate how a number of data analysis methods can be used to monitor data from a sensor network. The analysis takes the form of visualization, dependency analysis, and anomaly detection. The sensor network is monitored with respect to both the measurements made by the sensors and the operation of the network itself.
The DALLAS ("application of Data AnaLysis with LeArning Systems") project was designed to bring together groups using learning systems (e.g. artificial neural networks, non-linear multivariate statistics, inductive logic, etc.) at five universities and research institutes with seven companies with data analysis tasks from various industrial sectors in Sweden. One objective of the project has been to spread knowledge and the use of learning systems methods for data analysis in industry. Further objectives have been to test the methods on real-world problems in order to find strengths and weaknesses in the methods, and to inspire research in the area.
In this paper we describe how Bayesian Principal Anomaly Detection (BPAD) can be used for detecting long- and short-term trends and anomalies in geographically tagged alarm data. We elaborate on how the detection of such deviations can be used for highlighting suspected criminal behavior and activities. BPAD has previously been successfully deployed and evaluated in several similar domains, including Maritime Domain Awareness, Train Fleet Maintenance, and Alarm Filtering. As in those applications, we argue that deploying BPAD in the area of crime monitoring can potentially improve the situation awareness of criminal activities, by providing automatic detection of suspicious behaviors and uncovering large-scale patterns.
The need to improve the capability to detect illegal or hazardous activities while reducing the workload of operators involved in various surveillance tasks calls for research on more capable automatic tools. To maximize their performance, these tools should be able to combine the automatic capture of normal behavior from data with domain knowledge in the form of human descriptions. In the proposed Joint Statistical and Symbolic Anomaly Detection System, statistical and symbolic methods are tightly integrated in order to detect the majority of critical events in a situation while minimizing unwanted alerts. We exemplify the proposed system in the domain of maritime surveillance.
This paper demonstrates how to explore and visualize different types of structure in data, including clusters, anomalies, causal relations, and higher-order relations. The methods are developed with the goal of being as automatic as possible and applicable to massive, streaming, and distributed data. Finally, a decentralized learning scheme is discussed, enabling the discovery of structure in the data without collecting the data centrally.
The Regina trains, manufactured by Bombardier Transportation, contain software and hardware to generate both condition data and event data that can be used to monitor the condition of the trains. In this paper we present the necessary equations for abnormality detection for both event data and condition counters in a general setting. The use of the equations is illustrated on authentic data from Regina trains.
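A minimal sketch of one standard model of this kind: event counts treated as Poisson with a Gamma prior over the rate, so that the posterior predictive is negative binomial and abnormality can be scored as a tail probability. The prior parameters and counts below are illustrative, not taken from the paper.

```python
# A minimal sketch of abnormality scoring for event frequencies under a
# Poisson-Gamma model with a negative binomial posterior predictive.
from scipy import stats

# Fleet-level prior on the event rate (events per operating day).
alpha0, beta0 = 2.0, 4.0            # Gamma(shape, rate), mean rate = 0.5/day

def abnormality(count, days, a=alpha0, b=beta0):
    """Two-sided tail score for `count` events over `days` operating days."""
    # Predictive distribution of a new count over `days` is negative binomial.
    nb = stats.nbinom(a, b / (b + days))
    tail = min(nb.cdf(count), 1 - nb.cdf(count - 1))
    return 1 - 2 * min(tail, 0.5)   # 0 = typical, close to 1 = highly abnormal

print(abnormality(count=3, days=7))    # roughly the expected frequency
print(abnormality(count=40, days=7))   # strongly elevated frequency
```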
The Regina trains generate signals for both so-called "events" (messages about routine occurrences as well as more or less serious faults in various units) and "condition" (operation counters for various units). It is desirable to monitor both types of signals in order to detect deviations such as sharply changed frequency or operating intensity. Such deviations could signal various service needs, and it would therefore be useful if service personnel were informed of them well before they lead to more serious faults. Here we present a basic and generally applicable statistical model for this scenario. The method is evaluated on authentic data from the Regina trains.
We approach the problem of identifying and interpreting clusters over different time scales and granularities in multivariate time series data. We extract statistical features over a sliding window of each time series, and then use a Gaussian mixture model to identify clusters, which are then projected back onto the data streams. The human analyst can further analyze this projection and adjust the size of the sliding window and the number of clusters in order to capture the different types of clusters over different time scales. We demonstrate the effectiveness of our approach in two application scenarios: (1) fleet management and (2) district heating, where, in each scenario, several different types of meaningful clusters can be identified when varying over these dimensions.
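A minimal sketch of the pipeline follows, with toy data and illustrative window sizes; varying the window exposes structure at different time scales.

```python
# A minimal sketch of windowed clustering: slide a window over a series,
# extract statistical features, fit a Gaussian mixture, and project the
# cluster labels back onto the timeline.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy stream: a calm regime followed by a volatile one.
series = np.r_[rng.normal(0, 0.3, 600), rng.normal(1.5, 1.2, 600)]

def windowed_labels(series, win, k):
    starts = range(0, len(series) - win + 1, win // 2)   # 50% overlap
    feats = np.array([[series[s:s + win].mean(),
                       series[s:s + win].std(),
                       np.percentile(series[s:s + win], 90)] for s in starts])
    labels = GaussianMixture(n_components=k, random_state=0).fit_predict(feats)
    return list(zip(starts, labels))

# The analyst varies the window size to expose structure at different scales.
for win in (50, 200):
    segs = windowed_labels(series, win, k=2)
    change = next(s for s, l in segs if l != segs[0][1])
    print(f"win={win}: first regime change near t={change}")
```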
This report describes the results of a data analysis performed on production data from Outokumpu's rolling mill in Avesta. One aim has been to establish relationships between other production parameters and the occurrence of so-called "slivers", a type of superficial crack or fold formation in the finished steel. Another aim has been to study methodological questions in work of this kind.
Discovering causal relations from limited amounts of data can be useful in many applications. However, causal discovery algorithms typically need large amounts of data to estimate the underlying causal graph. To bridge this gap, this paper proposes a novel visualization tool that incrementally discovers causal relations as more data becomes available. That is, we expect that stronger causal links will be detected quickly, while weaker links are revealed only when enough data is available. In addition to the causal links, the correlation between variables and the uncertainty in the strength of the causal links are visualized in the same graph. The tool is illustrated on three example causal graphs, and the results show that incremental discovery works and that the causal structure converges as more data becomes available.
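The incremental behaviour can be illustrated with a simple stand-in test: a significance-tested Pearson correlation in place of a full causal-discovery procedure. With growing data, the strong link appears early and the weak one only later.

```python
# A minimal sketch of incremental link discovery: retest candidate links as
# more samples arrive and keep those passing a significance threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)          # strong link x -> y
z = 0.1 * x + rng.normal(size=n)          # weak link x -> z

for m in (20, 100, 500, 2000):            # growing amount of available data
    edges = []
    for name, v in (("x-y", y), ("x-z", z)):
        r, p = stats.pearsonr(x[:m], v[:m])
        if p < 0.01:
            edges.append(f"{name} (r={r:.2f})")
    print(f"n={m:4d}: detected links: {edges}")
```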
In few subjects is it as easy to talk past each other as when discussing consciousness. Not only is the subject elusive, with everyone having their own opinion of what it is all about; different people also make quite different use of words and language when discussing consciousness. This contribution tries to exemplify some common misunderstandings between people with different starting points and different uses of language. The suggestion is that 'the problem of consciousness' is after all quite similar for all of us, although this is muddled by the way we talk about it, and by the way we have locked ourselves into our different slogans and world views.
We describe a style of computing that differs from traditional numeric and symbolic computing and is suited for modeling neural networks. We focus on one aspect of "neurocomputing", namely, computing with large random patterns, or high-dimensional random vectors, and ask what kind of computing they perform and whether they can help us understand how the brain processes information and how the mind works. Rapidly developing hardware technology will soon be able to produce the massive circuits that this style of computing requires. This chapter develops a theory on which the computing could be based.
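A minimal sketch of this style of computing, using random bipolar hypervectors with elementwise binding and superposition (one common formulation; details vary across the literature):

```python
# A minimal sketch of computing with high-dimensional random vectors: random
# bipolar vectors are nearly orthogonal, binding (elementwise product)
# associates pairs, and superposition (sum + sign) stores several pairs in
# one vector that can still be queried by unbinding.
import numpy as np

rng = np.random.default_rng(0)
D = 10_000

def rand_vec():                      # random bipolar hypervector
    return rng.choice([-1, 1], D)

def sim(a, b):                       # normalized similarity
    return a @ b / D

# Encode a tiny record {color: red, shape: round} as bound role-filler pairs.
color, shape, red, round_ = rand_vec(), rand_vec(), rand_vec(), rand_vec()
record = np.sign(color * red + shape * round_)

# Query: unbinding with the role vector recovers a noisy copy of the filler.
noisy = record * color
print("sim to red:  ", round(sim(noisy, red), 2))    # clearly above chance
print("sim to round:", round(sim(noisy, round_), 2)) # near 0
```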
Maintenance and servicing of products as well as processes are pivotal for achieving high availability and avoiding catastrophic and costly failures. At the same time, maintenance is routinely performed more frequently than necessary, replacing possibly functional components, which has a negative economic impact. New processes and products need to fulfil increased environmental demands, while customers put increasing demands on customization and coordination. Hence, improved maintenance processes hold very high potential, economically as well as environmentally. The shifting demands on product development and production processes have led to the emergence of new digital solutions as well as new business models, such as integrated product-service offerings. Still, the general maintenance problem remains: how to perform the right service at the right time, given the available information and the applicable limitations. The Future Industrial Services Management (FUSE) project was a step in a long-term effort to catalyse the evolution of maintenance and production in the current digital era. In this paper, several aspects of the general maintenance problem are discussed from a data-driven perspective, spanning from technology solutions and organizational requirements to new business opportunities and how to create optimal maintenance plans. One of the main results of the project, a simulation tool for strategy selection, is also described.
Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on the modeling of likeness of meaning, and that the local structure of word spaces is where the interesting semantic relations reside. We show that the local structure of word spaces has a substantially different dimensionality and character than the global space, and that this structure shows potential to be exploited for further semantic analysis using methods for local analysis of vector space structure, rather than the globally scoped methods typically in use today, such as singular value decomposition or principal component analysis.
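As a rough illustration of the global-versus-local contrast, the sketch below builds a toy co-occurrence space and compares the number of principal components needed globally with the number needed within one word's nearest-neighbour region. The corpus, sizes, and the 90% variance criterion are illustrative assumptions, not the paper's experimental setup.

```python
# A minimal sketch comparing the effective dimensionality of a whole word
# space with that of one word's local neighbourhood.
import numpy as np

rng = np.random.default_rng(0)
# Toy term-context co-occurrence matrix: 200 words in a 500-dim context space,
# generated around a few latent topics to give it low-rank structure.
topics = rng.normal(size=(8, 500))
M = rng.dirichlet(np.ones(8), size=200) @ topics + 0.1 * rng.normal(size=(200, 500))

def effective_dim(X, var=0.9):
    """Number of principal components explaining `var` of the variance."""
    s = np.linalg.svd(X - X.mean(0), compute_uv=False) ** 2
    return int(np.searchsorted(np.cumsum(s) / s.sum(), var)) + 1

# Local neighbourhood of word 0: its 15 nearest neighbours by cosine similarity.
Xn = M / np.linalg.norm(M, axis=1, keepdims=True)
nbrs = np.argsort(Xn @ Xn[0])[::-1][:15]

print("global effective dim:", effective_dim(M))
print("local effective dim: ", effective_dim(M[nbrs]))
```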