Hybrid Methods for Fault Diagnosis and Prognostics

Participants: Daniel Jung
Project started: 2019

Background

Machine learning comprises data-driven methods in which training data is used to estimate general-purpose models for regression and classification, for example neural networks and deep learning. However, there is no systematic way of also including model information based on physical insight about the system to be modeled. Instead, machine learning methods require large data sets, representing all relevant operating scenarios, to train a model with satisfactory prediction or classification performance [domingos2012few]. Identifying and classifying new system behaviors not covered by training data is almost impossible with existing machine learning methods. Thus, the performance of data-driven models is heavily dependent on the quality of training data. However, in many industrial applications training data is limited, or it is not possible to collect data from all relevant cases in advance. One such application is fault diagnosis and prognostics.

Classification and limited training data

Conventional multi-class classifiers, as illustrated in the left figure, extrapolate the decision boundary. If training data are not representative of all data classes, a situation referred to as open set classification, it is relevant to identify data that deviate significantly from training data, both to avoid misclassifications and to identify likely unknown classes. The right figure illustrates the case where data from each class is modeled by a one-class classifier. New data can then be classified as belonging to one, many, or no class.
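The one-class idea can be illustrated with a minimal sketch. It assumes that each class is modeled as a Gaussian and that a sample is accepted when its Mahalanobis distance to the class is below a threshold. The class name GaussianOneClass, the threshold value, and the synthetic data are hypothetical choices made for illustration and are not taken from the project.

```python
import numpy as np


class GaussianOneClass:
    """One-class model: accept a sample if its Mahalanobis distance is small."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # hypothetical acceptance radius

    def fit(self, X):
        self.mean = X.mean(axis=0)
        self.cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
        return self

    def accepts(self, x):
        d = x - self.mean
        return float(np.sqrt(d @ self.cov_inv @ d)) <= self.threshold


def open_set_classify(models, x):
    """Return the labels of all classes whose model accepts x (possibly none)."""
    return [label for label, model in models.items() if model.accepts(x)]


# Synthetic training data for two known classes (illustration only).
rng = np.random.default_rng(0)
models = {
    "A": GaussianOneClass().fit(rng.normal([0.0, 0.0], 1.0, size=(500, 2))),
    "B": GaussianOneClass().fit(rng.normal([4.0, 0.0], 1.0, size=(500, 2))),
}

print(open_set_classify(models, np.array([0.0, 0.0])))   # close to class A only
print(open_set_classify(models, np.array([2.0, 0.0])))   # between the two classes
print(open_set_classify(models, np.array([0.0, 10.0])))  # far from both classes
```

An empty list corresponds to a sample that is rejected by all known classes, which is how a likely unknown class would show up in this sketch.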

Fault diagnosis and prognostics are particularly important in safety-critical systems, e.g. systems where a fault or component degradation can result in human casualties, environmental hazards, or damage to hardware. Fault diagnosis refers to the problem of identifying faulty components and prognostics the problem of predicting component degradation and remaining useful life (RUL). Fault diagnosis and prognosis systems have shown great potential in new industrial services and products, for example in decision-support systems for vehicle fleet predictive maintenance and mission planning that can be used by, for example, transportation industries or computer-aided troubleshooting in vehicle workshops.

Besides data-driven methods, another common approach for fault diagnosis and prognostics is referred to as model-based methods. These methods try to determine the system condition based on a mathematical model derived from physical insight about the system. System faults are modeled as changes in physical parameters or variables. By estimating the performance degradation of various components, prognostics methods predict trends to determine when the degradation reaches critical levels. To detect and isolate faults in model-based diagnosis, residuals are computed based on different subsets of the physical-based model. An advantage of model-based methods is that no data from faulty conditions is needed to design a residual, since it models nominal system behavior. Each residual is designed to detect faults in a certain part of the system, making it insensitive to faults in other parts. By designing multiple residuals based on different subsets of the model, it is possible to localize a faulty component by analyzing which residuals have detected the fault.

A residual is an anomaly classifier that compares sensor data with model predictions to detect inconsistencies.
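As a minimal sketch of this idea, a residual can be computed as the difference between a measured signal and a model prediction and then compared with a threshold calibrated on fault-free data. The function names, the sine-wave stand-in for the model prediction, the noise level, and the 4-sigma threshold are assumptions made for illustration.

```python
import numpy as np


def compute_residual(y_measured, y_predicted):
    """Residual: difference between sensor data and model prediction."""
    return np.asarray(y_measured) - np.asarray(y_predicted)


def calibrate_threshold(nominal_residual, n_sigma=4.0):
    """Threshold chosen from fault-free data (n_sigma is an assumed tuning value)."""
    return n_sigma * np.std(nominal_residual)


def detect(residual, threshold):
    """Flag samples where the residual is inconsistent with nominal behavior."""
    return np.abs(residual) > threshold


rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
y_model = np.sin(t)  # stand-in for the model prediction

# Fault-free data: the sensor follows the model up to measurement noise.
y_nominal = y_model + rng.normal(0.0, 0.05, t.size)
threshold = calibrate_threshold(compute_residual(y_nominal, y_model))

# New data with a sensor bias injected from sample 100 onward.
y_faulty = y_model + rng.normal(0.0, 0.05, t.size)
y_faulty[100:] += 0.5
alarms = detect(compute_residual(y_faulty, y_model), threshold)

print("alarm rate before fault:", alarms[:100].mean())
print("alarm rate after fault: ", alarms[100:].mean())
```

Note that only fault-free data is used to set the threshold, which mirrors the point that no data from faulty conditions is needed to design the residual.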

Since faults are rare events, learning data-driven models for fault diagnosis and prognostics is complicated by the lack of relevant training data. Also, new faults that were difficult to predict during the system design phase may appear several years into operation, meaning that training data has not been collected from these cases. Performance degradation patterns in particular require data from long-term system usage. Hybrid methods, combining physical-based models and training data, are key to designing machine learning algorithms that can diagnose faults and predict system degradation even though no training data has been collected from these faults.

Academic and industrial relevance

The automotive industry has shown interest in hybrid methods for fault diagnosis, e.g. combining vehicle on-board diagnosis with cloud-based big data analysis to improve fault diagnosis and prognosis performance, which is relevant for remote diagnosis and predictive maintenance.

Research in hybrid methods for fault diagnosis and prognostics has increased in recent years, see for example Prof. K. Pattipati at University of Connecticut, USA [sankavaram2009model,luo2010integrated], LARIS at Université d'Angers, France [tidriri2016bridging], and Prof. G. Biswas at Vanderbilt University, USA [khorasgani2018methodology]. Research has highlighted the benefits of bridging and combining model-based and data-driven methods [tidriri2016bridging]. There are no previous CENIIT projects related to this project using hybrid methods for fault diagnosis. Hybrid diagnosis system designs have been proposed in, for example, [sankavaram2009model,svard2013automotive]. However, there is still little focus on how to design hybrid methods in a way that takes full advantage of both physical-based models and data. During the last two years, I have focused my research on analyzing this problem, which in the last year has resulted in a number of journal publications [jung2018residual,jung2018combining,jung2017combined].

Project description

The general objective of this project is to investigate how to develop algorithms for fault diagnosis and prognostics using hybrid methods. The project covers both theoretical and application-related research questions. In many systems, even though a complete model is not available, there is often some level of knowledge about how components are connected. Incorporating physical model information into data-driven models is the key to performing fault isolation and predicting component degradation even though training data covering these cases is limited. The main focus of this project will be on analyzing neural networks and deep learning methods because of their ability to model non-linear dynamic systems.

The main approach investigated in this project is to incorporate physical insight into neural networks by using structural methods. Structural methods are a useful tool in model-based diagnosis. They use a structural representation of the model that only describes which variables are included in each equation (including which equations contain state dynamics), without considering the analytical relationships. Structural methods have made it possible to work efficiently with large-scale models, including e.g. model analysis and residual generation. The latter is performed by identifying computational sequences, which describe how to evaluate the model equations to compute a residual.
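A structural representation can be written down as a boolean incidence matrix with one row per equation and one column per unknown variable. The sketch below uses a hypothetical two-tank model. The equations, variable names, and sensor setup are assumptions made for illustration and are not the model used in the project.

```python
import numpy as np

# Hypothetical structural model of a two-tank system. Each equation lists only
# the unknown variables it contains, not the analytical relationship.
# Known signals (input u, sensors y1 and y2) are left out of the structure.
equations = {
    "e1": {"dh1", "q1"},        # dh1 = (u - q1) / A1
    "e2": {"dh2", "q1", "q2"},  # dh2 = (q1 - q2) / A2
    "e3": {"q1", "h1"},         # q1 = c1 * sqrt(h1)
    "e4": {"q2", "h2"},         # q2 = c2 * sqrt(h2)
    "e5": {"h1"},               # y1 = h1 (level sensor, tank 1)
    "e6": {"h2"},               # y2 = h2 (level sensor, tank 2)
    "e7": {"dh1", "h1"},        # dh1 = d/dt h1 (state dynamics)
    "e8": {"dh2", "h2"},        # dh2 = d/dt h2 (state dynamics)
}
unknowns = sorted(set().union(*equations.values()))

# Incidence matrix: entry (i, j) is True if equation i contains unknown j.
incidence = np.array(
    [[variable in used for variable in unknowns] for used in equations.values()]
)

# Simple count of equations minus unknowns for this small example.
redundancy = incidence.shape[0] - incidence.shape[1]

print("unknowns:", unknowns)
for name, row in zip(equations, incidence.astype(int)):
    print(name, row)
print("equations minus unknowns:", redundancy)
```

In this small example there are more equations than unknowns, which indicates redundancy that can be used to form residuals from different subsets of the equations.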

An Artificial Neural Network is a general-purpose black-box model. Thanks to its flexibility, it has been applied in many different types of applications.

The relations between input signals and state variables described in the computational sequences can help design neural networks inspired by physical insights about the system. One objective of this project is to investigate how the structural information can be used to achieve certain fault isolation properties by making the neural network insensitive to some faults. Fault isolation is an important step in predicting the degradation of a specific component, since its effects must be distinguished from those of other components. Initial investigations will focus on how structural model information should be used to design the neural network, for example how different signals and state variables should be included in the network, but also how to formulate cost functions to achieve fault isolation capabilities using only nominal training data.

When there is a set of neural networks able to isolate faults, the next objective is to use these for predicting the RUL of the different components. Physical-based prognostics estimates a physically interpretable parameter and predicts its future trend. To estimate component degradation, it is useful if the data-driven model is interpretable, so that residual outputs, or model parameters, can be mapped to a level of degradation. This is especially important for tracking the effects of multiple component degradations on the neural networks and predicting the different RULs. Thus, the objective includes how to improve interpretability of neural networks by including structural model information in the network design and how this can be used for predicting component degradation.
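The idea of predicting the future trend of a degradation parameter can be sketched as follows. The example assumes a linear trend and a known critical level. The function name, the synthetic degradation signal, and the critical level are hypothetical, and a linear trend is only one of many possible degradation models.

```python
import numpy as np


def remaining_useful_life(times, degradation, critical_level):
    """Extrapolate a fitted linear trend until it crosses the critical level."""
    slope, intercept = np.polyfit(times, degradation, deg=1)
    if slope <= 0:
        return np.inf  # no growing degradation trend detected
    t_critical = (critical_level - intercept) / slope
    return max(t_critical - times[-1], 0.0)


# Synthetic degradation estimates (illustration only): slow linear growth
# observed through noisy estimates.
rng = np.random.default_rng(0)
times = np.arange(100.0)
degradation = 0.01 * times + rng.normal(0.0, 0.01, times.size)

rul = remaining_useful_life(times, degradation, critical_level=2.0)
print("estimated RUL:", rul)
```

With the assumed slope of 0.01 per time unit, the trend reaches the critical level of 2.0 at about t = 200, so the estimate should be close to 101 time units after the last sample.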

There are often many ways to design residuals, and selecting a suitable subset is a feature selection problem. The same problem is relevant when identifying which neural network candidates should be designed. In parallel with the other tasks, it is also relevant to analyze whether it is possible to identify if the structural model is inaccurate for some parts of the system. One solution is to analyze if parts of neural networks monitoring a certain component have more uncertainties compared to others. This can be used to identify if the structural model needs to be modified to improve accuracy.

Another industrial application problem is to investigate how physical models and fleet operation data can be used for system monitoring and predictive analysis of the individual units using, for example, cloud-based services. In e.g. automotive applications, this is relevant for understanding system degradation patterns and vehicle-to-vehicle variations. The objective is to investigate how to perform prognostics of an individual vehicle by merging component degradation information estimated for that vehicle with logged operational data from a fleet of vehicles.

Ongoing Activities

The research activities are centered around three related topics:

Grey-box Recurrent Neural Networks for Fault Isolation
Open Set Fault Classification
Feature Selection With Limited Training Data

As part of this project, one PhD student, Arman Mohammadi, started in January 2021.
There have also been a number of Master's thesis projects contributing to this project: Andreas Lundgren (finished 2020), Joakim Säfdal (finished 2021), Kevin Lindström (ongoing), Ninos Baravdish (ongoing).

In May 2020, Daniel Jung also received his docent degree.

Grey-box Recurrent Neural Networks for Fault Isolation

Publications:

Jung, D. (2019, September). Isolation and Localization of Unknown Faults Using Neural Network-Based Residuals.
In Annual Conference of the PHM Society (Vol. 11, No. 1).

Jung, D. (2020). Residual Generation Using Physically-Based Grey-Box Recurrent Neural Networks For Engine Fault Diagnosis.
arXiv preprint arXiv:2008.04644.

The objective is to develop methods to design grey-box Recurrent Neural Networks (gbRNN) for prediction and residual generation using a structural representation of the system. By designing gbRNN modeling different parts of the system, the idea is to make the gbRNN robust to data from different data classes (here fault modes) without the need of training data from that class. Implementation and training of the gbRNN is done in Python/PyTorch. The design process is summarized into the following steps.

Validation of grey-box RNN by comparing model predictions and sensor data.

A simulation study considering fault diagnosis of a two-tank system has been used to illustrate and validate the proposed method. A structural model is used as a qualitative description of the system dynamics and causality between signals. A set of gbRNN was generated and trained using training data from nominal system operation.

Recurrent neural network derived from structural model of two-tank system

A two-tank simulation study is used to analyze if it is possible to isolate unknown fault classes. A structural model representing the system is used to generate a set of Recurrent Neural Network prediction models to compute residuals.
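To make the idea of a network structure derived from a structural model concrete, the sketch below shows only the forward pass of a small recurrent model in NumPy. It is not the project's PyTorch implementation. The structure matrix, the class name GreyBoxCell, the layer sizes, the Euler step, and the random (untrained) weights are all assumptions made for illustration. Each state derivative is computed by its own small network that only sees the signals the structural model says it depends on.

```python
import numpy as np

# Hypothetical structure for a two-tank model. Rows: state derivatives
# (dh1/dt, dh2/dt). Columns: signals (h1, h2, u). A 1 means "may depend on".
STRUCTURE = np.array([
    [1.0, 0.0, 1.0],  # dh1/dt depends on h1 and the inflow u
    [1.0, 1.0, 0.0],  # dh2/dt depends on h1 and h2
])


class GreyBoxCell:
    """Recurrent cell with one small network per state, masked by the structure."""

    def __init__(self, structure, hidden=8, dt=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.structure = structure
        self.dt = dt
        n_states, n_signals = structure.shape
        # Random weights stand in for trained parameters in this sketch.
        self.W1 = rng.normal(0.0, 0.5, size=(n_states, hidden, n_signals))
        self.b1 = np.zeros((n_states, hidden))
        self.W2 = rng.normal(0.0, 0.5, size=(n_states, hidden))

    def derivative(self, x, u):
        signals = np.concatenate([x, u])
        dx = np.empty(len(x))
        for i in range(len(x)):
            visible = self.structure[i] * signals  # hide unrelated signals
            dx[i] = self.W2[i] @ np.tanh(self.W1[i] @ visible + self.b1[i])
        return dx

    def simulate(self, x0, inputs):
        """Forward Euler simulation; returns the state after each input sample."""
        x = np.asarray(x0, dtype=float)
        states = []
        for u in inputs:
            x = x + self.dt * self.derivative(x, u)
            states.append(x)
        return np.array(states)


cell = GreyBoxCell(STRUCTURE)
x = np.array([1.0, 0.5])
print(cell.derivative(x, np.array([0.2])))
print(cell.derivative(x, np.array([5.0])))  # only the first entry changes
trajectory = cell.simulate(x, np.full((50, 1), 0.2))
print(trajectory.shape)
```

Because the second state derivative never sees the input u, a change in u cannot affect it directly. This kind of structural decoupling is what allows a prediction model to be insensitive to faults in parts of the system that it does not include.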

Each gbRNN is used to compute a residual to detect faults. When a fault is detected in data, it is possible to identify the root cause of the fault by matching which model equations were used to derive the structure of each gbRNN. Simulated data from different fault scenarios showed that the proposed methodology was able to not only detect but also identify the root cause of faults without the need for training data from faults.

Residual patterns of a set of RNN-based residuals

A matrix describing the model support of each residual generator.

A set of RNN-based residuals is generated from the structural model. By analyzing the model support of the residuals that deviate from nominal behavior, it is possible to identify the root cause of the fault.
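The isolation logic can be sketched with a small model support matrix. The matrix, the component names, and the single-fault assumption below are hypothetical and chosen only to illustrate the reasoning: a component is a candidate if it belongs to the model support of every residual that has triggered.

```python
import numpy as np

# Hypothetical model support: entry (i, j) is True if residual i was derived
# from the part of the model describing component j.
components = ["tank 1", "pipe", "tank 2", "sensor"]
support = np.array([
    [1, 1, 0, 0],  # residual r1
    [0, 1, 1, 0],  # residual r2
    [0, 0, 1, 1],  # residual r3
    [1, 0, 0, 1],  # residual r4
], dtype=bool)


def isolate(support, components, triggered):
    """Single-fault candidates: components in the support of all triggered residuals."""
    triggered = np.asarray(triggered, dtype=bool)
    if not triggered.any():
        return []  # no residual has reacted, so nothing to isolate
    in_all_triggered = support[triggered].all(axis=0)
    return [c for c, ok in zip(components, in_all_triggered) if ok]


print(isolate(support, components, [1, 1, 0, 0]))  # r1 and r2 triggered
print(isolate(support, components, [1, 0, 0, 0]))  # only r1 triggered
```

Residuals that have not triggered are not used to exclude candidates in this sketch, since a residual may fail to react to a small fault. No training data from faults is needed for this step, only the model support of each residual.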

The research has continued using real data from an internal combustion engine test bench. Data has been collected during transient operation from different faults, e.g. including sensor faults, leakages, and filter clogging. In this study a model of the air path through the engine is used.

Structural model of air path through the engine

Engine test bench is used for experiments.

A set of gbRNN was generated from a structural model and trained using nominal data from the test bench.

Example of a gbRNN for the engine case study

A grey-box Recurrent Neural Network designed from the engine model for residual generation.

Validation showed that the trained gbRNN prediction models were able to capture the dynamic behavior of the engine. A set of residual generators was implemented based on the set of gbRNN models.

Validation of grey-box RNN by comparing model predictions and sensor data.

In ongoing work, investigations focus on the physical interpretation of different gbRNN model structures and on understanding how different design choices, made when implementing the gbRNN from a structural model, impact e.g. training of the model, stability, and robustness with respect to data uncertainties. The proposed method is also being validated for fault isolation on other systems in collaboration with Scania.

Open Set Fault Classification

Publications:

Jung, D. (2019). Engine Fault Diagnosis Combining Model-based Residuals and Data-Driven Classifiers.
In 9th IFAC International Symposium on Advances in Automotive Control (AAC) (Vol. 52, No. 5, pp. 285-290).

Jung, D. (2020). Data-Driven Open-Set Fault Classification of Residual Data Using Bayesian Filtering.
IEEE Transactions on Control Systems Technology, 28(5), 2045-2052.

Lundgren, A., & Jung, D. (2020). Data-Driven Open Set Fault Classification and Fault Size Estimation Using Quantitative Fault Diagnosis Analysis.
arXiv preprint arXiv:2009.04756.

Lundgren, A. (2020). Data-Driven Engine Fault Classification and Severity Estimation Using Residuals and Data (Dissertation).
Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-165736

Säfdal, J. (2021). Data-Driven Engine Fault Classification and Severity Estimation Using Interpolated Fault Modes from Limited Training Data (Dissertation).
Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173916

Fault classification of technical systems can often be formulated as an open set classification problem. Here, engine fault classification is considered using residual data. Each fault class can have different realizations, including a range of fault sizes and excitation due to varying operating conditions. Small fault sizes are difficult to distinguish from nominal data, meaning that data from different fault classes are likely to overlap, resulting in classification ambiguities. In addition, it is important to identify unknown faults, since these represent new realizations of faults that are not represented in training data.

An advantage of residual data is that system dynamics are filtered out while trying to have sensitivity to faulty data. The idea is that simpler data-driven models are sufficient to model data from each fault class and that it is easier to extrapolate fault data to realizations not covered in training data.

Residual data from different fault classes and various fault sizes

Residuals evaluated on engine data collected from various faulty conditions.

Residual data are often time-series data, and it is likely that a fault is persistent, meaning that data from consecutive samples are likely to belong to the same fault class. Thus, there are advantages to taking temporal information into consideration when classifying faults to improve classification accuracy, especially for small faults.

In a Master's thesis project by Andreas Lundgren, a data-driven classifier was proposed using a distinguishability measure. Fault classes were modeled by analyzing batch data distribution and then measuring dissimilarity with respect to training data using Kullback-Leibler divergence to classify new data batches. The usefulness of the proposed method was illustrated using real data from the engine case study and the experiments showed that the modeling framework could be used for quantitative fault classification performance analysis, data-driven open set classification, but also fault size estimation.

Classification of batch data using distinguishability measure

Classification of batch data from various faults and sizes using distinguishability measure. Each plot at position (i,j) shows the probability of rejecting fault class fj when fi is the present fault, as a function of fault size. The probability of rejecting the true fault class should be low, while it should be large for all other fault classes.
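A simplified version of batch classification with the Kullback-Leibler divergence can be sketched as follows. It assumes that both the new batch and each fault class are modeled as multivariate Gaussians, for which the divergence has a closed form. The Gaussian assumption, the rejection threshold, the class labels, and the synthetic data are illustrative assumptions, not the distinguishability measure as defined in the cited work.

```python
import numpy as np


def fit_gaussian(batch):
    """Model a batch of residual samples by its mean and covariance."""
    return batch.mean(axis=0), np.cov(batch, rowvar=False)


def kl_gaussian(p, q):
    """Closed-form KL divergence KL(p || q) between two multivariate Gaussians."""
    (m0, S0), (m1, S1) = p, q
    k = len(m0)
    S1_inv = np.linalg.inv(S1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - k + logdet1 - logdet0)


def classify_batch(batch, class_models, reject_above=1.0):
    """Accept every class whose model is close to the batch distribution."""
    p = fit_gaussian(batch)
    scores = {label: kl_gaussian(p, q) for label, q in class_models.items()}
    accepted = [label for label, score in scores.items() if score <= reject_above]
    return accepted, scores


# Synthetic residual data (illustration only): fault-free class NF and fault f1.
rng = np.random.default_rng(0)
class_models = {
    "NF": fit_gaussian(rng.normal([0.0, 0.0], 1.0, size=(500, 2))),
    "f1": fit_gaussian(rng.normal([2.0, 0.0], 1.0, size=(500, 2))),
}

known_batch = rng.normal([2.0, 0.0], 1.0, size=(300, 2))
unknown_batch = rng.normal([0.0, 5.0], 1.0, size=(300, 2))

known_result, known_scores = classify_batch(known_batch, class_models)
unknown_result, unknown_scores = classify_batch(unknown_batch, class_models)
print(known_result, known_scores)
print(unknown_result, unknown_scores)
```

A batch that is rejected by all class models is treated as an unknown fault, which is the open set behavior described above.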

A different open set classification approach was developed using Bayesian filtering to exploit temporal information in residual data when classifying samples. Classification accuracy improved while unknown faults could still be identified. An advantage of the Bayesian filtering approach is better real-time capabilities compared with the distinguishability-based approach.

Bayesian filtering before classification

Classification of time-series data using Bayesian filtering.
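The benefit of using temporal information can be illustrated with a generic recursive Bayes filter over fault classes. The sketch assumes a scalar residual, Gaussian likelihoods with known means, a transition model in which the current class persists with high probability, and an extra "unknown" class with a flat likelihood. All of these modeling choices and parameter values are assumptions for illustration and are not taken from the published algorithm.

```python
import numpy as np


def gaussian_pdf(r, mean, std):
    return np.exp(-0.5 * ((r - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def bayes_filter(residuals, means, std=1.0, stay=0.99, unknown_density=0.01):
    """Recursive class posterior; the last class index represents 'unknown'."""
    means = np.asarray(means, dtype=float)
    n = len(means) + 1
    # Transition model: faults are persistent, so the class rarely changes.
    T = np.full((n, n), (1.0 - stay) / (n - 1))
    np.fill_diagonal(T, stay)
    belief = np.full(n, 1.0 / n)
    history = []
    for r in residuals:
        likelihood = np.append(gaussian_pdf(r, means, std), unknown_density)
        belief = likelihood * (T.T @ belief)  # predict, then update
        belief /= belief.sum()
        history.append(belief.copy())
    return np.array(history)


rng = np.random.default_rng(0)
class_means = [0.0, 1.0]  # class 0: fault-free, class 1: small fault f1

# A small fault that overlaps heavily with fault-free data sample by sample.
beliefs_small = bayes_filter(rng.normal(1.0, 1.0, 100), class_means)
# Data far away from both known classes.
beliefs_unknown = bayes_filter(rng.normal(8.0, 1.0, 100), class_means)

print("small fault, final belief:", beliefs_small[-1].round(3))
print("unknown data, final belief:", beliefs_unknown[-1].round(3))
```

Single samples from the small fault are ambiguous, but accumulating evidence over consecutive samples lets the belief settle on the correct class, while data that fits no known class ends up in the unknown class.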

However, both the distinguishability-based and Bayesian filter classification algorithms required training data that were representative of each fault class to accurately identify data from known fault classes. Otherwise, data would be identified as belonging to an unknown fault class. This motivated an investigation of how to model fault classes using residual data. Residual data from each fault are projected along some vector in the residual space. This is not new and has been used in parity space methods but also in linear residual generation. Therefore, a fault model is proposed where data from a fault is modeled to lie within a tube around the fault direction vector. The distribution around this vector is modeled as a multivariate normal distribution whose parameters are modeled as a Gaussian Process. Initial experiments have shown that the fault model is able to extrapolate to other fault sizes and realizations better than conventional one-class classifiers, such as one-class support vector machines. An algorithm has also been implemented for data-driven fault size estimation when training data contains information about what fault size the data were collected from.

Modeling fault classes using Gaussian Processes

A fault model is proposed to be able to extrapolate fault classes beyond training data using Gaussian Processes.
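The geometric part of such a fault model can be sketched without the Gaussian Process. The example assumes that residual vectors from one fault are approximately the fault size times a fixed direction vector plus noise. The direction is fitted by least squares from training data with known fault sizes, and a new sample is then described by its estimated fault size and its distance from the fault direction. The function names, dimensions, and numbers are hypothetical, and the distribution inside the tube is deliberately left out.

```python
import numpy as np


def fit_direction(residuals, sizes):
    """Least-squares fit of r = size * d over training data with known sizes."""
    sizes = np.asarray(sizes, dtype=float)
    return residuals.T @ sizes / (sizes @ sizes)


def project(r, d):
    """Estimated fault size along d and distance from the fault direction."""
    size = r @ d / (d @ d)
    off_axis = np.linalg.norm(r - size * d)
    return size, off_axis


rng = np.random.default_rng(0)
d_true = np.array([1.0, -0.5, 0.0])  # assumed true fault direction

# Training data only covers small fault sizes between 0.1 and 0.3.
train_sizes = rng.uniform(0.1, 0.3, 200)
train_residuals = np.outer(train_sizes, d_true) + rng.normal(0.0, 0.02, (200, 3))
d_hat = fit_direction(train_residuals, train_sizes)

# New sample from the same fault with a larger size than seen in training.
r_same = 0.6 * d_true + rng.normal(0.0, 0.02, 3)
size_same, off_same = project(r_same, d_hat)

# New sample that points in a different direction (a different fault).
r_other = np.array([0.0, 0.0, 0.5])
size_other, off_other = project(r_other, d_hat)

print("same fault:  size %.2f, distance from direction %.2f" % (size_same, off_same))
print("other fault: size %.2f, distance from direction %.2f" % (size_other, off_other))
```

The sample from the same fault stays close to the fitted direction even though its size lies outside the training range, while the sample from the other direction ends up far from it.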

In ongoing work, one Master's thesis project is investigating unsupervised fault class clustering using the proposed fault model. Another Master's thesis project is investigating distributed data-driven diagnosis and information fusion.

Feature Selection With Limited Training Data

Publications:

Jung, D. (2020). Distributed Feature Selection for Multi-Class Classification Using ADMM.
IEEE Control Systems Letters, 5(3), 821-826.

When it is possible to systematically design features, e.g. residuals, that can be used for data-driven classification, feature selection becomes important. However, when searching for a set of features that can distinguish between different data classes while training data is not representative of all data classes, there is a risk of overfitting. When the features consist of residuals and there is knowledge about which residuals are invariant to different data classes, this information can be incorporated into the feature selection algorithm. Here, a convex formulation of the multi-class feature selection problem has been proposed. To handle large-scale problems, a distributed formulation of the algorithm has been implemented using the Alternating Direction Method of Multipliers (ADMM). The algorithm has been validated on data from the engine test bench and on the public MNIST image classification dataset. Comparisons with other algorithms showed that, in the MNIST case study, similar classification performance was achieved with 17% fewer features than with variable importance and Random Forest.

A convex formulation of the feature selection problem for multi-class classification is proposed. A distributed algorithm is developed using ADMM.
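To illustrate how ADMM can be used for sparse feature selection, the sketch below applies the standard ADMM iterations to an L1-regularized least-squares (lasso) problem. This is a swapped-in, simpler problem: it is neither the multi-class formulation nor the distributed algorithm from the publication. The synthetic data and the values of lam, rho, and the iteration count are assumptions for illustration.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_lasso(A, b, lam, rho=100.0, n_iter=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||z||_1 subject to x = z."""
    n_features = A.shape[1]
    z = np.zeros(n_features)
    u = np.zeros(n_features)  # scaled dual variable
    AtA_rho = A.T @ A + rho * np.eye(n_features)
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA_rho, Atb + rho * (z - u))  # quadratic step
        z = soft_threshold(x + u, lam / rho)               # sparsity step
        u = u + x - z                                      # dual update
    return z


# Synthetic data (illustration only): 10 features, only the first two matter.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
b = A @ w_true + rng.normal(0.0, 0.1, 200)

weights = admm_lasso(A, b, lam=20.0)
selected = np.flatnonzero(weights)
print("selected features:", selected)
print("weights:", weights.round(2))
```

The split into a quadratic step and a simple thresholding step is what makes ADMM attractive for large problems, since each step is cheap and the structure lends itself to distributed variants.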

References

[domingos2012few] Pedro Domingos.
A few useful things to know about machine learning.
Communications of the ACM, 55(10):78--87, 2012.

[eriksson2013method] Eriksson (Jung), D., Frisk, E., & Krysander, M. (2013).
A method for quantitative fault diagnosability analysis of stochastic linear descriptor models.
Automatica, 49(6), 1591-1600.

[frisk2012diagnosability] E. Frisk, A. Bregon, J. Aslund, M. Krysander, B. Pulido, and Gautam Biswas.
Diagnosability analysis considering causal interpretations for differential constraints.
IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 42(5):1216--1229, 2012.

[jung2015development] D. Jung, L. Eriksson, E. Frisk, and M. Krysander.
Development of misfire detection algorithm using quantitative FDI performance analysis.
Control Engineering Practice, 34:49--60, 2015.

[jung2018residual] D. Jung and E. Frisk.
Residual selection for fault detection and isolation using convex optimization.
Automatica, (97):143--149, 2018.

[jung2016combined] D. Jung, K. Ng, E. Frisk, and M. Krysander.
A combined diagnosis system design using model-based and data-driven methods.
In 2016 3rd Conference on Control and Fault-Tolerant Systems (SysTol), pages 177--182. IEEE, 2016.

[jung2018combining] D. Jung, K. Ng, E. Frisk, and M. Krysander.
Combining model-based diagnosis and data-driven anomaly classifiers for fault isolation.
Control Engineering Practice, (80):146--156, 2018.

[jung2017combined] D. Jung and C. Sundström.
A combined data-driven and model-based residual selection algorithm for fault detection and isolation.
IEEE Transactions on Control Systems Technology, (99):1--15, 2017.

[khorasgani2018methodology] H. Khorasgani and G. Biswas.
A methodology for monitoring smart buildings with incomplete models.
Applied Soft Computing, 2018.

[luo2010integrated] J. Luo, M. Namburu, K. Pattipati, L. Qiao, and S. Chigusa.
Integrated model-based and data-driven diagnosis of automotive antilock braking systems.
IEEE Trans. on SMC-Part A: Systems and Humans, 40(2):321--336, 2010.

[sankavaram2009model] C. Sankavaram, B. Pattipati, A. Kodali, K. Pattipati, M. Azam, S. Kumar, and M. Pecht.
Model-based and data-driven prognosis of automotive and electronic systems.
In Automation Science and Engineering, IEEE International Conference on, pages 96--101, 2009.

[svard2013automotive] Carl Svärd, Mattias Nyberg, Erik Frisk, and Mattias Krysander.
Automotive engine FDI by application of an automated model-based and data-driven design methodology.
Control Engineering Practice, 21(4):455--472, 2013.

[tidriri2016bridging] K. Tidriri, N. Chatti, S. Verron, and T. Tiplica.
Bridging data-driven and model-based approaches for process fault diagnosis and health monitoring: A review of researches and future challenges.
Annual Reviews in Control, 42:63--81, 2016.

[voronov2016heavy] S. Voronov, D. Jung, and E. Frisk.
Heavy-duty truck battery failure prognostics using random survival forests.
IFAC-PapersOnLine, 49(11):562--569, 2016.