Machine learning comprises data-driven methods in which training data is used to estimate general-purpose models for regression and classification, for example neural networks and deep learning. However, there is no systematic way of also including model information based on physical insight about the system to be modeled. Instead, machine learning methods require large data sets, representing all relevant operating scenarios, to train a model with satisfactory prediction or classification performance [domingos2012few]. Identifying and classifying new system behaviors not covered by training data is almost impossible with existing machine learning methods. Thus, the performance of data-driven models is heavily dependent on the quality of training data. However, in many industrial applications training data is limited, or it is not possible to collect data from all relevant cases in advance. One such application is fault diagnosis and prognostics.
Fault diagnosis and prognostics are particularly important in safety-critical systems, i.e. systems where a fault or component degradation can result in human casualties, environmental hazards, or damage to hardware. Fault diagnosis refers to the problem of identifying faulty components, and prognostics to the problem of predicting component degradation and remaining useful life (RUL). Fault diagnosis and prognosis systems have shown great potential in new industrial services and products, for example in decision-support systems for vehicle fleet predictive maintenance and mission planning in the transportation industry, or in computer-aided troubleshooting in vehicle workshops.
Besides data-driven methods, another common approach to fault diagnosis and prognostics is model-based methods. These methods try to determine the system condition using a mathematical model derived from physical insight about the system. System faults are modeled as changes in physical parameters or variables. By estimating the performance degradation of various components, prognostics methods predict trends to determine when the degradation reaches critical levels. To detect and isolate faults in model-based diagnosis, residuals are computed based on different subsets of the physical-based model. An advantage of model-based methods is that no data from faulty conditions is needed to design a residual, since it models nominal system behavior. Each residual is designed to detect faults in a certain part of the system, making it insensitive to faults in other parts. By designing multiple residuals based on different subsets of the model, it is possible to localize a faulty component by analyzing which residuals have detected the fault.
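The localization step described above can be illustrated with a minimal sketch. The residual names, fault names, and the signature table below are hypothetical and chosen only for illustration; the sketch assumes single faults and that a silent residual is not used to reject a candidate.

```python
# Hypothetical fault signature table: which faults each residual is sensitive to.
SIGNATURES = {
    "r1": {"f_sensor", "f_leak"},
    "r2": {"f_leak", "f_clog"},
    "r3": {"f_sensor", "f_clog"},
}

def isolate(triggered):
    """Return the single faults that can explain all triggered residuals.

    A fault is a candidate if every triggered residual is sensitive to it.
    Residuals that stay silent are ignored, since a small fault may not
    exceed every detection threshold.
    """
    if not triggered:
        return []  # no alarm, so there are no fault candidates
    candidates = set().union(*SIGNATURES.values())
    for name in triggered:
        candidates &= SIGNATURES[name]
    return sorted(candidates)

print(isolate({"r1", "r2"}))  # only f_leak affects both r1 and r2
print(isolate({"r3"}))        # f_clog and f_sensor cannot be told apart yet
```

With only one residual triggered, several candidates remain, which is why multiple residuals with different sensitivities are needed for localization.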
Since faults are rare events, learning data-driven models for fault diagnosis and prognostics is complicated by the lack of relevant training data. Also, new faults that are difficult to predict during the system design phase may appear several years into operation, meaning that training data is not collected from these cases. Performance degradation patterns in particular require data from long-term system usage. Hybrid methods, combining physical-based models and training data, are key to designing machine learning algorithms that can diagnose faults and predict system degradation even though no training data has been collected from these faults.
The automotive industry has shown interest in hybrid methods for fault diagnosis, e.g. combining vehicle on-board diagnosis with cloud-based big data analysis to improve fault diagnosis and prognosis performance, which is relevant for remote diagnosis and predictive maintenance.
Research in hybrid methods for fault diagnosis and prognostics has increased in recent years, see for example Prof. K. Pattipati at University of Connecticut, USA [sankavaram2009model,luo2010integrated], LARIS at Université d'Angers, France [tidriri2016bridging], and Prof. G. Biswas at Vanderbilt University, USA [khorasgani2018methodology]. Research has highlighted the benefits of bridging and combining model-based and data-driven methods [tidriri2016bridging]. There are no previous CENIIT projects related to this project using hybrid methods for fault diagnosis. Hybrid diagnosis system designs have been proposed in, for example, [sankavaram2009model,svard2013automotive]. However, there is still little focus on how to design hybrid methods in a way that takes full advantage of both physical-based models and data. During the last two years, I have focused my research on analyzing this problem, which in the last year has resulted in a number of journal publications [jung2018residual,jung2018combining,jung2017combined].
The general objective of this project is to investigate how to develop algorithms for fault diagnosis and prognostics using hybrid methods. The project covers both theoretical and application-related research questions. In many systems, even though a complete model is not available, there is often some level of knowledge of how components are connected. Incorporating physical model information into data-driven models is the key to performing fault isolation and predicting component degradation even though training data covering these cases is limited. The main focus of this project will be on analyzing neural networks and deep learning methods because of their ability to model non-linear dynamic systems.
The main approach investigated in this project is to incorporate physical insight into neural networks by using structural methods. Structural methods are a useful tool in model-based diagnosis, where a structural representation of the model is used that only describes which variables are included in each equation (including which equations contain state dynamics), without considering the analytical relationship. Structural methods have made it possible to work efficiently with large-scale models, including e.g. model analysis and residual generation. The latter is performed by identifying computational sequences that describe in which order to evaluate the model equations to compute a residual.
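As an illustration of what a structural representation contains, the sketch below encodes a small model purely as sets of variables per equation. The two-tank equations, variable names, and the `dynamic` flag are assumptions made for this example. The equation-minus-unknown count is only a rough indication of whether a subset of equations could support a residual, not a full structural analysis.

```python
# Hypothetical structural model of a two-tank system. Each equation lists only
# which variables it contains; the analytical expressions are deliberately absent.
MODEL = {
    "e1": {"unknown": {"h1"}, "known": {"u"}, "dynamic": True},        # level of tank 1
    "e2": {"unknown": {"h1", "h2"}, "known": set(), "dynamic": True},  # level of tank 2
    "e3": {"unknown": {"h1"}, "known": {"y1"}, "dynamic": False},      # sensor on tank 1
    "e4": {"unknown": {"h2"}, "known": {"y2"}, "dynamic": False},      # sensor on tank 2
}

def unknowns(equations):
    """Collect the unknown variables appearing in a subset of equations."""
    found = set()
    for name in equations:
        found |= MODEL[name]["unknown"]
    return found

def known_signals(equations):
    """Collect the known signals a residual built from this subset would use."""
    found = set()
    for name in equations:
        found |= MODEL[name]["known"]
    return found

def redundancy(equations):
    """Number of equations minus the number of unknown variables they contain."""
    return len(set(equations)) - len(unknowns(equations))

print(redundancy(["e1", "e3"]), known_signals(["e1", "e3"]))
print(redundancy(["e2", "e3", "e4"]), known_signals(["e2", "e3", "e4"]))
print(redundancy(["e1", "e2"]))
```

In this toy model, the subset `e2, e3, e4` has one more equation than unknowns and uses only `y1` and `y2`, so a residual derived from it would not depend on the input `u` or on equation `e1`.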
The relations between input signals and state variables described in the computational sequences can help design neural networks inspired by physical insight about the system. One objective of this project is to investigate how the structural information can be used to achieve certain fault isolation properties by making the neural network insensitive to some faults. Fault isolation is an important step in predicting the degradation of a specific component, since it distinguishes its effects from those of other components. Initial investigations will focus on how structural model information should be used to design the neural network, for example how different signals and state variables should be included in the network, but also how to formulate cost functions to achieve fault isolation capabilities using only nominal training data.
When there is a set of neural networks able to isolate faults, the next objective is to use these for predicting RUL of the different components. Physical-based prognostics estimates a physically interpretable parameter and predicts its future trend. To estimate component degradation, it is useful that the data-driven model is interpretable to map residual outputs, or model parameters, to level of degradation. This is especially important to track the effects of multiple component degradations on the neural networks and predict the different RUL. Thus, the objective includes how to improve interpretability of neural networks by including structural model information in the network design and how this can be used for predicting component degradation.
There are often many ways to design residuals, and selecting a suitable subset can be formulated as a feature selection problem. The same problem is relevant when identifying which neural network candidates should be designed. In parallel with the other tasks, it is also relevant to analyze whether it is possible to identify if the structural model is inaccurate for some parts of the system. One solution is to analyze whether the parts of the neural networks monitoring a certain component have more uncertainty compared to others. This can be used to identify if the structural model needs to be modified to improve accuracy.
Another industrial application problem is to investigate how physical models and fleet operation data can be used for system monitoring and predictive analysis of the individual units using, for example, cloud-based services. In e.g. automotive applications, this is relevant for understanding system degradation patterns and vehicle-to-vehicle variations. The objective is to investigate how to perform prognostics of an individual vehicle by merging component degradation information estimated for that vehicle with logged operational data from a fleet of vehicles.
The research activities are centered around three related topics:
In May 2020, Daniel Jung also received his docent degree.
Jung, D. (2019, September). Isolation and Localization of Unknown Faults Using Neural Network-Based Residuals.
In Annual Conference of the PHM Society (Vol. 11, No. 1).
Jung, D. (2020). Residual Generation Using Physically-Based Grey-Box Recurrent Neural Networks For Engine Fault Diagnosis.
arXiv preprint arXiv:2008.04644.
The objective is to develop methods to design grey-box Recurrent Neural Networks (gbRNN) for prediction and residual generation using a structural representation of the system. By designing gbRNN modeling different parts of the system, the idea is to make the gbRNN robust to data from different data classes (here fault modes) without the need of training data from that class. Implementation and training of the gbRNN is done in Python/PyTorch. The design process is summarized into the following steps.
A simulation study considering fault diagnosis of a two-tank system has been used to illustrate and validate the proposed method. A structural model is used as a qualitative description of the system dynamics and causality between signals. A set of gbRNN was generated and trained using training data from nominal system operation.
Each gbRNN is used to compute a residual to detect faults. When a fault is detected in data, it is possible to identify the root cause of the fault by checking which model equations were used to derive the structure of each gbRNN that triggered. Simulated data from different fault scenarios showed that the proposed methodology was able not only to detect faults but also to identify their root cause, without the need for training data from faults.
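The residual idea in the two-tank study can be sketched with a toy simulation. This is not the gbRNN method: a known nominal model of a single tank level stands in for the trained network, and the tank equation, coefficients, leak size, and threshold are all assumed values chosen for illustration.

```python
import math

DT, C1 = 0.1, 0.5  # assumed sample time [s] and outflow coefficient

def step(h1, u, leak=0.0):
    """One Euler step of an assumed tank-level model with an optional leak."""
    outflow = (C1 + leak) * math.sqrt(max(h1, 0.0))
    return max(h1 + DT * (u - outflow), 0.0)

def simulate(u_seq, h0, leak_from=None, leak=0.2):
    """Generate level measurements; the leak starts at sample `leak_from`."""
    h, ys = h0, []
    for k, u in enumerate(u_seq):
        ys.append(h)
        active = leak_from is not None and k >= leak_from
        h = step(h, u, leak if active else 0.0)
    return ys

def residual_sequence(u_seq, y_seq, h0):
    """Residual r = measured level minus level predicted by the nominal model."""
    h_hat, out = h0, []
    for u, y in zip(u_seq, y_seq):
        out.append(y - h_hat)
        h_hat = step(h_hat, u)  # nominal prediction, uses no fault information
    return out

u_seq = [1.0] * 300
r_nominal = residual_sequence(u_seq, simulate(u_seq, 1.0), 1.0)
r_faulty = residual_sequence(u_seq, simulate(u_seq, 1.0, leak_from=150), 1.0)

THRESHOLD = 0.1  # assumed detection threshold
alarm_nominal = any(abs(r) > THRESHOLD for r in r_nominal)
alarm_faulty = any(abs(r) > THRESHOLD for r in r_faulty)
print(alarm_nominal, alarm_faulty)
```

The predictor only needs nominal behavior: the leak is detected because the measurement drifts away from the nominal prediction, not because the predictor has seen leak data.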
The research has continued using real data from an internal combustion engine test bench. Data has been collected during transient operation with different faults, including sensor faults, leakages, and filter clogging. In this study, a model of the air path through the engine is used.
A set of gbRNN was generated from a structural model and trained using nominal data from the test bench.
Validation showed that the trained gbRNN prediction models were able to capture the dynamic behavior of the engine. A set of residual generators were implemented based on the set of gbRNN models.
In ongoing work, investigations focus on the physical interpretation of different gbRNN model structures and on understanding how different design choices, when implementing the gbRNN from a structural model, affect e.g. model training, stability, and robustness with respect to data uncertainties. The proposed method is also being validated for fault isolation on other systems in collaboration with Scania.
Jung, D. (2019). Engine Fault Diagnosis Combining Model-based Residuals and Data-Driven Classifiers.
In 9th IFAC International Symposium on Advances in Automotive Control (AAC) (Vol. 52, No. 5, pp. 285-290).
Jung, D. (2020). Data-Driven Open-Set Fault Classification of Residual Data Using Bayesian Filtering.
IEEE Transactions on Control Systems Technology, 28(5), 2045-2052.
Lundgren, A., & Jung, D. (2020). Data-Driven Open Set Fault Classification and Fault Size Estimation Using Quantitative Fault Diagnosis Analysis.
arXiv preprint arXiv:2009.04756.
Lundgren, A. (2020). Data-Driven Engine Fault Classification and Severity Estimation Using Residuals and Data (Dissertation).
Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-165736
Säfdal, J. (2021). Data-Driven Engine Fault Classification and Severity Estimation Using Interpolated Fault Modes from Limited Training Data (Dissertation).
Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173916
Fault classification of technical systems can often be formulated as an open set classification problem. Here, engine fault classification is considered using residual data. Each fault class can have different realizations, including a range of fault sizes and excitation due to varying operating conditions. Small faults are difficult to distinguish from nominal data, meaning that data from different fault classes are likely to overlap, resulting in classification ambiguities. In addition, it is important to identify unknown faults, since these represent new realizations of faults that are not represented in training data.
An advantage of residual data is that system dynamics are filtered out while sensitivity to faults is retained. The idea is that simpler data-driven models are sufficient to model data from each fault class and that it is easier to extrapolate fault data to realizations not covered in training data.
Residual data are often time-series data, and a fault is likely to be persistent, meaning that data from consecutive samples are likely to belong to the same fault class. Thus, there are advantages to taking temporal information into consideration when classifying faults to improve classification accuracy, especially for small faults.
In a Master's thesis project by Andreas Lundgren, a data-driven classifier was proposed using a distinguishability measure. Fault classes were modeled by analyzing batch data distribution and then measuring dissimilarity with respect to training data using Kullback-Leibler divergence to classify new data batches. The usefulness of the proposed method was illustrated using real data from the engine case study and the experiments showed that the modeling framework could be used for quantitative fault classification performance analysis, data-driven open set classification, but also fault size estimation.
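To make the batch-based idea concrete, the following minimal sketch fits a univariate normal distribution to a batch of one residual and compares it with stored class models using the Kullback-Leibler divergence. It is a simplified illustration under stated assumptions, not the method from the thesis: the class models, the direction of the divergence, and the rejection limit `max_kl` are all hypothetical.

```python
import math
from statistics import fmean, pstdev

def kl_gauss(m0, s0, m1, s1):
    """KL( N(m0, s0^2) || N(m1, s1^2) ) for univariate normal distributions."""
    return math.log(s1 / s0) + (s0**2 + (m0 - m1) ** 2) / (2 * s1**2) - 0.5

# Hypothetical class models (mean, std) of one residual, as if fitted to training batches.
CLASSES = {"NF": (0.0, 1.0), "f_leak": (3.0, 1.0)}

def classify_batch(batch, max_kl=2.0):
    """Assign the batch to the closest class, or 'unknown' if none is close enough."""
    m, s = fmean(batch), pstdev(batch)  # the batch must not be constant (s > 0)
    scores = {c: kl_gauss(m, s, mc, sc) for c, (mc, sc) in CLASSES.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= max_kl else "unknown"

base = [-1.0, 0.0, 1.0, -0.5, 0.5]
print(classify_batch(base))
print(classify_batch([x + 3.0 for x in base]))
print(classify_batch([x - 4.0 for x in base]))
```

The rejection limit is what makes the classifier open set: a batch that is far from every known class model is reported as unknown instead of being forced into the nearest class.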
A different open set classification approach was developed in which Bayesian filtering is used to exploit temporal information in residual data when classifying samples. Classification accuracy improved while retaining the ability to identify unknown faults. An advantage of the Bayesian filtering approach is better real-time capability compared to the distinguishability-based approach.
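The role of temporal information can be illustrated with a generic recursive Bayesian filter over fault classes. This is a sketch under assumptions, not the published algorithm: the Gaussian likelihoods, the broad likelihood standing in for unknown faults, and the persistence probability `STAY` are hypothetical choices.

```python
import math

CLASSES = ["NF", "f_leak", "unknown"]
# Assumed (mean, std) of Gaussian likelihoods; the broad one stands in for unknown faults.
LIK = {"NF": (0.0, 1.0), "f_leak": (3.0, 1.0), "unknown": (0.0, 10.0)}
STAY = 0.98  # assumed probability that the fault class persists between samples

def pdf(x, m, s):
    """Density of N(m, s^2) evaluated at x."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def bayes_filter(samples):
    """Recursively update the class posterior over a sequence of residual samples."""
    n = len(CLASSES)
    post = {c: 1.0 / n for c in CLASSES}
    switch = (1.0 - STAY) / (n - 1)
    for x in samples:
        # Prediction: a class keeps its mass with probability STAY,
        # otherwise the mass moves uniformly to the other classes.
        prior = {c: STAY * post[c] + switch * (1.0 - post[c]) for c in CLASSES}
        # Correction: weight by the likelihood of the new sample and normalize.
        unnorm = {c: pdf(x, *LIK[c]) * prior[c] for c in CLASSES}
        total = sum(unnorm.values())
        post = {c: v / total for c, v in unnorm.items()}
    return post

post = bayes_filter([2.8, 3.1, 3.3, 2.9] * 5)
print(max(post, key=post.get))
```

Because the posterior from one sample becomes the prior for the next, evidence accumulates over consecutive samples, which is what helps with small faults that are ambiguous in any single sample.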
However, both the distinguishability-based and the Bayesian filter classification algorithms required training data that were representative of each fault class to accurately identify data from known fault classes. Otherwise, data would be identified as belonging to an unknown fault class. This motivated an investigation of how to model fault classes using residual data. Residual data from each fault are projected along some vector in the residual space. This property is not new; it has been used in parity space methods and in linear residual generation. Therefore, a fault model is proposed where data from a fault are modeled to lie within a tube, where the distribution around the fault direction vector is modeled as a multivariate normal distribution whose parameters are modeled as a Gaussian process. Initial experiments have shown that the fault model is able to extrapolate to other fault sizes and realizations better than conventional one-class classifiers, such as one-class support vector machines. An algorithm has also been implemented for data-driven fault size estimation when the training data contain information about which fault size the data were collected from.
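The geometric idea behind the tube model can be sketched as a projection onto a fault direction. The sketch below is a simplification: it uses a fixed tube radius instead of a multivariate normal distribution with Gaussian process parameters, and the direction vector and radius are hypothetical.

```python
import math

def project(r, direction):
    """Split residual vector r into its length along `direction` and its distance to that line."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    along = sum(ri * ui for ri, ui in zip(r, unit))
    ortho = [ri - along * ui for ri, ui in zip(r, unit)]
    dist = math.sqrt(sum(o * o for o in ortho))
    return along, dist

def in_tube(r, direction, radius):
    """True if r lies on the positive side of the direction and within `radius` of the line."""
    along, dist = project(r, direction)
    return along > 0 and dist <= radius

fault_direction = [3.0, 4.0]  # hypothetical fault direction in a two-residual space
print(project([6.3, 7.9], fault_direction))
print(in_tube([6.3, 7.9], fault_direction, radius=0.5))
print(in_tube([4.0, -3.0], fault_direction, radius=0.5))
```

The length along the direction grows with the size of the fault in this picture, which is why a model of this kind can also support fault size estimation and extrapolation to fault sizes not seen in training.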
In ongoing work, one Master's thesis project is investigating unsupervised fault class clustering using the proposed fault model. Another Master's thesis project is investigating distributed data-driven diagnosis and information fusion.
Jung, D. (2020). Distributed Feature Selection for Multi-Class Classification Using ADMM.
IEEE Control Systems Letters, 5(3), 821-826.
When it is possible to systematically design features, e.g. residuals, that can be used for data-driven classification, feature selection becomes important. However, when training data is not representative of all data classes, there is a risk of overfitting when searching for a set of features that can distinguish between the classes. When the features consist of residuals and there is knowledge about which residuals are invariant to which data classes, this information can be incorporated into the feature selection algorithm. Here, a convex formulation of the multi-class feature selection problem has been proposed. To handle large-scale problems, a distributed formulation of the algorithm has been implemented using the Alternating Direction Method of Multipliers (ADMM). The algorithm has been validated on data from the engine test bench and on the public MNIST image classification dataset. Comparisons with other algorithms showed that, in the MNIST case study, similar classification performance was achieved with 17% fewer features than when using variable importance and Random Forest.
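The structure of an ADMM iteration can be shown on a toy sparsity problem. This is not the feature selection formulation from the paper: it is a scalar l1-regularized least-squares problem, chosen because its exact solution is known (soft thresholding of `a`), so the iteration can be checked. The values of `a`, `lam`, and `rho` are arbitrary.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    return max(v - t, 0.0) - max(-v - t, 0.0)

def admm_l1(a, lam, rho=1.0, iters=100):
    """Minimise 0.5*(x - a)^2 + lam*|z| subject to x = z, using scaled-form ADMM."""
    x = z = u = 0.0
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)  # x-update: minimize the quadratic part
        z = soft_threshold(x + u, lam / rho)   # z-update: proximal step for the l1 term
        u = u + x - z                          # scaled dual update on the constraint x = z
    return z

# Hypothetical per-feature scores; features whose weight is driven to zero are dropped.
weights = [admm_l1(a, lam=1.0) for a in (3.0, 0.5, -2.5)]
selected = [i for i, w in enumerate(weights) if abs(w) > 1e-9]
print(weights, selected)
```

The split into an x-update and a z-update coupled only through the dual variable is what makes ADMM suitable for distributed implementations, since the separate updates can be assigned to different subproblems.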