Linear and nonlinear analyses for detection of sudden cardiac death (SCD) using ECG and HRV signals

Studies show that millions of people throughout the world lose their lives to sudden cardiac death (SCD) each year. These deaths can be reduced by using medical equipment such as defibrillators. However, there is still an urgent need for a suitable way to predict SCD so that doctors can make proper decisions for patients at risk. In this paper, we investigate a way to predict sudden cardiac death. To do this, we first extract the HRV signal from the ECG signal and elicit informative nonlinear and time-frequency (TF) features. Then, the dimension of the feature space is reduced by applying feature selection, and finally healthy persons and those at risk of SCD are classified using a multilayer perceptron (MLP) neural network and a k-nearest neighbor (KNN) classifier. To evaluate the capabilities of the analytical methods in classification, we compare the classification rates obtained using separate and combined nonlinear and TF features. The results show that, before the onset of SCD, the HRV signal of patients prone to SCD contains features that differ noticeably from those of normal people. Another remarkable result to emerge from our analysis is that the combination of time-frequency and nonlinear features is better able to detect this difference. The proposed method demonstrates that, four minutes prior to the occurrence of SCD, the signals of a normal person and of a person at risk can be differentiated effectively and reliably, which in turn makes timely treatment possible.

*Correspondence to: Elias Ebrahimzadeh, School of Electrical and Computer Engineering, College of Engineering, University of Tehran; Department of Biomedical Engineering, School of Electrical Engineering, Payame Noor University of North Tehran, Tehran, Iran, E-mail: e_ebrahimzadeh@ut.ac.ir


Introduction
Sudden cardiac death (SCD), resulting from a precipitous loss of heart function, is a leading cause of cardiovascular mortality in modern societies. It is a very serious cardiac event that claims the patient's life within a few minutes [1,2]. When it occurs in a person with known or unknown cardiac disease, the heart stops pumping blood to the rest of the body within minutes.
Despite the significant decline in coronary artery disease (CAD) mortality in the second half of the 20th century, sudden cardiac death (SCD) continues to claim 250 000 to 300 000 US lives annually. In North America and Europe, the annual incidence of SCD ranges between 50 and 100 per 100 000 in the general population. Due to the lack of emergency medical response systems in most world regions, worldwide estimates are currently not available. However, even in the presence of advanced first responder systems for resuscitation of out-of-hospital cardiac arrest, the overall survival rate in a recent North American analysis was 4.6% [3]. Astonishingly, the victim may not even have been diagnosed with heart disease, and the time and mode of death are quite unexpected [4]. Most victims (>90%) have a previously known or unrecognized cardiac abnormality [5][6][7][8][9]. The life-threatening arrhythmias that lead to SCD most often begin with a sustained ventricular tachyarrhythmia, including ventricular tachycardia (VT), ventricular flutter (VFL), or ventricular fibrillation (VFib). A smaller percentage of SCD events are related to a primary bradyarrhythmia [10]. SCD may abruptly strike any person, young or elderly, who is at high risk of heart disease. Besides using public access defibrillation (PAD) to rescue a patient after collapse, a more reliable solution is to prevent the onset of SCD by providing medical aid before it occurs. Thus, it should be made possible to issue an early warning around half an hour before the crisis presents itself [11].
Ichimaru et al. found that the respiratory peak of the heart rate variability (HRV) in an SCD patient disappeared during the night time one week before death [12]. Van Hoogenhuyze et al. observed two HRV measurements, the standard deviation of the mean of sinus R-R intervals (SDANN) and the mean of SD (SD), from 24-hour HRV. They presented evidence that HRV is low in patients who experience SCD and high in young healthy subjects [13]. In our early and encouraging experiments, we showed that the TF method can classify normal and SCD subjects more efficiently than the classical method [14]. Moreover, we evaluated both the TF and classical methods with the MLP classifier on the one-minute ECG signal before SCD, obtaining accuracies of 99.16% and 74.36%, respectively. However, the relationship between short-term HRV and SCD is unknown. In addition, the repolarization alternans phenomenon provides a safe, noninvasive marker for the risk of SCD, and has proven as effective as the invasive and more expensive electrophysiological study (EPS) that is commonly used by cardiac electrophysiologists [15,16].
Analysis of heart rate variability (HRV) has provided a noninvasive method for assessing cardiac autonomic control [17]. HRV is accepted as a strong and independent predictor of mortality after an acute myocardial infarction [18], such that a reduced HRV is associated with a higher risk of severe ventricular arrhythmia and sudden cardiac death [19]. In this research, the HRV signal is used to study sudden cardiac death (SCD). The common linear features and time-frequency (TF) domain features have been used to predict sudden cardiac death. Different linear methods have so far been applied to the analysis of the HRV signal. Nevertheless, studies in the literature suggest that classic linear methods fail to predict SCD effectively and reliably [10,14]. Recently, it has been brought to attention that nonlinear processing methods can provide more information than linear methods and can be a good complement to them [20]. In this article, we aim to conduct classic linear, time-frequency and nonlinear analysis on the HRV signals of healthy persons and patients prone to SCD. To this end, we use a combinational feature vector and neural classifiers to separate healthy subjects from those at risk. Having extracted the HRV signal from the ECG signal, we elicit linear features and apply the Wigner-Ville transform to obtain time-frequency features. Finally, nonlinear features are extracted. Subsequently, feature dimensionality is reduced by employing the combinational feature vector and feature selection methods. In the next stage, a multilayer perceptron (MLP) neural network and a K-Nearest Neighbor (KNN) classifier are used to classify healthy persons and those susceptible to SCD. This classification is performed in 4 steps.
The separability of each one-minute interval (i.e., the first, second, third and fourth one-minute intervals before SCD) in the prediction of SCD is evaluated by calculating the accuracy. Figure 1 illustrates the block diagram of the proposed method.

Material and methods
The proposed method is evaluated on a database containing 35 patients with sudden cardiac death (including 16 [...]). When two channels were available for an observation (patient), each channel was used as a separate observation (patient).

Preprocessing of ECG signal
The dataset consists of 24-hour ECG recordings (Holter) covering the period before the event of heart death and several seconds after it. Patients who show signs of a previous heart attack or who have severe tachyarrhythmia are susceptible to SCD and may eventually succumb to it. One-minute time series segments prior to the onset of SCD are separated and named the first minute, second minute, third minute and fourth minute, respectively. Figure 2 shows an electrocardiogram signal of a 34-year-old patient that led to sudden cardiac death. Before the occurrence of SCD, there is no visually evident difference between the ECG signal of a person susceptible to heart death and that of a normal person. Figure 3 depicts a sample ECG signal of a person with SCD from several seconds before the onset of SCD to a few seconds after.
For patients, the one minute of ECG immediately before the occurrence of sudden cardiac death was selected. For normal subjects, one minute of the ECG signal was selected at random. Then, the Pan-Tompkins algorithm [21] was used to detect the QRS complexes in the ECG signal, from which we could determine the RR intervals and the HRV signal. The preprocessed HRV signal is then ready for feature extraction. Figures 4 and 5 show the HRV and ECG signals of a healthy person and those of an SCD subject.
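The segmentation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that R-peak times (in seconds) are already available from a QRS detector such as Pan-Tompkins and that the SCD onset time is annotated. The function names `rr_intervals` and `minute_segments` are hypothetical.

```python
def rr_intervals(r_peak_times):
    """Return (beat_time, rr) pairs; each RR interval is stamped with its closing beat."""
    return [(t1, t1 - t0) for t0, t1 in zip(r_peak_times, r_peak_times[1:])]


def minute_segments(r_peak_times, onset_time, n_minutes=4, minute=60.0):
    """Split RR intervals into one-minute segments counted backwards from onset.

    segments[0] is the first minute before onset, segments[3] the fourth.
    """
    segments = [[] for _ in range(n_minutes)]
    for beat_time, rr in rr_intervals(r_peak_times):
        age = onset_time - beat_time  # seconds between this beat and the onset
        if 0.0 <= age < n_minutes * minute:
            segments[int(age // minute)].append(rr)
    return segments


# Toy data (assumption): one beat per second for five minutes, onset at 300 s.
peaks = [float(i) for i in range(301)]
segs = minute_segments(peaks, onset_time=300.0)
```

Each element of `segs` can then be passed to the feature extraction steps as one HRV segment.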

Classical feature analysis
In this step, a number of common linear features are extracted. They include 5 features in the time domain and 4 features in the frequency domain.

Time-domain features
2. Standard deviation of all NN intervals (SDNN), which reflects all the cyclic components responsible for variability in the period of recording.
Measurements from the differences between NN intervals: the most commonly used measures derived from interval differences include:
1. The square root of the mean of the sum of the squares of differences between adjacent NN intervals (RMSSD).
2. The standard deviation of differences between adjacent NN intervals (SDSD).
3. The proportion derived by dividing the number of interval differences of NN intervals greater than 50 ms by the total number of NN intervals (PNN50) [22].
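The measures listed above can be computed directly from a list of NN intervals. The sketch below is illustrative only; it assumes intervals in milliseconds and the sample (n − 1) form of the standard deviation, neither of which is stated in the text. The function name `time_domain_features` is hypothetical.

```python
import math


def time_domain_features(nn_ms):
    """Compute SDNN, RMSSD, SDSD and pNN50 from NN intervals given in milliseconds."""
    n = len(nn_ms)
    mean_nn = sum(nn_ms) / n
    # Sample standard deviation (n - 1 denominator) is an assumption.
    sdnn = math.sqrt(sum((x - mean_nn) ** 2 for x in nn_ms) / (n - 1))

    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]  # successive differences
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mean_d = sum(diffs) / len(diffs)
    sdsd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (len(diffs) - 1))
    # Following the text: count of differences > 50 ms over the number of NN intervals.
    pnn50 = sum(1 for d in diffs if abs(d) > 50.0) / n
    return {"SDNN": sdnn, "RMSSD": rmssd, "SDSD": sdsd, "pNN50": pnn50}


features = time_domain_features([800, 810, 790, 860, 800])
```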

Frequency domain features
Although the time-domain parameters are computationally efficient, they lack the ability to discriminate between the sympathetic and parasympathetic contents of the RR intervals. It is generally accepted that the spectral power in the high frequency (HF) band (0.15-0.4 Hz) of the RR intervals reflects the respiratory sinus arrhythmia (RSA) and thus cardiac vagal activity. On the other hand, the low frequency (LF) band (0.04-0.15 Hz) is related to baroreceptor control and is mediated by both the vagal and sympathetic systems [23]. In this work, the LF, HF and VLF band powers and the ratio of the LF and HF band powers (LF/HF) are used as the frequency-domain features of the RR interval signal [24].
The power spectral density (PSD), shown in Figure 6, was computed using the Burg parametric method.
The scatter of two of these features is shown in Figure 7. As can be seen, these features are suitable for discriminating between the two groups, i.e., the healthy and SCD subjects.
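The band powers described above can be illustrated with a short sketch. Note the substitutions: the paper uses the Burg parametric method, whereas this example uses a plain periodogram (direct DFT) purely to keep the code dependency-free. The 4 Hz resampling rate, the VLF lower limit of 0.003 Hz and all function names are assumptions, not taken from the paper.

```python
import math

# LF and HF limits are from the text; the VLF limits are an assumed convention.
BANDS = {"VLF": (0.003, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}


def resample_uniform(beat_times, rr, fs=4.0):
    """Linearly interpolate the RR series onto a uniform grid at fs Hz."""
    out, j = [], 0
    n_samples = int((beat_times[-1] - beat_times[0]) * fs) + 1
    for i in range(n_samples):
        t = beat_times[0] + i / fs
        while j < len(beat_times) - 2 and beat_times[j + 1] < t:
            j += 1
        t0, t1 = beat_times[j], beat_times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(rr[j] + w * (rr[j + 1] - rr[j]))
    return out


def band_powers(x, fs, bands=BANDS):
    """Sum one-sided periodogram power inside each band and add the LF/HF ratio."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]  # remove the DC component
    powers = dict.fromkeys(bands, 0.0)
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(x))
        im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(x))
        p = 2.0 * (re * re + im * im) / (n * n)
        for name, (lo, hi) in bands.items():
            if lo <= f < hi:
                powers[name] += p
    powers["LF/HF"] = powers["LF"] / powers["HF"] if powers["HF"] > 0 else float("inf")
    return powers


# Synthetic one-minute tachogram (assumption): 0.8 s mean RR, modulated at 0.25 Hz.
beat_times, rr, t = [], [], 0.0
while t < 60.0:
    interval = 0.8 + 0.05 * math.sin(2 * math.pi * 0.25 * t)
    t += interval
    beat_times.append(t)
    rr.append(interval)
powers = band_powers(resample_uniform(beat_times, rr, fs=4.0), fs=4.0)
```

Because the synthetic modulation sits at 0.25 Hz, most of the power falls in the HF band, which mimics respiratory sinus arrhythmia.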

Time-frequency (TF) domain analysis
One approach to analyzing the nonstationary HRV signal is to apply time-frequency (TF) methods, which can be divided into three main categories: nonparametric linear TF methods based on linear filtering, including the short-time Fourier transform [25,26] and the wavelet transform [27,28]; nonparametric quadratic TF representations, including the Wigner-Ville distribution and its filtered versions [29][30][31][32]; and parametric time-varying methods based on autoregressive models with time-varying coefficients [33,34]. In this paper, the Smoothed Pseudo Wigner-Ville distribution (SPWVD) is preferred, since it provides better time-frequency resolution than nonparametric linear methods, independent control of time and frequency filtering, and power estimates with lower variance than parametric methods when rapid changes occur [30]. The main drawback of the SPWVD is the presence of cross-terms, which should be suppressed by the time and frequency filtering. The SPWVD of the discrete signal x(n) is defined by [31]
$$P_x(n,m) = 2\sum_{k=-N+1}^{N-1} |h(k)|^2 \left[\sum_{p=-M+1}^{M-1} g(p)\, r_x(n+p,k)\right] e^{-j2\pi \frac{m}{N} k}$$
where n and m are the discrete time and frequency indexes, respectively, h(k) is the frequency smoothing symmetric normed window of length 2N−1, g(p) is the time smoothing symmetric normed window of length 2M−1, and r_x(n, k) is the instantaneous autocorrelation function, defined as
$$r_x(n,k) = x(n+k)\,x(n-k)$$
Figure 8 shows the result of applying the Wigner-Ville transform to the HRV signal. The obtained signal in the TF domain is also divided into three frequency segments.
We have also defined a first-order derivative feature to capture the difference between adjacent windows: it is the difference between the average energies of consecutive 15-second windows. For the first window (the first 15 s), it was computed as the difference between this window and the last 15 seconds of the second minute. A survey of the features over 15-second spans shows that, in an SCD subject, the change in the features from one window to the next is much more prominent, which is why we define the first-order derivative.
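The first-order derivative feature can be sketched as below. Several details are assumptions made for illustration: the input is taken to be a per-sample energy series for one frequency band (for example, the TF distribution summed over that band at each time index), the "average energy" is its mean over a window, and the sign convention is current window minus previous window. The function names are hypothetical.

```python
def window_energy(samples):
    """Average energy of one window of a per-sample band-energy series."""
    return sum(samples) / len(samples)


def energy_derivative(minute, previous_tail, fs, window_s=15.0):
    """Differences of average energy between consecutive 15-second windows.

    `previous_tail` holds the last 15 s of the adjacent minute and serves as
    the reference for the first window.
    """
    size = int(window_s * fs)
    windows = [minute[i:i + size] for i in range(0, len(minute) - size + 1, size)]
    energies = [window_energy(previous_tail)] + [window_energy(w) for w in windows]
    return [b - a for a, b in zip(energies, energies[1:])]


# Toy series at 1 sample per second (assumption): energy grows, then plateaus.
minute = [1.0] * 15 + [2.0] * 15 + [4.0] * 15 + [4.0] * 15
derivative = energy_derivative(minute, previous_tail=[0.5] * 15, fs=1.0)
```

One minute yields four derivative values per frequency band, one for each 15-second window.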

Nonlinear analysis
The cardiovascular system is more complex than a linear system and exhibits non-stationary behavior. Two nonlinear analyses that capture chaotic dynamical characteristics of the HRV signal are used here to classify healthy persons and patients prone to SCD. The two nonlinear parameters of the RR intervals used in this work are described below.

Poincare plot
When each interval RR(n+1) is plotted as a function of the previous interval RR(n), the resulting plot is known as the Poincaré plot, which is a relatively new tool for RR interval signal analysis. The Poincaré plot can be seen as a graphical representation of the correlation between successive RR intervals. This plot can be quantitatively analyzed by calculating the standard deviations of the distances of the points (RR(i), RR(i+1)) from the lines y = x and y = −x + 2RRm, where RRm is the mean of all RR(i) values. These standard deviations are denoted by SD1 and SD2, respectively. SD1 represents the fast beat-to-beat variability, while SD2 describes the relatively long-term variability in the HRV signal [35]. The width (SD1) and the length (SD2) of the short and long axes of Poincaré plot images represent the short- and long-term variability, respectively, of any nonlinear dynamic system [36]. Mathematical formulations have been developed that relate each measure derived from Poincaré plot geometry to well-understood existing heart rate variability indexes (Figures 9 and 10) [36]. A strong correlation was found when comparing the high frequency power of heart rate signals (modulated by the parasympathetic nervous system) to SD1 [37]. SD2 was found to be well correlated with both low and high frequency power (modulated by both the parasympathetic and sympathetic nervous systems) [37]. The ratio SD1/SD2 is usually used to describe the relation between the two components [22,38,39].
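The Poincaré descriptors can be computed without drawing the plot. The sketch below uses the fact that the signed distance of a point (RR(i), RR(i+1)) from the line y = x is (RR(i+1) − RR(i))/√2, and its distance from y = −x + 2RRm differs from (RR(i+1) + RR(i))/√2 only by a constant, which does not affect the standard deviation. The sample (n − 1) standard deviation and the function names are assumptions.

```python
import math


def _sd(values):
    """Sample standard deviation (n - 1 denominator, an assumption)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def poincare_sd(rr):
    """Return (SD1, SD2, SD1/SD2) for a sequence of RR intervals."""
    pairs = list(zip(rr[:-1], rr[1:]))  # points (RR(i), RR(i+1))
    sd1 = _sd([(b - a) / math.sqrt(2) for a, b in pairs])  # spread across y = x
    sd2 = _sd([(b + a) / math.sqrt(2) for a, b in pairs])  # spread along y = x
    ratio = sd1 / sd2 if sd2 > 0 else float("inf")
    return sd1, sd2, ratio


# Toy series (assumptions): a steady drift and a pure beat-to-beat alternation.
trend_sd1, trend_sd2, _ = poincare_sd([800, 810, 820, 830, 840])
alt_sd1, alt_sd2, _ = poincare_sd([800, 820, 800, 820, 800])
```

A steady drift produces only long-term spread (SD2), while a strict alternation produces only beat-to-beat spread (SD1), which matches the interpretation given in the text.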

Analysis method DFA
Detrended fluctuation analysis (DFA) is a method for quantifying long-range correlations embedded in a seemingly non-stationary time series; it also avoids the spurious detection of apparent long-range correlations that are an artifact of non-stationarity. This method is a modified root mean square analysis of a random walk [22,40-45].
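A minimal sketch of first-order DFA is given below for orientation. The box sizes, the use of non-overlapping boxes and the function names are assumptions; the paper does not state which settings were used. The series is integrated, split into boxes, linearly detrended in each box, and the scaling exponent is the slope of log F(n) against log n.

```python
import math
import random


def _linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_alpha(series, scales=(4, 8, 16, 32, 64)):
    """Estimate the DFA scaling exponent of a series (needs len >= max scale)."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:  # integrate the mean-removed series
        total += v - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in scales:
        xs = list(range(n))
        sq_sum, count = 0.0, 0
        for start in range(0, len(profile) - n + 1, n):  # non-overlapping boxes
            box = profile[start:start + n]
            slope, intercept = _linear_fit(xs, box)
            sq_sum += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, box))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))  # log of the RMS fluctuation
    alpha, _ = _linear_fit(log_n, log_f)
    return alpha


# Synthetic check (assumption): white noise versus its running sum (a random walk).
random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(1000)]
walk, acc = [], 0.0
for v in white:
    acc += v
    walk.append(acc)
alpha_white, alpha_walk = dfa_alpha(white), dfa_alpha(walk)
```

Uncorrelated noise is expected to give an exponent near 0.5 and a random walk one near 1.5, so the two synthetic series should be clearly separated.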

Neural network classifier
To discriminate between the ECG of a normal person and that of a person who is prone to sudden cardiac death, a multilayer perceptron (MLP) neural network and a K-Nearest Neighbor (KNN) classifier have been used. Features extracted from the HRV of each one-minute interval (i.e., the first, second, third and fourth one-minute intervals before SCD) were compared to those of one-minute normal HRV signals.

Multilayer perceptron
The MLP network was formed in three layers and trained using the error back-propagation algorithm with a variable learning rate [46][47][48][49][50][51][52]. We tried to optimize the neural network architecture by changing the number of hidden layer neurons. The best selection was a three-layer neural network consisting of an input layer, a hidden layer and an output layer. The input layer has a number of nodes equal to the input vector length (7 nodes). The output layer consists of one node, since only 2 classes are to be distinguished. The number of nodes in the hidden layer is 5. Both the input and output nodes use linear transfer functions, and the hidden layer uses a sigmoid function.
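The forward pass of the 7-5-1 architecture described above can be sketched as follows. Only the layer sizes and transfer functions come from the text; the weights here are random placeholders rather than trained values, training by back-propagation is not shown, and the function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(features, w_hidden, b_hidden, w_out, b_out):
    """7 linear inputs -> 5 sigmoid hidden units -> 1 linear output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Placeholder weights (assumption): small random values, not a trained model.
random.seed(1)
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(7)] for _ in range(5)]
b_hidden = [0.0] * 5
w_out = [random.uniform(-0.5, 0.5) for _ in range(5)]
score = mlp_forward([0.1] * 7, w_hidden, b_hidden, w_out, b_out=0.0)
```

With a single linear output, a class label would be obtained by thresholding `score`; the threshold itself is a design choice that the text does not specify.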

k-Nearest neighbor
The k-nearest neighbor (k-NN) algorithm is one of the most effective non-parametric methods in pattern recognition [53]. It classifies objects based on their distance to the training examples in the feature space, is among the simplest of all machine learning algorithms, and is independent of the statistical distribution of the training examples. Several distance measures can be used, but the Euclidean distance is commonly preferred. An object is classified by the majority vote of its neighbors and is assigned to the class most common among its k nearest neighbors. The number k is usually chosen small; if k = 1, the object is simply assigned to the class of its nearest neighbor. The selected feature set is used to determine the best value of k for the classifier: different numbers of nearest neighbors (k = 1, 3, 5, 7, 9, 11, 13) are tested to obtain the best performance [54][55][56][57]. The performance of every classifier is measured by its accuracy. The maximum performance is provided by a 7-nearest neighbor classifier.
Network training continued until the mean square error fell below 0.01 or the number of training iterations reached 1000. Due to the limited input data set, leave-one-out cross-validation was used [45]. At each stage, one observation was selected as test data and the other 69 as training data, and this process was repeated 70 times. In other words, each experiment uses 69 examples for training and the remaining example for testing. The network error was computed at each step, and finally the average was calculated. One advantage of this approach is that every observation takes part in both training and testing, although never in the same run, so the network can show its full capability. The same process was carried out for the KNN classifier.
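The k-NN rule and the leave-one-out procedure described above can be sketched together. The toy data and the function names are assumptions made for illustration; the example uses the Euclidean distance and majority voting as stated in the text, and evaluates the same candidate values of k.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Label a query by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k):
    """Hold out each observation once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        correct += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return correct / len(xs)


# Toy data (assumption): two well-separated clusters of 10 two-dimensional points.
xs = [(0.1 * i, 0.0) for i in range(10)] + [(10.0 + 0.1 * i, 0.0) for i in range(10)]
ys = ["normal"] * 10 + ["scd"] * 10
accuracy_by_k = {k: leave_one_out_accuracy(xs, ys, k) for k in (1, 3, 5, 7, 9, 11, 13)}
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
```

With 70 observations, as in the paper, the same loop would run 70 times with 69 training examples each.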
In this stage, to evaluate the separability of the features, the extracted features are first compared with each other in both individual (linear, nonlinear and time-frequency) and optimal combinational modes. The separability of the linear, nonlinear and time-frequency features, and of the combinational mode, is calculated three minutes (180 s) and two minutes (120 s) before SCD and is shown in Table 1.
Wang et al. [10] used the 2 minutes just before SCD from the same dataset to predict SCD. Table 2 shows the results of our method and Wang's method. As can be seen, the predictive accuracy has been improved from 67.44% to 91.23%.
As can be seen in Table 1, the combinational features are better able to classify people (i.e., normal and SCD). That is why combinational features have been used in this study as the input feature vector to predict SCD. HRV signals before SCD were partitioned into one-minute intervals, and the separability of each one-minute interval (i.e., the first, second, third and fourth one-minute intervals before SCD) in the prediction of SCD was evaluated by computing the accuracy. The obtained results show that the combinational feature vector can predict SCD with accuracies of 99.73%, 96.52%, 90.37% and 83.96% for the first, second, third and fourth one-minute intervals, respectively. The results also indicate that the two-minute interval before SCD contains more SCD-related information that can be used for prediction. In other words, the first one-minute interval before SCD contains more valuable information for the prediction of SCD than the other intervals (i.e., the second, third and fourth intervals), which is to be expected from the medical perspective. The ability of the combinational feature vector to predict SCD was also evaluated with the KNN classifier, giving accuracies of 81.49%, 88.93%, 95.04%, and 98.32%. Table 3 shows the separation percentages for the 4 minutes before the onset.
As can be seen, although there is no significant difference between the ECG of a normal person and that of a patient prone to SCD, the proposed combinational feature vector allows symptoms of SCD to be observed even 4 minutes before SCD. In other words, even though cardiology and electrocardiography experts cannot distinguish between a normal ECG and that of a patient who is prone to SCD, the proposed extracted features can be used to predict SCD. Notably, the intervals closer to SCD have more predictive capability.

Result
Experimental results show that there is significant information in the HRV signal which can be extracted through the proposed method and used for the prediction of SCD, although there is no visible difference between a normal ECG and the ECG of a person prone to SCD. This study has proposed a new combinational feature vector which contains more valuable information for the prediction of SCD than previous works. The results of this research illustrate that the electrocardiogram signal of an SCD patient contains features that differ markedly from those of a healthy person. These differences could not be detected by classic methods; in contrast, the time-frequency (TF) methods are shown to be effective in serving the purpose.
Average classification rate, two minutes (120 s before SCD)
Table 3. Average of separating percent between healthy person and patients prone to SCD, 4 minutes before incident, by means of composition vector motion method
Simply put, we have shown that the 2-minute interval before SCD can be used to distinguish between a person who is prone to SCD and a normal person. The third one-minute interval before SCD also carries information indicating a high risk of SCD that can be estimated through the proposed method. Moreover, as SCD approaches, the risk of SCD increases, which is to be expected from the medical perspective. In the fourth minute before the onset of SCD, the risk can still be detected, although less clearly than in the intervals closer to SCD. Generally, healthy and unhealthy persons can be classified by detecting heart attack and tachyarrhythmia before SCD, because patients who show signs of a previous heart attack or who have severe tachyarrhythmia are susceptible to SCD and may finally succumb to it. In this study, we have introduced a new approach which uses a combinational feature vector to predict SCD. Notably, when approaching the onset of SCD from the fourth interval, the percentage of correct detection of SCD rises markedly and is highest for the interval closest to SCD. Moreover, the MLP classifier is shown to perform better in the detection of SCD than the KNN classifier. Finally, our findings can warn doctors of an imminent SCD 4 minutes before the event, helping them provide timely treatments that save the patient's life.