A supervised machine learning algorithm (SKVMs) used for both classification and screening of glaucoma disease

Glaucoma is the second leading cause of vision loss in the world. We propose a novel, automated, appearance-based glaucoma classification system that relies on segmentation-based measurements. It applies a standard pattern recognition process with a two-stage classification step. To automatically extract the optic disc (OD), two methods are proposed in this paper, one using edge detection and the other the Chan-Vese active contour model. For the optic cup (OC), or excavation, inspection by histogram is used for automatic detection. Our SKVMs technique achieves a 93% success rate on a data set containing a mixture of 75 real images of healthy and glaucomatous eyes. The same set of 75 retinal images, obtained from healthy and glaucomatous eyes, is used to compare the automatically determined CDR to the clinical CDR; our proposed method provides 98% accuracy in the determined CDR results, enabling the early screening of glaucoma by the SKVMs approach that is the aim of this paper.

Abbreviations: ANN: Artificial neural network; ANOVA: Analysis of variance; CHT: Circular Hough transform; CDR: Cup-to-Disc ratio; COG: Centre of gravity; Df: Degrees of freedom; Dice: index measuring the similarity between two images d1 and d2 based on the number of regions common to d1 and d2; EOD: Examination of the optic disc; EDA: Estimation of distribution algorithms; FP: False positive; FN: False negative; FPR: False positive rate; RBF: Radial basis function; FOV: Field of view; POAG: Primary open-angle glaucoma; GDx: instrument that uses a laser to measure the thickness of the nerve fibre layer; HRT: Heidelberg retinal tomography; HSD: "Honestly Significantly Different" test; ICE: Iridocorneal endothelial syndrome; ISNT: Inferior-Superior-Nasal-Temporal; Jaccard: the Jaccard index and the Jaccard distance are two metrics used in statistics to compare the similarity and diversity between samples. They are named after the Swiss botanist Paul Jaccard. The Jaccard index or Jaccard coefficient [Jaccard, 1901] is the ratio between the cardinality (size) of the intersection of the sets considered and the cardinality of their union, and makes it possible to evaluate the similarity between the sets; KS: Kolmogorov-Smirnov test; MSE: Mean square error; NTG: Normal-tension glaucoma; NIR: Near-infrared light; NPBS: Non-polarizing beam splitter; OC: Optic cup; OD: Optic disc; OCT: Optical coherence tomography; ONH: Optic nerve head; IOP: Intraocular pressure; PBS: Polarizing beam splitter; PSG: Polarization state generator; PSD: Polarization state detector; PXF: Pseudoexfoliative glaucoma, also called exfoliative glaucoma; ROI: Region of interest; RGB: Red-Green-Blue channels; ROC: Receiver operating characteristic curve; RBS: Retinal birefringence scanning; SKVMs: Support kernel vector machines; SNR: Signal-to-noise ratio; Std: Standard deviation; SLO: Scanning laser ophthalmoscopy; SLP: Scanning laser polarimetry; TN: True negative

*Correspondence to: Rached Belgacem, Department of Biophysics, Laboratory of Research in Biophysics and Medical Technologies LRBTM, Higher Institute of Medical Technologies of Tunis ISTMT, University of Tunis El Manar, 9 Avenue Dr. Zouheïr Safi _1006, Tunis, Tunisia, E-mail: rachedbel@gmail.com


Introduction
In this research paper, the aim was to identify, first and foremost, the theoretical generalities concerning the pathology of glaucoma, as well as the techniques of digital image acquisition of the retina where the cup-disc excavation is located at the papilla, the beginning of the optic nerve head (GDx, SLO, OCT, i.e. scanning of the retina by a laser signal); the parameters for detecting the pathology; and the use of digital techniques to delimit and segment the area affected by the excavation of the disc, which can lead to total blindness (digital techniques such as CHT are cited later). I then worked on code for the circular Hough transform (CHT), written in powerful languages (Matlab and C#), to automatically detect the coaxial contours of the optic nerve head and determine the severity of the excavation, and hence of glaucoma, through horizontal and vertical cup-to-disc ratios automatically determined by code written in Matlab. Disc segmentation methods were presented first, followed by two optic cup segmentation methods; finally, combined optic disc and optic cup segmentation methods were covered. The main objective was to present some of the current methodologies of detection and segmentation and to give the professional an overview of the existing research. Current trends and challenges, as well as future directions for the segmentation of the optic disc and optic cup, were also discussed.

STARE Dataset:
The Structured Analysis of the Retina (STARE) dataset is funded by the National Institutes of Health in the United States. The project contains 400 fundus images, each with a diagnosis. The blood vessels are annotated in 40 images, and the ONH is located in 80 images. A TopCon TRV-50 fundus camera with a 35° field of view was used to capture the images [1].
The optic disc (OD) consists of 1.2 million ganglion cell axons crossing the retina and exiting the eye through the scleral canal to transmit visual information to the brain. Examination of the optic disc can clarify the relationship between the excavation of the optic nerve and visual field loss in glaucoma [2]. The optic disc is divided into three different zones: the neuro-retinal rim, the cup (central zone) and sometimes the parapapillary atrophy [3]. The cup-to-disc ratio (CDR) is the ratio between the vertical diameter of the cup and the vertical diameter of the disc [4].
Various techniques have been used to extract the optic disc (OD), the optic cup (OC), or both together. In this paper, we critically examine OD and OC segmentation methodologies that automatically detect OD and OC boundaries. These techniques help professionals diagnose and monitor glaucoma by providing clear and accurate information about the structure of the ONH. The originality of this paper is to demonstrate each segmentation methodology by creating a flowchart for each technique. We present the algorithms applied to OD and OC segmentation, discuss the advantages and disadvantages of each method, and provide suggestions for future research.
Ophthalmologists generally acquire different imaging modalities to diagnose ocular pathologies. They include, for example, fundus photography, optical coherence tomography (OCT), computed tomography (CT) and magnetic resonance imaging (MRI). However, these images are often complementary and express the same pathologies in a different way. Certain pathologies are visible only in a particular modality. Thus, it is beneficial for the ophthalmologist to merge these modalities into a single patient-specific model.
The aim of the presented paper is a fusion of numerical and statistical approaches that can be applied to all retinal fundus images from different digital image acquisition modalities. This adds information to the retinal image acquired from fundus photography that was not visible before, such as the vessels and the macula. The contributions of this work include the automatic detection of the optic disc, the optic cup, the fovea and the optical axis, and an automatic segmentation of the disc area and the cup area [5].
A digital image representing a scene of the real world (a natural image) is cut into a matrix of elementary square cells (i.e., indecomposable ones), each characterized by a single colour and called a pixel. The processing of these pixels (and more specifically of the luminance or colour associated with each of them) defines what is called computer vision (Figure 1).
The cost of fundus photography continues to be significantly lower than that of more recent retinal scanning techniques; its main advantages are discussed below. I also added a script for automatic contour detection based on level-set theory and snakes, likewise written in Matlab, which in turn gave excellent results for automatically calculating the cup-to-disc ratio, comparing it to the threshold value cup/disc = 0.5, and systematically screening whether or not a patient has glaucoma.
The study of the development and extension of the cup area (Area_cup) makes it possible to judge the severity of the glaucoma pathology and can also be used as a means of early detection of the disease.
Finally, an SKVMs classification technique was used; it was compared against other classification methods such as ANN and showed its robustness in distinguishing different classes of glaucoma.
As a result, we thought of introducing a hierarchical method based on:
• A circular/curvilinear segmentation which operates at the pixel level to form regions that are more or less homogeneous in terms of grey levels.
• A merging of these regions, by partitioning digital region-extraction techniques, to form other regions that are more significant, larger and consistent in terms of texture.
If none of the merged regions converges towards the different objects that can be discerned in the image, the second phase is reiterated several times, with a particular adjustment of some regularization parameters, until a stopping criterion is reached. The approach was validated on a basis of ophthalmologist-expert images, on which it was quantified and compared to other existing algorithms.
Problem Statement: Glaucoma represents a significant health problem and is an important cause of blindness worldwide. Examination of the optic nerve head through cup-to-disc ratio is very important for the diagnosis of glaucoma and for monitoring the patient after diagnosis. The images of the optic disc (OD) and the optic cup (OC) are acquired by a fundus camera as well as by optical coherence tomography. Optic disc and optic cup segmentation techniques are used to separate relevant parts of the retinal image and to calculate the cup-to-disc ratio C / D and other features. The main objective of this paper is to review the methodologies and segmentation techniques for the disc and optic cup limits that are used to automatically calculate the geometric parameters of disk and cup with high accuracy to help glaucoma professionals to diagnose and detect pathology using images of the retinal fundus. We provide a brief description of each technique, highlighting its classification and performance measures. Current and future research directions are summarized and discussed.
Determining the cup-to-disc ratio is a very expensive and tedious task currently performed only by professionals. As a result, automated image detection and glaucoma assessment will be very useful. There are two different approaches to automatic image detection of the optic nerve head. The first approach is based on the very difficult process of extracting image characteristics for the binary classification of normal and abnormal conditions. The second, more frequent, approach is based on clinical indicators such as the cup-to-disc ratio as well as the inferior, superior, nasal and temporal areas (ISNT) in the area of the optic disc.
The main contribution of this paper is the introduction of a study of the current methods of optic disc and optic cup segmentation for the calculation of the CDR and the excavation area, used as parameters for automatic and early diagnosis of glaucoma before it reaches irreversible stages resulting in total blindness and loss of vision. The advantages of fundus photographs are easy interpretation and colour (which helps to distinguish size and pallor), together with better detection of disc haemorrhages, peripapillary atrophy, etc. The disadvantages are the lack of quantitative description and therefore inter-observer variability, and the highest photographic quality is not always easily achievable. Another disadvantage of fundus photography is the need for a high light intensity for retinal illumination, in the order of 10 to 100% of the maximum allowable levels [6], typically delivered by a flash.

Risk factors
Elevated IOP, in particular > 26 mmHg.
Positive family history: incidence increases 2- to 4-fold for those with an affected sibling.
Ethnicity: Some ethnic groups have an increased incidence of glaucoma. People of Asian and Inuit (eskimo) origin have an increased incidence of angle-closure glaucoma (20 to 40 times in Inuit), but a low incidence of open-angle glaucoma. People of African descent are three times more likely to develop open-angle glaucoma [7].
Gender: Women are three times more likely than men to develop angle-closure glaucoma because of their shallow anterior chambers.
Prolonged use of steroids.
Conditions that severely restrict blood flow to the eye, for example diabetic retinopathy or occlusion of the central retinal vein.

Systemic hypertension.
The emergence of open angle glaucoma is insidious and patients often do not know it. They can have a serious illness despite good visual acuity. Those who have a more advanced disease may be aware of a shadow in their vision or a decrease in visual acuity. However, a normal visual field in one eye may mask the presence of a defect in the affected eye until the disease is advanced enough.
The diagnosis of this silent disease is critical: if it is missed, the window of opportunity to stop progression may be lost. If the diagnosis is wrong, inappropriate medications can last a lifetime. In some cases, the diagnosis is obvious, especially with secondary glaucoma.
Patients with suspected glaucoma need a thorough eye exam to rule out co-pathology or other possible diagnoses. The assessments are the same for patients with glaucoma and those with, or suspected to have, ocular hypertension.

Segmentation by multi-thresholding
Thresholding makes it possible to separate an image into contrasting components by transforming it into a binary image. This means that the image is separated into white and black pixels depending on whether their intensity value is above or below a certain threshold. The thresholding process can be particularly useful for removing unnecessary detail or variations and highlighting the details of interest. A global threshold value can be chosen automatically, or on the basis of clear peaks in the image histogram that allow effective separation. More complex intensity criteria can be used to decide whether pixel values become white or black. For some images, adaptive or local thresholding is useful: different thresholds are applied to different sections of the image, for example when the image has different levels of background lighting.
Keeping in mind human visual perception, extreme pixel values do not need to be finely quantified. By appropriate coarse graining, these can be progressively eliminated from the rest of the pixel values, which must be finely segmented. A recursive implementation produces nonuniform segmentation that naturally allows finer quantization around the mean. This procedure zooms in on the mean in a manner similar to approaching a variety of distributions to the Dirac delta function (Figures 2 and 3).
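The global thresholding step described above can be sketched in a few lines. This is a minimal, illustrative Python example (the paper's own code was written in Matlab); the synthetic image and its intensity values are assumptions for the demo, and the threshold is picked automatically from the histogram by maximizing between-class variance, in the spirit of Otsu's method:

```python
import numpy as np

def otsu_threshold(img):
    """Pick the global threshold that maximises between-class variance
    (Otsu's method), computed from the 256-bin grey-level histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    total = img.size
    mu_total = np.dot(np.arange(256), hist) / total
    best_t, best_var = 0, -1.0
    cum_w, cum_mu = 0.0, 0.0
    for t in range(256):
        cum_w += hist[t] / total          # weight of the "dark" class
        cum_mu += t * hist[t] / total     # partial first moment
        if cum_w == 0.0 or cum_w == 1.0:
            continue                      # one class empty: skip
        mu0 = cum_mu / cum_w
        mu1 = (mu_total - cum_mu) / (1.0 - cum_w)
        var = cum_w * (1.0 - cum_w) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Synthetic two-mode image: dark background (~50) and a bright "disc" (~200)
img = np.full((64, 64), 50, dtype=np.uint8)
img[20:44, 20:44] = 200
t = otsu_threshold(img)
binary = img > t                          # white where intensity exceeds threshold
```

On this two-mode image the threshold lands between the background and disc intensities, and the binary mask isolates the bright square.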

Extraction of salient features of glaucoma
This study addresses automated optic disc (OD) and optic cup (OC) detection, which plays an important role in developing a computer-aided system for eye glaucoma diseases. In this paper, we propose an algorithm for OD and OC detection based on structured learning. A classifier model is trained by structured learning and then used to obtain the edge contours of the OD and OC. A level-set method is performed on the edge contour, so that a binary image of the OD and OC is obtained. First, the circular Hough transform is carried out to approximate the boundary of the OD by a circle. Finally, active contours without gradient are applied to the approximate boundary to accurately compute the edge of the papilla [8,9].
The proposed algorithm has been evaluated on two public datasets, one for children's eyes and the other for adults' eyes, and obtained promising results. The results (a mean accuracy of 0.98, and true positive and false positive fractions of 0.97 and 0.01) show that the proposed method is very competitive with state-of-the-art methods and is a consistent tool for the segmentation of OD and OC, allowing the automatic calculation of the cup-to-disc ratio (CDR) and the extraction of other features to distinguish eye glaucoma diseases [10,11].

Proposed methodology to classify glaucomatous of nonglaucomatous subjects: SKVMs
The proposed methodology implemented here is based on applying feature selection to the edge detection dataset results (output) and then classifying normal eyes and abnormal eyes (glaucomatous subjects) based on an enhanced decision rule (CDRv ≤ 0.5 or ≥ 0.5).
1. Load an input edge cup and disc detection dataset.
2. Apply the CHT approach and the Chan-Vese active contour approach; feature selection and optimization are done using SKVMs to select the most dependent attributes from the dataset. A larger hole (excavation) corresponds to a greater loss of nerve fibres.
Early detection and subsequent treatment of glaucoma are hence important, as the damage done by glaucoma is irreversible. Large-scale manual screening for glaucoma is a challenging task, as skilled manpower in ophthalmology is scarce. Hence, much work has been done towards automated glaucoma detection systems based on colour fundus images (CFI). In this paper, we propose a novel method of automated glaucoma detection from CFI using the SKVMs approach. Structural features such as the cup-to-disc ratio (CDR), cup area (CA) and disc area (DA) of the optic nerve head (ONH) are extracted from CFI using the circular Hough transform (CHT), the level-set method [6], inspection by histogram and morphological processing, in order to segment the optic disc (OD) and optic cup (OC) required for calculating the CDR value. The results obtained by the proposed methodology are very promising, yielding an overall efficiency of 99% and a classification rate of 93% obtained by the SKVMs method to distinguish healthy from glaucomatous eyes and to assist ophthalmologists.

Materials and methods
To calculate the vertical cup to disc ratio (CDR) along the vertical axis and the horizontal axis, the optic cup and disc first have to be segmented from the retinal images. Figure 4 depicts the framework for building the proposed detection system.

The Cup-to-Disc ratio (CDR)
It is evaluated horizontally and/or vertically as the ratio of the widest diameter of the excavation to the largest diameter of the optic disc along the same axis. It is expressed in tenths (0/10 to 10/10), i.e. from 0.0 (no excavation) to 1.0 (total excavation). If a single value is to be kept, the vertical C/D seems more relevant: in glaucoma, the optic disc is first excavated more vertically than horizontally, and in the case of total excavation the vertical C/D reaches 10/10 before the horizontal C/D does, because of the nasal vascular persistence of the emerging vessel bundle [14] (Figure 5).
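The CDR computation itself is a simple ratio of diameters. As a minimal Python sketch (the paper's code was in Matlab), with hypothetical pixel diameters standing in for measurements from a segmented fundus image:

```python
# Hypothetical diameters (in pixels) measured from a segmented fundus image.
cup_d_v, disc_d_v = 45.0, 110.0    # vertical diameters of cup and disc
cup_d_h, disc_d_h = 40.0, 105.0    # horizontal diameters

cdr_v = cup_d_v / disc_d_v          # vertical cup-to-disc ratio
cdr_h = cup_d_h / disc_d_h          # horizontal cup-to-disc ratio

# Screening rule used in this paper: a CDR above 0.5 is flagged as suspect.
suspect = cdr_v > 0.5 or cdr_h > 0.5
```

Here both ratios are around 0.4, so this hypothetical eye would not be flagged.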

Region of interest (ROI) and centroid detection (Cx, Cy)
If the Euclidean distance between two centroids is less than a specified threshold ε, the clusters are combined into one cluster.
The new centroid (Cx, Cy) is computed as Cx = (1/n) Σᵢ xᵢ and Cy = (1/n) Σᵢ yᵢ, where (xᵢ, yᵢ) are the points of the cluster and n is the number of points in the cluster.
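The merging rule above can be sketched as follows. This is an illustrative Python example (not the paper's Matlab code); the point coordinates and the threshold ε = 5 are assumptions for the demo:

```python
import math

def mean_point(points):
    """Centroid of a list of (x, y) points: component-wise mean."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def merge_centroids(centroids, eps):
    """Greedily merge clusters whose centroids are closer than eps
    (Euclidean distance); a merged centroid is the mean of all members."""
    clusters = [[c] for c in centroids]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if math.dist(mean_point(clusters[i]),
                             mean_point(clusters[j])) < eps:
                    clusters[i].extend(clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return [mean_point(c) for c in clusters]

pts = [(10.0, 10.0), (11.0, 10.0), (50.0, 50.0)]
centres = merge_centroids(pts, eps=5.0)   # first two points collapse to one
```

The two nearby points merge into a single centroid at their mean, while the distant point survives unchanged.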

Optic Disc Segmentation
To detect the optic disc boundary, image pre-processing is introduced. To remove the blood vessels, a morphological closing operation is performed.
After the closing operation, a median filter is applied to further smooth the obtained image. The outputs of the image pre-processing are shown in Figures 6 and 7.
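The pre-processing pipeline (grey-level closing to suppress vessels, then a median filter) can be sketched with plain NumPy. This is an illustrative Python version, not the paper's Matlab implementation; the 3×3 structuring element, the synthetic image and its intensities are assumptions for the demo:

```python
import numpy as np

def neighborhood_stack(img, k=3):
    """Stack the k*k shifted copies of img (edge-padded) along a new axis,
    so that per-pixel neighbourhood statistics become axis-0 reductions."""
    r = k // 2
    padded = np.pad(img, r, mode='edge')
    views = [padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
             for dy in range(k) for dx in range(k)]
    return np.stack(views)

def closing(img, k=3):
    """Grey-level morphological closing: dilation (max) then erosion (min)."""
    dilated = neighborhood_stack(img, k).max(axis=0)
    return neighborhood_stack(dilated, k).min(axis=0)

def median_filter(img, k=3):
    """Median over each k*k neighbourhood."""
    return np.median(neighborhood_stack(img, k), axis=0)

# Bright fundus-like background with one thin dark "vessel" column;
# closing fills the vessel in, and the median filter smooths the result.
img = np.full((16, 16), 180, dtype=np.uint8)
img[:, 8] = 40                      # dark vessel, 1 pixel wide
smoothed = median_filter(closing(img), k=3)
```

Because the vessel is narrower than the structuring element, the closing removes it entirely and the output is a uniform bright field.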
After the image pre-processing is performed, two techniques, combined and assembled for extracting the disc boundary, are introduced: the circular Hough transform (CHT) and active contours without edges (gradient), the Chan & Vese approach.

Circular Hough transform CHT approach
The detection efficiency is enhanced by discretizing the image and reducing the resolution during the circle-centre detection process, exploiting the fact that the centre of the circle lies on the gradient line through each edge point of the circle; meanwhile, the radius detection accuracy is improved by merging similar radii within the range of the detection process.
The circular Hough transform (CHT) is a feature extraction technique for detecting circles; it is a specialization of the Hough transform. The purpose of the technique is to find circles in imperfect image inputs. Circle candidates are produced by "voting" in the Hough parameter space and then selecting the local maxima in a so-called accumulator matrix [15]. The Hough transform can be used to determine the parameters of a circle when a number of points that fall on its perimeter are known.

Pseudo code for feature selection process using CHT
A circle with radius R and centre (a, b) can be described with the parametric equations x = a + R·cos(θ) and y = b + R·sin(θ), for θ ∈ [0, 2π).
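The CHT voting scheme can be sketched directly from these parametric equations. This is a minimal Python illustration (the paper's implementation was in Matlab): for a fixed, known radius R, each edge point (x, y) votes for every candidate centre (a, b) = (x − R·cos θ, y − R·sin θ), and the accumulator maximum recovers the centre. The synthetic edge map and its dimensions are assumptions for the demo:

```python
import numpy as np

# Synthetic edge map: points sampled on a circle of known radius R
R = 10
a_true, b_true = 30, 25
thetas = np.linspace(0, 2 * np.pi, 90, endpoint=False)
xs = np.round(a_true + R * np.cos(thetas)).astype(int)
ys = np.round(b_true + R * np.sin(thetas)).astype(int)

# Accumulator over candidate centres (a, b) for this fixed radius:
# each edge point votes for all centres lying at distance R from it.
acc = np.zeros((64, 64), dtype=int)
for x, y in zip(xs, ys):
    for t in thetas:
        a = int(round(x - R * np.cos(t)))
        b = int(round(y - R * np.sin(t)))
        if 0 <= a < 64 and 0 <= b < 64:
            acc[a, b] += 1

# The accumulator peak is the estimated circle centre.
a_hat, b_hat = np.unravel_index(acc.argmax(), acc.shape)
```

All the vote circles intersect near the true centre, so the peak falls within a pixel of (30, 25); in practice the radius is also unknown and a third accumulator dimension (or a radius sweep) is added.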

Active contours approach without gradient Chan & Vese model
The Active contour method without gradient algorithm has been widely used as a global approach for the optimization of active contours for the segmentation of objects of interest from the background. In this study, this method is employed by initializing a curve centred at the  detected optic disc location. The curve is evolved based on the average intensity value inside and outside the curve. The curve evolution always converges to the optic disc edge irrespective of the shape or size of the initial contour.
The basic idea in active contour models or snakes is to evolve a curve, subject to constraints from a given image, in order to detect objects in that image. For instance, starting with a curve around the object to be detected, the curve moves toward its interior normal and has to stop on the boundary of the object.
Assume further that the object to be detected is represented by the region with the value u0^i, and denote its boundary by ζ. Then we have u0 ≈ u0^i inside the object (inside ζ) and u0 ≈ u0^o outside the object (outside ζ). Now let us consider the following "fitting energy", formed by two terms:

F1(C) + F2(C) = ∫_inside(C) |u0(x, y) − c1|² dx dy + ∫_outside(C) |u0(x, y) − c2|² dx dy,

where C is any variable curve and the constants c1, c2 are the averages of u0 inside and outside C, respectively. We say that the boundary of the object, ζ, is the minimizer of the fitting energy:

inf_C [F1(C) + F2(C)] ≈ 0 ≈ F1(ζ) + F2(ζ).

This can be seen easily. For instance, if the curve C is outside the object, then F1(C) > 0 and F2(C) ≈ 0. If the curve C is inside the object, then F1(C) ≈ 0 but F2(C) > 0. Finally, the fitting energy is minimized when C = ζ, i.e. when the curve C is on the boundary of the object.
Therefore, in our active contour model we minimize this fitting energy, to which some regularizing terms can be added, like the length of C and/or the area inside C. We introduce the energy F(c1, c2, C) by:

F(c1, c2, C) = μ·Length(C) + υ·Area(inside(C)) + λ1 ∫_inside(C) |u0(x, y) − c1|² dx dy + λ2 ∫_outside(C) |u0(x, y) − c2|² dx dy,

where c1 and c2 are unknown constants, and λ1, λ2 > 0, μ > 0, υ ≥ 0 are fixed parameters.
In almost all our computations, we take υ = 0 and λ1 = λ2. Of course, one of these parameters can be removed by fixing it to be 1.
The area term in the energy can be used for instance when we may need to force the curve to move only inside.
Finally, we consider the minimization problem: inf_{c1, c2, C} F(c1, c2, C).
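The behaviour of the fitting term can be checked numerically. The sketch below is an illustrative Python example of the piecewise-constant data term only (no length/area regularization, no curve evolution); the synthetic binary object and the two candidate "curves", encoded as inside-masks, are assumptions for the demo:

```python
import numpy as np

def fitting_energy(u0, inside):
    """Chan-Vese data term F1 + F2 = sum_inside |u0 - c1|^2
    + sum_outside |u0 - c2|^2, with c1, c2 the mean intensities
    inside and outside the (discretised) curve."""
    c1 = u0[inside].mean()
    c2 = u0[~inside].mean()
    return float(((u0[inside] - c1) ** 2).sum()
                 + ((u0[~inside] - c2) ** 2).sum())

# Object of intensity 1 on a background of intensity 0
u0 = np.zeros((32, 32))
u0[8:24, 8:24] = 1.0

yy, xx = np.mgrid[0:32, 0:32]
# Curve exactly on the object boundary -> energy zero
on_boundary = (yy >= 8) & (yy < 24) & (xx >= 8) & (xx < 24)
# Curve strictly inside the object -> outside region is mixed, F2 > 0
too_small = (yy >= 12) & (yy < 20) & (xx >= 12) & (xx < 20)

e_best = fitting_energy(u0, on_boundary)
e_small = fitting_energy(u0, too_small)
```

As the text argues, the energy vanishes only when the curve coincides with the object boundary; any misplaced curve leaves one of the two regions inhomogeneous and pays a positive cost.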

Optic Cup Segmentation
Compared to the extraction of the optic disc, optic cup segmentation is more difficult due to the interweaving of the cup with blood vessels and surrounding tissues. This study presents two simultaneous approaches for cup segmentation: inspection by histogram, and cup detection by applying the Chan & Vese active contour approach.
The histogram is a fundamental image analysis tool that describes the distribution of the pixel intensities in an image. We use the histogram to determine whether the overall intensity in the image is high enough for our inspection task, and whether the image contains distinct regions of certain greyscale values.
Lack of contrast: a wide-field retinal image lacks contrast between cup and disc, which is why our type of imaging application involves inspecting and counting parts of interest against a retinal background (Figure 10). This region is our area of interest and contains the optic cup (OC), with maximum intensity (red dotted contour), and the optic disc (OD), with a more moderate intensity (green dotted contour).
To separate the two regions and finally detect the cup, we use thresholding by histogram inspection: thresholding the ROI with reference to the calculated peak value, we obtain the segmented cup, as shown in the following figures (Figure 11).
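The histogram-inspection step can be sketched as follows. This is an illustrative Python example (not the paper's Matlab code); the synthetic ROI, its intensity levels and the 10-grey-level band below the peak are assumptions for the demo:

```python
import numpy as np

# Synthetic red-channel ROI: moderate disc (~150) containing a brighter
# cup (~230) on a darker retinal background (~80).
roi = np.full((60, 60), 80, dtype=np.uint8)
roi[15:45, 15:45] = 150             # optic disc
roi[25:35, 25:35] = 230             # optic cup (brightest region)

hist, edges = np.histogram(roi, bins=256, range=(0, 256))

# Take the brightest populated grey level as the cup peak, and keep the
# pixels within a small band below it as the cup mask.
peak = int(np.nonzero(hist)[0].max())
cup_mask = roi >= peak - 10
cup_area = int(cup_mask.sum())
```

Thresholding relative to the histogram peak isolates only the brightest region, giving the cup mask (here exactly the 10×10 synthetic cup) whose area feeds the CDR and excavation-area features.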
To evaluate the performance of our approach, we used more than 65 fundus images from glaucomatous and non-glaucomatous cases taken from the following database: http://cecas.clemson.edu/~ahoover/stare/. The images were acquired with a colour analogue fundus camera, approximately centred on the ONH, and stored in slide format. To obtain the images in digital format, they were digitized using an HP-PhotoSmart-S20 high-resolution scanner, in RGB format, at a resolution of 600×400 and 8 bits/pixel.
To assess classifier performance, it is necessary to quantify the sensitivity, specificity and accuracy.
In the glaucomatous classification problem, sensitivity measures the accuracy of the classifier in identifying glaucoma in the set of fundus images, and specificity measures the accuracy of the classifier in identifying healthy subjects in the set (Figure 12).
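These three measures follow directly from the confusion counts. As a minimal Python sketch, with hypothetical TP/FN/TN/FP counts (not the paper's actual confusion matrix) standing in for a screening run over 75 images:

```python
# Hypothetical counts from screening 75 fundus images (illustrative only).
TP, FN = 36, 2    # glaucomatous images: correctly / incorrectly classified
TN, FP = 35, 2    # healthy images: correctly / incorrectly classified

sensitivity = TP / (TP + FN)            # fraction of glaucoma cases caught
specificity = TN / (TN + FP)            # fraction of healthy eyes cleared
accuracy = (TP + TN) / (TP + TN + FP + FN)
```

With these made-up counts, sensitivity is about 0.95, specificity about 0.95 and accuracy about 0.95 as well.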
The average value of the specificity and the sensitivity using our approach to detect glaucoma is 99%. At this point, the set of 75 test images is processed using the approach outlined earlier in order to obtain the CDR value, CDR Automated, and the area of the optic cup (excavation), Area Automated.
Then applying different parameters for assessing the diagnostic of glaucoma, we obtain the compared results prepared in the following Tables 1-4, Figure 13.
The correlation between CDRv and CDRH is quite strong, while the pairwise correlations between the other features are almost total (the correlation coefficient r is close to 1), as shown in Table 2.

Evaluation study
Our objective is to show that our multimodal evaluation method is effective, not to validate any particular method used here; the specific choice of methods is therefore not critical.
The methods chosen to carry out the study are:
1. Detection of the papilla by the circular Hough transform (CHT).
2. Parametric classification using an active contour level-set method (Chan-Vese model), compared to a manual classification performed by an ophthalmologist (manual method).
3. Segmentation and merging (SF).
As we said, in this work we use images from the bottom of the retina. The volume used has been pre-processed to eliminate noise and uninteresting areas.
We show in Figure 16 the segmentation results for a retinal fundus image, using the red channel of the RGB image. The overlap between the manual OD and the automated OD is represented in grey and square dots. There is no overlap between the manual OC and the automated OC in the region of interest (Figures 16-18).
For the evaluation of the automatic segmentation of the disc and the excavation, 20 retinal images were both manually and automatically segmented. To evaluate the accuracy, the commonly used DICE similarity coefficient [16] was measured between manual and automatic segmentations computed with the different approaches. Similarly, Jaccard's index [17,18] was calculated. The DICE and Jaccard coefficients are respectively defined as

DM = 2|A ∩ B| / (|A| + |B|) and JC = |A ∩ B| / |A ∪ B|,

where A ∩ B denotes the intersection between A and B, A ∪ B denotes their union, and JC, DM ∈ [0, 1]. A higher value of JC or DM indicates a closer match between the manually delimited reference and the automatically segmented result. Table 1 reports their mean values for a 95% confidence interval (CI).
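Both overlap measures are straightforward to compute on binary masks. The Python sketch below uses two synthetic masks (a hypothetical manual segmentation and an automatic one shifted by one row) purely for illustration:

```python
import numpy as np

def dice(a, b):
    """DM = 2|A ∩ B| / (|A| + |B|) for boolean masks a, b."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def jaccard(a, b):
    """JC = |A ∩ B| / |A ∪ B| for boolean masks a, b."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

manual = np.zeros((20, 20), dtype=bool)
manual[5:15, 5:15] = True            # 100-pixel "manual" segmentation
auto = np.zeros((20, 20), dtype=bool)
auto[6:16, 5:15] = True              # "automatic" result, shifted by one row

dm = dice(manual, auto)
jc = jaccard(manual, auto)
```

The two are related by JC = DM / (2 − DM), so either one determines the other; here the one-row shift gives DM = 0.9 and JC ≈ 0.82.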
Overall, the DICE coefficient resulted in 0.975 ± 0.005. The deviations from manual segmentation were mainly in the part using the CHT approach. Segmentation takes less than 2 seconds on a laptop.

Distinguishing a glaucomatous eye from a healthy eye by applying support kernel vector machines (SKVMs), a supervised machine learning algorithm
The suspect stage is important because a patient will receive a warning and treatment before the excavation progresses and produces symptoms such as headaches due to abnormal pressure inside the eyeball. In the clinic, the intraocular pressure is tested first. After that, an image of the fundus is taken to observe certain abnormalities in the retina. This provides important information to extract, such as the shape and asymmetry of the optic disc (OD), the size and depth of the optic cup, the vertical cup-to-disc ratio CDRv, nerve fibre layer anomalies and peripapillary atrophy. If some anomalies appear, the loss of visual field is determined. This can appear in one or both eyes. These abnormalities can be caused by many factors, but glaucoma is one of the risk factors that damage the ONH and gradually lead to vision loss. In our hospitals, there is a shortage of ophthalmologists, technicians, health care workers and early treatment.
This system would help narrow the gap between these problems by providing an automatic screening system to diagnose the disease based on a supervised learning technique.
For the supervised learning technique, a characteristic of the target class must be extracted in order to generate a decision function or a model to classify each stage of the disease. In this work, segmentations OD and OC are considered. There are several techniques provided in previous work.
Correlation or similarity of characteristics should be evaluated to reduce redundant features. The dimension of the feature set can be reduced by techniques such as principal component analysis and linear regression. For the classification part, the classifiers normally used include K-means, fuzzy C-means clustering [19], the Bayesian technique, neural networks (NN) [20,21] and support vector machines (SVM) [22,23]. Of these, NN and SVM provide high performance and robustness for higher feature dimensions.
The ratio of OC to OD in the vertical direction (CDRv) is considered an important feature to check the abnormality of a retina using a fundus image.
In addition, the rim-to-disc ratio has also been proposed for special cases with large OC and OD but healthy rim tissue. Using only CDRv as a linear threshold to separate glaucoma from non-glaucoma is inadequate because there are overlapping values, which must be analysed in a higher-dimensional space. For example, taking CDRv = 0.65 as the cut-off between glaucoma and non-glaucoma yields 80% accuracy, i.e. 21 false negative (FN) and 3 false positive (FP) cases (Figure 19).
In this paper, we use the OC and OD diameters and the CDR in the vertical (CDRv) and horizontal (CDRH) directions, as well as other features such as the excavation area. There are two case studies. In Case Study A, a comparison between previous work [23] and the proposed techniques is discussed. The FN and FP counts must be reduced; therefore, the SVM is introduced into a classification process that diagnoses two classes: healthy and glaucoma.
In case study B, the suspect stage of glaucoma is added. This is the step between normal classes and glaucoma classes. Early detection can be detected based on this suspicious class.
The number of false detections is also an important parameter for analysing the performance of the classification. At the polynomial degree three, the SVM kernel function is selected to generate the decision module that can reduce the number of false detections.

The SVM classification technique
The classification technique is widely used for prediction based on known characteristics from the database. An SVM technique is selected as the classifier to find a decision function. It can generate an adaptive decision boundary based on the distribution of the selected information or features, such as the CDR, which is calculated as the ratio of the OC to the OD. Feature selection is required so that the kernel function can find the optimal hyperplane for separating the two classes.
First, the binary classification in Case Study A is described in detail. An SVM is widely used to classify entities in a large feature space.
It provides several types of kernel functions (decision boundaries), such as linear, polynomial and radial basis functions. These kernel functions have different characteristics, and kernel selection depends on the distribution of the input data. An SVM with a linear kernel function is selected for Case Study A. The SVM transforms the two-dimensional input features into a higher-dimensional feature space and maximizes the distances of the samples (support vectors) from the decision hyperplane. To calculate the maximum margin of the support vectors, the kernel function is represented by the Euclidean inner product [24,25].
The two selected kernel functions are K_L(x, y) = x^T y for the default linear kernel and K_P(x, y) = (x^T y + c)^d for the polynomial kernel, where c ≥ 0 and d are parameters that can be adjusted to find the most efficient kernel. The cost factor for both kernels is fixed at 1. First, 9 features are extracted from 75 training samples. They are separated into two classes, labelled 0 for the target class and 1 for the others. An SVM is then used to generate a classifier. K-fold cross-validation is selected to test accuracy. K is set to 10, so 10% of the dataset is randomly selected as the test set and the remaining 90% is assigned as training data. After the first ten percent has been tested, the next test fold is chosen to form a new, non-overlapping set. This procedure is repeated ten times until the last fold has been tested. The overall errors are then accumulated and described in a confusion matrix.
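The 10-fold rotation described above can be sketched in Python. This is a minimal illustration only: `train_fn` and `predict_fn` stand in for the SVM training and decision steps, and all names here are hypothetical rather than from the original implementation.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k non-overlapping folds (roughly 10% each)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(samples, labels, train_fn, predict_fn, k=10):
    """Accumulate a 2x2 confusion matrix over k train/test rotations."""
    confusion = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for fold in kfold_indices(len(samples), k):
        test = set(fold)
        train_X = [samples[i] for i in range(len(samples)) if i not in test]
        train_y = [labels[i] for i in range(len(samples)) if i not in test]
        model = train_fn(train_X, train_y)       # e.g. fit an SVM here
        for i in fold:
            pred = predict_fn(model, samples[i])
            if labels[i] == 1:
                confusion["TP" if pred == 1 else "FN"] += 1
            else:
                confusion["TN" if pred == 0 else "FP"] += 1
    return confusion
```

Because the folds partition the index set, every sample is tested exactly once across the ten rotations, which is what lets the per-fold errors be accumulated into a single confusion matrix.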
Secondly, the multi-class classification in Case Study B is described: a one-against-rest SVM and an unbalanced decision tree with SVM are introduced in order to overcome a limitation of the traditional SVM, which is effective only for binary classes.
The multi-class SVM is used to distinguish three data classes: 20 normal samples, 15 suspect samples and 40 glaucoma samples. The following paragraphs describe the classification model for each technique. The class with the highest score is chosen, setting 0 for a correct prediction and 1 for an incorrect prediction.
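A minimal sketch of the one-against-rest decision rule: one binary scorer per class, and the class with the highest score wins. The CDRv-based scorers below are purely illustrative assumptions, not the trained SVM decision functions of the paper.

```python
def one_vs_rest_predict(scorers, x):
    """Pick the class whose binary scorer gives the highest score.

    `scorers` maps a class label ("normal", "suspect", "glaucoma") to a
    function scoring how strongly x belongs to that class versus the rest.
    """
    return max(scorers, key=lambda label: scorers[label](x))

# Hypothetical decision scores based on CDRv alone (illustrative only):
scorers = {
    "normal":   lambda cdr: -(cdr - 0.35) ** 2,   # peaks near CDRv = 0.35
    "suspect":  lambda cdr: -(cdr - 0.60) ** 2,   # peaks near CDRv = 0.60
    "glaucoma": lambda cdr: -(cdr - 0.85) ** 2,   # peaks near CDRv = 0.85
}
```

In the real system each scorer would be a binary SVM trained on one class against the other two; only the arg-max selection step is shown literally here.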
The following vector shows a set of input characteristics used in the previous case study.

Why SKVMs?
SVM is an automatic classification method [26,27] that directly minimizes the classification error without requiring a statistical data model. This method is popular because of its simple implementation and its consistently high classification accuracy in many real classification settings. The SVM algorithm can be applied to both classification and regression (model fitting) problems. In classification, an SVM can separate data (for example, CDR calculation results from healthy and glaucomatous eyes) that are not easily separable in the original data space (i.e., two-dimensional x, y) by mapping the data into a higher-dimensional space. The SVM uses a kernel function to find a hyperplane that maximizes the distance (margin) between the two classes (e.g., healthy versus glaucomatous eyes) while minimizing the training error [28]. The resulting model is sparse, relying only on a few training samples (the "support vectors"). The number of support vectors increases linearly with the available training data [29], requiring much higher computational complexity when classifying very large datasets (for example, tens or hundreds of thousands of variables).
The SVM was implemented using Platt's sequential minimal optimization algorithm in commercial software (MATLAB, version 5.0, The MathWorks, Natick, MA). For the classification of the CDR data, Gaussian (non-linear) kernels of different widths were tested, and a Gaussian kernel of width = √(2 × number of input variables) was chosen because it gave the largest area under the ROC curve. Ten-fold cross-validation was used. The penalty for margin errors, C, was set to 1.0.
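One plausible reading of the kernel-width rule above, sketched in NumPy: the width σ = √(2 × number of input variables) enters the usual Gaussian kernel. The exact placement of σ in the formula is an assumption on our part, not a transcription of the original implementation.

```python
import numpy as np

def gaussian_kernel(x, y, n_features):
    """Gaussian (RBF) kernel with width sigma = sqrt(2 * n_features),
    the width heuristic described in the text."""
    sigma = np.sqrt(2.0 * n_features)
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
```

A wider σ (more input variables) flattens the Gaussian, so each support vector influences a larger region of feature space.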

Application of SKVMs
This linear classifier determines a maximum soft-margin hyperplane that best separates the considered classes. The data is normalized and transformed via the non-linear radial basis kernel.
We use the ν-SVM with the penalty parameter ν = 0.5 and the cost parameter c = 1 [38].
Classifiers: The ability of each image-based feature extraction method to distinguish normal eyes from those with glaucoma is quantified by the results of three classifiers. Classifiers perform well if their underlying separation model matches the distribution of the sample data. As the distribution of the underlying data is unknown, we have tested different classifiers and we use in this article the support vector machine as a linear classifier [39,40].
The distribution of attribute data may not optimally match the data model of the classifiers. We analyse the effect of two known methods to improve the result of the classification [41,42].
The two-step classification applies the glaucoma class probability score, obtained from each of the four classifiers, as a new feature vector entering another classifier [43,44].

Pseudo code for feature selection process using levelset based SKVMs
The 2-stage classification applies the probability score of belonging to the glaucoma class, obtained from each of the four classifiers, as a new feature vector input to another classifier [50,51].

Methodology using SKVMs
The features are first encoded into a bit string S = CDRv1, CDRv2, …, CDRvn, n = 1, 2, …, m, where the bit {1} represents a feature selected from the dataset and the bit {0} a non-selected feature. The evaluation parameters can be computed using SKVMs. Suppose the available feature set in the dataset has 65 members; the set {CDRv1, CDRv2, CDRv3, …, CDRv65} is then analysed using the SKVMs algorithm, and for a selection of features, say all 65, a 65-dimensional evaluation is computed using SKVMs. Each feature is renewed using the adaptive computation of the SKVM, on the basis of which pbest is chosen. For the final feature selection, each vector is then updated according to the operation [52,53].
The renewed features are then calculated using Eq. 12; on the basis of this renewal calculation of S, and depending on the previous value of S, a feature is selected as {1}, while {0} means the feature is not selected.
The randomly selected feature is assumed to be the best attribute of the dataset, its qualification value is taken as the best, and feature selection starts from this feature of the dataset.
The feature to be selected moves along the X and Y axes towards the next best feature in the dataset, depending on the qualification value. Hence, the input parameters of the SKVMs are initialized.
Feature selection starts with the basic inputs to the SKVMs: the training values and the class index.
On the basis of the training parameters (trnX, trnY, tstX, ker), the feature values Y can be predicted.
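The bit-string encoding described above can be sketched as follows. The helper names are hypothetical; only the {1}/{0} selection convention comes from the text.

```python
def encode_selection(selected, n_features):
    """Encode selected feature indices as a bit string S of length n_features:
    '1' marks a selected feature, '0' a non-selected one."""
    chosen = set(selected)
    return "".join("1" if i in chosen else "0" for i in range(n_features))

def decode_selection(bitstring, features):
    """Return only the features whose bit is '1'."""
    return [f for f, b in zip(features, bitstring) if b == "1"]
```

For the 65-feature example in the text, `encode_selection(range(65), 65)` would yield the all-ones string, i.e. every CDRv feature retained for evaluation.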

Binary Classification
Given training data (x_i, y_i) for i = 1, …, N, with x_i ∈ R^d and y_i ∈ {−1, 1}, learn a classifier f(x) such that y_i f(x_i) > 0 for a correct classification.

Linear separability
A linear classifier has the form f(x) = w^T x + b. For example, x1 = CDRv and x2 = CDRh.
• In 2D the discriminant is a line.
• w, known as the weight vector, is the normal to the line, and b is the bias.
For a K-NN classifier it was necessary to "carry" the training data, whereas for a linear classifier the training data is used to learn w and then discarded; only w is needed for classifying new data [56,57] (Figure 20).
Given linearly separable data x_i labelled into two categories y_i = {−1, 1}, find a weight vector w such that the discriminant function f(x_i) = w^T x_i + b separates the categories for i = 1, …, N. To find this separating hyperplane, we proceed with the perceptron classifier:

Initialize w = 0.
Cycle through the data points {x_i, y_i}:
If x_i is misclassified, then update w ← w + α y_i x_i.
Repeat until all the data is correctly classified [58,59].
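The perceptron loop above can be sketched in NumPy. The toy two-feature data, the learning rate α and the epoch cap are assumptions for illustration; the loop structure (cycle, update on a mistake, stop when a full pass is clean) follows the text.

```python
import numpy as np

def perceptron(X, y, lr=1.0, max_epochs=100):
    """Perceptron classifier: cycle through (x_i, y_i) with y_i in {-1, +1};
    on a misclassification (y_i * (w.x_i + b) <= 0) update w and b; stop once
    a full pass makes no mistakes (data assumed linearly separable)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # misclassified (or on boundary)
                w += lr * yi * xi               # w <- w + alpha * y_i * x_i
                b += lr * yi
                mistakes += 1
        if mistakes == 0:
            break
    return w, b
```

On linearly separable data the loop is guaranteed to terminate, which matches the convergence remark made later in the paper; the resulting boundary, however, can sit arbitrarily close to the training points, which is the motivation for the maximum-margin SVM formulation.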
Since w^T x_i + b = 0 and c(w^T x_i + b) = 0 define the same plane, we have the freedom to choose the normalization of w.
Choose the normalization such that w^T x+ + b = +1 and w^T x− + b = −1 for the positive and negative support vectors respectively. The margin [30,31] is then given by 2/‖w‖. Learning the SVM can thus be formulated as an optimization [60,61]: minimize ‖w‖² subject to y_i(w^T x_i + b) ≥ 1 for all i.

Comparison of SKVMs and ANN: Support vectors of machines SVM against ANN artificial neural networks
The development of ANNs has followed a heuristic path, with applications and extensive experimentation preceding the theory. SVM development, on the other hand, involved sound theory first, followed by implementation and experiments. An important advantage of SVMs is that, whereas an ANN can suffer from several local minima, the solution to an SVM is global and unique. Two other advantages of SVMs are that they have a simple geometric interpretation and give a sparse solution. Unlike ANNs, the computational complexity of SVMs does not depend on the dimensionality of the input space. ANNs use empirical risk minimization, while SVMs use structural risk minimization. The reason that SVMs often outperform ANNs in practice is that they address the biggest problem with ANNs: SVMs are less likely to overfit.
"Most often, Gaussian kernels are used, when the resulting SVM corresponds to an RBF network with Gaussian radial basis functions. The SVM approach solves the problem of network complexity (the size of the hidden layer) automatically: the hidden neurons and the support vectors correspond to one another, so the central problems of the RBF network are also solved, because the support vectors serve as basis function centres." Horváth (2003) in Suykens et al.

Results given by application of ANNs
Conventional two-layer neural networks with a single output neuron were used for the development of the ANN model (Figure 21) [62]. Following network learning, a decision function is selected from the family of functions represented by the network architecture. This family of functions is defined by the complexity of the neural network: the number of hidden layers, the number of neurons in these layers and the topology of the network. The decision function is determined by choosing appropriate weights for the neural network. Optimal weights generally minimize an error function for the particular network architecture; the error function describes the deviation of the predicted target values from the observed or desired values. For our class/non-class classification problem, the target values were 1 for the class (glaucomatous eye) and −1 for the non-class (healthy eye). A standard two-layer network with a single output neuron can be represented as f(x) = sign(Σ_j w_j^(2) g(Σ_i w_ij^(1) x_i + b_j^(1)) + b^(2)), where g is the hidden-layer activation function. Neural network learning is typically performed with variants of gradient-descent algorithms [63] that attempt to minimize the error function. To avoid overfitting, cross-validation can be used to find an earlier stopping point for training. In this work, the SPSS neural network toolbox was used [64]. The data were pre-processed identically to the SVM-based learning. We applied the following training algorithms to ANN optimization in their default versions provided by MATLAB: gradient descent with variable learning rate, conjugate gradient descent, the quasi-Newton algorithm [65,66], Levenberg-Marquardt (LM) [67] and automated regularization. For each optimization, ten-fold cross-validation was performed (80/20 splits into training and test data), where the weights and biases of the ANN were optimized using the training data and the prediction accuracy was measured using the test data to determine the number of training epochs, i.e., the end point of the training process.
This was done to reduce the risk of over-learning. It should be noted that the validation data were left untouched (Tables 4 and 5).

Predicting target values of test data by the SKVMs model
A classification task usually involves separating the data into training and test sets. Each instance of the training set contains a target value (i.e., class label) and several attributes (i.e., observed characteristics or variables). We used a support vector machine (SVM) classifier, a supervised learning model, to distinguish a normal eye fundus from a fundus affected by glaucoma. The purpose of the SVM is to produce a model (based on the learning data) that predicts the target values of the test data given only the attributes of the test data [68]. In our case, the input image attribute sets, modified by the pre-processing techniques of the previous steps, serve as test data.
More formally, the linear SVM classifier function can be defined as f(x) = w^T x + b, so that for each training sample x_i the function gives f(x_i) > 0 for y_i = +1 and f(x_i) < 0 for y_i = −1. In other words, the training samples of the two classes are separated by the hyperplane f(x) = w^T x + b = 0, where w is the weight vector normal to the hyperplane, b is the bias (threshold) and x_i is the data point.
The nonlinear SVM classifier is defined as f(x) = w^T Φ(x) + b. The transformation of a non-linear separating surface into a linear one in a larger feature space is done using kernel functions. A kernel function on two samples, represented as feature vectors in an input space, is defined by K(x_i, x_j) = Φ(x_i)^T Φ(x_j), where Φ is the feature mapping. The most commonly used kernels are the linear kernel, K(x_i, x_j) = x_i^T x_j, and the polynomial kernel, K(x_i, x_j) = (γ x_i^T x_j + r)^d, where r is a free parameter trading off the influence of the higher-order and lower-order terms in the polynomial, d is the degree of the polynomial, and the slope γ > 0.

Radial basis function (RBF) kernel: K(x_i, x_j) = exp(−‖x_i − x_j‖²/(2σ²))
Here σ > 0 is an adjustable free parameter; a high value of σ means that the kernel is a "flattened" Gaussian and the decision boundary is smoother; a low value of σ gives the Gaussian kernel a sharper peak, so the decision boundary is more flexible.
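The three kernels above can be written directly in NumPy. The parameter defaults for γ, r, d and σ are illustrative choices, not values taken from the paper.

```python
import numpy as np

def linear_kernel(x, y):
    return np.dot(x, y)                         # K(x, y) = x^T y

def polynomial_kernel(x, y, gamma=1.0, r=1.0, d=3):
    return (gamma * np.dot(x, y) + r) ** d      # K(x, y) = (gamma x^T y + r)^d

def rbf_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
```

Note how σ acts exactly as described in the text: increasing it flattens the Gaussian (values stay near 1 over a wider range), while decreasing it sharpens the peak.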
A major advantage of the SVM classification is that SVM works well on datasets that have many attributes, even when there are only a few cases that are available for the training process. However, several disadvantages of the SVM classification include speed and size limitations during the training and testing phase of the algorithm and the selection of kernel function parameters.
A limitation of our study was the small sample size. This may affect the results when using the nine Gdx or SLO print data parameters. As mentioned earlier, complex machine classifiers that use many input parameters tend to work better in larger datasets. A more in-depth survey with a larger number of participants is currently underway.
In summary, machine classifiers of Gdx measurements can provide a simple and accurate index for diagnosing the presence or absence of glaucoma, as well as its severity. Classifiers that used a limited number of parameters gave the best discriminating ability. A classification system for the severity of glaucoma has been developed. A long-term prospective study is needed to determine the utility of this classification index in evaluating glaucoma progression, relative to existing parameters.

Contours Fitting for Optic Disc and Optic Cup
The Chan-Vese active contours algorithm can be used to find the contours fitting the disc and cup boundaries.
After separating the cup and the disc (Figures 22-27), the cup-to-disc ratio is calculated in terms of surface area, and vertically and horizontally with reference to the centroids.
The figure also shows other parameters that can be automatically extracted, such as the area of the excavation, which evolves over time and helps the ophthalmologist specify the severity of the retinal disease (Figure 22).
Since the CDR is an important indicator used for glaucoma detection, this metric is chosen to evaluate our results. The CDRs (vertical and horizontal) are computed from the cup and disc diameters obtained by the chosen method.
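A minimal sketch of the CDR computation from cup and disc diameters; the pixel measurements in the usage lines are hypothetical.

```python
def cup_to_disc_ratio(cup_diameter, disc_diameter):
    """CDR = cup diameter / disc diameter (same direction, same units)."""
    if disc_diameter <= 0:
        raise ValueError("disc diameter must be positive")
    return cup_diameter / disc_diameter

# Hypothetical measurements (in pixels) along the vertical and horizontal axes:
cdr_v = cup_to_disc_ratio(130, 200)   # vertical CDR
cdr_h = cup_to_disc_ratio(120, 210)   # horizontal CDR
```

Because both diameters are measured in the same units along the same axis, the ratio is dimensionless, which is what allows the CDRv = 0.65 cut-off discussed earlier to be applied across images of different resolutions.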

Primary SKVMs Results
If the data is linearly separable, then the algorithm will converge.
• Convergence can be slow.
• Separating line close to training data.
• We would prefer a larger margin for generalization.
For the best w, we can choose the maximum-margin solution, which is the most stable under perturbations of the inputs [69][70][71][72]. Figure 28 displays three categories of eyes: suspect glaucomatous subjects, with vectors x_i neighbouring the hyperplane; healthy eyes, with a group of vectors situated to the left of (away from) the hyperplane; and finally a group of vectors x_i with a high CDR value (near 1), representing glaucomatous eyes situated to the right of the hyperplane (true positives, TP).
The classification performance using each feature extraction method separately shows that the accuracy varies between 65% and 95% under cross-validation. In addition, each feature extraction method itself has varying classification accuracy and F-measures for the different classifier configurations [73,74]. The SKVMs separate the features most robustly and are always part of the configurations labelled with the "finest" criterion. The configurations with the "finest" criterion achieve CDR measures between 0.30 and 0.50 for healthy cases and between 0.53 and 0.99 for glaucomatous cases under cross-validation. They always use SVM for classification.
In the case of feature merging, the highest success rate and CDR measures are obtained if feature selection is done before using the SVM under cross-validation. In the 2-stage classification, the class probabilities of the "best"-labelled classifier configurations are used as second-stage features. As stated for the SKVMs results, observers achieve an average CDRglau = 0.59 and CDRheal = 0.35 by qualitative assessment of optic nerve head stereo-photographs (25 healthy and 40 glaucomatous subjects). Regarding classification on separate test and training sets, we obtain a slightly inferior performance (CDRglau = 0.77), while we get CDRheal = 0.32 for healthy eyes [75,76].
One of the contributions of this paper is to show that sparse kernel combinations can be learnt in a tractable way using Support Kernel Vector Machines to distinguish normal eyes from those with glaucoma [77]. Consider, for example, the kernel patterns learnt for problems like the ones in Figure 27, corresponding to the recognition results in Figure 21. Solutions of this form (a different set of kernels for each problem, but with good overall classification accuracy) are not easy to obtain using any of the algorithms currently used in object recognition. An exhaustive method faces a combinatorial problem, and no simple kernel enumeration technique can solve it optimally. It is not surprising that learning kernels produces competitive state-of-the-art classifiers, nor that a sparse combination may sometimes marginally hurt performance; this is a small price to pay for the benefits of compactness and selection. SKVMs provide a scalable solution for combining large numbers of kernels with heterogeneous feature spaces, where a-priori weighting intuitions may no longer be available [78]. 80 images are used for training and 10 images for testing each time; this process is repeated 10 times using a different session of test data each time. The performance of the classifier is evaluated by the following parameters:
• Specificity = correctly classified negative samples / actual negative samples.
• Positive predictive accuracy = correctly classified positive samples / samples classified positive.
• Negative predictive accuracy = correctly classified negative samples / samples classified negative.
Here, "sample" designates the input images used for training the classifier.
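The performance measures listed above, together with accuracy and sensitivity, follow directly from the confusion-matrix counts; a minimal sketch (the counts in the test are illustrative, not the paper's results):

```python
def classifier_metrics(tp, tn, fp, fn):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),        # true positive rate
        "specificity": tn / (tn + fp),        # correctly classified negatives
        "ppv":         tp / (tp + fp),        # positive predictive accuracy
        "npv":         tn / (tn + fn),        # negative predictive accuracy
    }
```

A near-100% sensitivity with a lower specificity, as reported below for the test set, corresponds to FN close to 0 with a non-negligible FP count.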
In this paper, after cross-validation, the trained SVM classifier gives an accuracy of 97%, a sensitivity of 99%, a specificity of 90%, a positive predictive accuracy of 94% and a negative predictive accuracy of 99.9%.
After training, we tested the trained classifier's performance on 75 fundus images (25 healthy and 40 affected by glaucoma) that were not part of the set of input images. The SVM classifier successfully classifies this test set with an accuracy of 93%, a sensitivity of nearly 100%, a specificity of 66%, a positive predictive accuracy of 89.28% and a negative predictive accuracy of almost 100%.
Compared with the methods used in references [79][80][81][82], we can observe that our SVM method is optimal with regard to both the number of images and the precision.
We used more images for classification and testing than methods 1 and 2 (Tables 6-10).
This is predicted by the observed graphs in Figures 24 and 25.

Discussion
Descriptive statistics were used to examine the features extracted to distinguish glaucomatous from healthy subjects (t-test and chi-square test). Data are presented as means (standard deviation) for continuous variables and as proportions for categorical variables. Continuous variables (CDRv and CDRh) were compared between groups with the independent Student's t-test. For discrete variables, the distribution between groups was compared with the chi-square test as appropriate (where an expected cell count is <5). Differences between cup diameter (V and H) and disc diameter (V and H) related to the occurrence of excavation among patients with glaucoma were evaluated using the chi-square test for trend. All statistical analyses were carried out using the Statistical Package for the Social Sciences (SPSS Inc., Chicago, Illinois, USA), version 16.0. Two-tailed statistical tests with p < 0.05 were considered statistically significant.
A paired-sample t-test compares means from the same group along different axes, as shown in Tables 11 and 12.

The null hypothesis for the independent samples t-test is:
μ1 (0.6995) = μ2 (0.6908). In other words, it assumes the means are equal. With the paired t-test, the null hypothesis is that the pairwise difference between the two tests is zero (H0: µd = 0). The difference between the two tests is very subtle (0.00877); which one you choose depends on the data collection method.
The t-score is calculated with the usual formula, and every t-value has an associated p-value (at p = 0.05, the confidence level is 95%). A p-value is the probability that the results from the sample data occurred by chance. From the t-table, using the degrees of freedom for this sample problem (df = 63), the critical t-value is 2.000.
a) The number of hidden units is determined by the test-data criterion: the "best" number of hidden units is the one that produces the smallest error on the test data.
The calculated t-value is smaller than the table value at an alpha level of 0.05, and the p-value is less than the alpha level (p < 0.05). We can reject the null hypothesis that there is no difference between the means (Tables 12-14).
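The paired t-score described above can be computed directly from the pairwise differences; a minimal sketch in pure Python (the sample values in the test are illustrative, not the study's CDR data):

```python
import math

def paired_t_score(a, b):
    """Paired t-score: t = mean(d) / (sd(d) / sqrt(n)),
    where d = a - b are the pairwise differences."""
    n = len(a)
    d = [ai - bi for ai, bi in zip(a, b)]
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)   # sample variance of d
    return mean_d / math.sqrt(var_d / n)
```

The resulting t is then compared with the critical value from the t-table at the relevant degrees of freedom (df = n − 1 for a paired test).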

The percentile plots
There are a couple of reasons for preferring percentile plots to cumulative-fraction plots. The percentile plot is a better estimate of the distribution function, and plotting percentiles allows you to use "probability graph paper" plots with specially scaled axis divisions. A probability scale on the y-axis lets you see how "normal" the data are: normally distributed data plot as a straight line on probability paper, and lognormal data plot as a straight line with probability-log scaled axes. (Incidentally, uniformly distributed data plot as a straight line using the usual linear y-scale.) (Table 15).
The cases are unweighted.

The KS-test
The KS-test reported that the CDRv data (treatment A) in the second example were approximately lognormal, with a geometric mean of 0.6995 and a multiplicative standard deviation of 0.17631. The plot below displays the percentile plot of this data (in red) along with the performance expected for the above lognormal distribution (in blue).
Note that the KS-test reports that both the treatment B and control B data are approximately lognormal. Thus, you could take the log of all the data and use the resulting data in a t-test. Since the t-test is quite sensitive when applied to appropriate data, this would be the best strategy.
Similar consideration of the CDRh data (treatment B) in the first example leads to the following plot. Here the KS-test reported that the data were approximately normally distributed, with mean = 0.6908 and standard deviation = 0.15887 (Figure 30).
These datasets were drawn from lognormal distributions that differ substantially in mean. The KS-test detects this difference; the t-test does not. Of course, if we knew that the data were non-normally distributed, we would know not to apply the t-test in the first place.
One of the advantages of the KS-test is that it leads to a graphical presentation of the data, which enables the user to detect normal distributions (see below). For larger datasets (say N>40), the Central Limit Theorem suggests that the t-test will produce valid results even in the face of non-normally distributed data. However, highly non-normal datasets can cause the t-test to produce fallible results, even for large N datasets. In the last example you will see a case where the t-test fails at N=80 (Figure 31) (Table 16).
These distributions as well as all other normal distributions are symmetrical with relatively more values in the center of the distribution and relatively fewer in the extremities.
The Kolmogorov-Smirnov test (KS-test) tries to determine whether two datasets (CDRv and CDRh) differ significantly. The KS-test has the advantage of making no assumption about the distribution of the data (technically speaking, it is non-parametric and distribution-free). Note, however, that this generality comes at some cost: other tests (for example, Student's t-test) may be more sensitive if the data meet the requirements of the test. In addition to calculating the D statistic, the result reports whether the data seem normal or lognormal (if it is silent, assume normal data). It also allows the data to be viewed graphically, which can help in understanding how the data are distributed.
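The two-sample KS statistic D is the largest vertical distance between the two empirical CDFs; a minimal sketch (an O(n²) illustration for clarity, not an optimized implementation):

```python
def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: max |ECDF1(x) - ECDF2(x)| over the pooled data."""
    s1, s2 = sorted(sample1), sorted(sample2)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is <= x (empirical CDF).
        return sum(1 for v in sorted_sample if v <= x) / len(sorted_sample)

    return max(abs(ecdf(s1, x) - ecdf(s2, x)) for x in s1 + s2)
```

D ranges from 0 (identical empirical distributions) to 1 (completely separated samples); the significance of an observed D would then be read from the KS distribution, which is not shown here.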
Using 75 images obtained from clinical cases of glaucomatous subjects, the performance of our approach is evaluated using the proximity of the calculated CDR to the manually graded CDR. The results indicate that our approach provides 98% accuracy in glaucoma analysis. As a result, this study has good potential for automated screening systems for the early detection of glaucoma.
Chi-square table notes: a. 62 cells (100.0%) have expected frequencies less than 5; the minimum expected cell frequency is 1.0. b. 43 cells (100.0%) have expected frequencies less than 5; the minimum expected cell frequency is 1.5. c. 46 cells (100.0%) have expected frequencies less than 5; the minimum expected cell frequency is 1.4. d. 42 cells (100.0%) have expected frequencies less than 5; the minimum expected cell frequency is 1.5. e. 47 cells (100.0%) have expected frequencies less than 5; the minimum expected cell frequency is 1.4.
We presented a novel automated glaucoma classification system using digital fundus images. In contrast to the commonly used segmentation-based measurements, it is purely data-driven and uses image-based features that are new in the domain of glaucoma recognition. We evaluated several combinations of image-based features and classifier schemes on a set of 75 real fundus images. The 2-stage classification with SKVMs produced a 96% success rate. The performance of the fully automatic system presented here is comparable to medical experts in detecting glaucomatous eyes, and it could be used in mass screenings. The important features automatically identified by the methods also provide a novel representation of the data for physicians and may help to better understand glaucoma disease.

Conclusion
Glaucoma is an eye disease that can cause blindness if it is not detected and treated at the right time. The increase in intraocular pressure (IOP) of the fluid in the eye often causes glaucoma. Glaucoma is the second leading cause of blindness in the world and is called the "silent thief of sight".
Optical coherence tomography (OCT) and Heidelberg retinal tomography (HRT) techniques for the detection of glaucoma are very expensive. A method to diagnose glaucoma using digital fundus images is presented in this paper. The purpose of our proposed method is to apply image-processing techniques to digital fundus images for the analysis of glaucomatous and healthy eyes. The proposed method uses image pre-processing techniques such as noise suppression and contrast enhancement, the Circular Hough Transform (CHT) method and the Chan-Vese active contour model for the extraction of features and attributes that identify glaucoma pathology, and the SVM (Support Vector Machine) method for image classification. All of these techniques are implemented in MATLAB and C#, which offer a variety of image-processing options that allow us to extract the required features and information from the images.
Nevertheless, our results indicate that the solutions obtained through SVM training are more robust, with a smaller standard error, than those from standard ANN training. Regardless of the outcome of this study, it is the appropriate choice of learning data and descriptors, together with reasonable scaling of the input variables, that determines the success or failure of a machine learning system. Both methods are suitable for evaluating the utility of different sets of descriptors for a given classification task, and they are the methods of choice for rapid first-pass filtering of the samples in the dataset. A particular advantage of SVM is the sparseness of the solution: an SVM classifier depends only on the support vectors, and the classifier function is not influenced by the whole dataset, as is the case for many neural network systems. Another strength of SVM is its ability to deal effectively with a very large number of features through the exploitation of kernel functions, making it an attractive technique. The combination of SVM with routine feature selection could provide an effective tool for digitally extracting the relevant information.