An intelligent framework for medical image retrieval using MDCT and multi SVM.

Technology and Health Care 22 (2014) 13–25 DOI 10.3233/THC-130767 IOS Press


J.A. Alex Rajju Balan (a) and S. Edward Rajan (b,∗)

(a) Vins Christian College of Engineering, Nagercoil, Tamil Nadu, India
(b) Department of Electrical and Electronics Engineering, Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India

Received 16 September 2013
Accepted 21 November 2013

Abstract.
BACKGROUND: Large volumes of medical images are generated rapidly in the medical field, and managing them effectively has become a great challenge. This paper studies the development of medical image retrieval based on texture features and retrieval accuracy.
OBJECTIVE: The objective of the paper is to analyze image retrieval in support of diagnosis in healthcare management systems.
METHODS: This paper traces the development of medical image retrieval to assess both image texture features and accuracy. The texture features of medical images are extracted using MDCT and classified using a multi SVM. Both the theoretical approach and the simulation results yielded interesting observations, which were corroborated using MDCT coefficients and the SVM methodology.
RESULTS: Image data were extracted successfully in response to each query, and high retrieval performance was obtained. Experimental results on a database of 100 medical images show that an integrated texture feature representation results in 98% of the images being retrieved using MDCT and multi SVM.
CONCLUSION: We have studied a multiclassification technique based on SVM that is suitable for medical images. The results show retrieval accuracies of 98% and 99% for different sets of medical images, depending on the class of image.

Keywords: Medical image retrieval, MDCT, multi SVM, CBIR

∗ Corresponding author: S. Edward Rajan, Department of Electrical and Electronics Engineering, Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India. E-mail: [email protected].

© 2014 – IOS Press and the authors. All rights reserved. 0928-7329/14/$27.50

1. Introduction

Computer and digital imaging technologies now play important roles in the medical domain [1]. Medical research centers and hospitals produce thousands of images of different modalities, such as magnetic resonance images, ultrasound images and computed tomography images [2]. Image acquisition devices capture medical images and other images. The differences between these digital images can often be identified only by experts in the medical field, as the variation in visual attributes may be very subtle. With the rapid growth of the World Wide Web, resources from different medical research modalities can be acquired effectively. Moreover, the use of medical images is very important for diagnosis. Many hospitals and radiology departments are now equipped with a picture archiving and communication system (PACS), in which images are commonly stored, retrieved and transmitted in the DICOM format. Such a system provides only text-based retrieval using patient ID numbers, names and other technical parameters [2].

Content Based Image Retrieval (CBIR) combined with an SVM classification technique plays an important role in retrieving the necessary information from a huge collection of medical images. A typical CBIR system follows two steps. In the first, low-level features, including texture and shape, are extracted from each image in the medical image database. In the second, the features of the query image are computed and compared with the features of the entire medical image database. The Support Vector Machine (SVM) is a useful technique for pattern classification and regression problems [9]. Excellent and efficient results have been reported by applying SVM in multiple domains; however, its application remains problematic [7]. Furthermore, the data set images can be represented by large sequences, and training on them remains accurate.

Automated solutions following this approach can be categorized as a CBIR scheme and an SVM scheme with the modified discrete cosine transform (MDCT). These approaches are sensitive to changes in the image retrieval conditions, which are handled by indexing and feature extraction methods. The purpose of the method is to verify the class and retrieve the image based on the image size and the generated query image.

In this paper, we propose a medical image retrieval method using the modified DCT with SVM. It has proved successful at the stage of extracting image data. The technique assumes eight DCT coefficients with discriminating features.
Secondly, we down-sample the image and divide it into 8 × 8 subblocks. The complexity of this technique is low, and it produces large image subbands. We then assign classes to the entire database with respect to the extracted features.

The paper is structured as follows. Section 2 explains content based image retrieval. Section 3 describes the proposed methodology, i.e. the SVM classifier, which is the main contribution of this paper. The context of the framework, its practical benefits and the results are discussed in Section 4, which also compares image retrieval methods (Section 4.1). Section 5 concludes the paper.

2. Methods

2.1. Content based image retrieval

A CBIR system uses the contents of medical images to represent and access images on a large scale, and it can search for a medical image based on modality, anatomic region and acquisition view [1]. The principal aim is to retrieve medical images with good accuracy. Retrieval is often done with an example query image, which provides an efficient way of searching the image database. The system represents the query image by a feature vector and retrieves the results most similar to it [8]. The similarity between the feature vector of the query image and those of the data set is measured [2] by searching the image database.

Fig. 1. Block diagram of CBIR.

2.2. Feature extraction

When images from a large image database are represented, it must be considered carefully which features are most useful for representing the contents of the images and can effectively encode their attributes. As feature extraction for the database images is typically conducted off-line, computational complexity is not a significant issue. Two features that are often extracted from an image are introduced here: color and texture [9].

2.3. Color

Color is a powerful descriptor that simplifies object identification, and it is one of the most frequently used visual features in content based image retrieval. To extract color features, a proper color space has to be chosen. The main purpose of a color space is to facilitate the specification of colors. Several color spaces, such as RGB and HSV, have been developed for different purposes. An appropriate color system is required to ensure perceptual uniformity [4]. The RGB color space is widely used for representing color images. The HSV color space is an intuitive system that describes a specific color by its hue, saturation and brightness. After the color space is selected, an effective color descriptor should be developed to represent the color of the global areas [7]. Several color descriptors have been developed from various representation schemes, such as color histograms, color moments and color edges. A color histogram, for example, represents the distribution of the number of pixels for each quantized color in an image [3].

Retrieval of medical images allows many image characteristics to be revealed. Color also plays an important role in morphological diagnosis. In the medical field, color images are usually produced in different departments by various devices, for example by a camera inserted into organs of the body such as the stomach or abdomen. Effective use of the color information in such images therefore includes absolute color values, differences in color and estimated illumination data [8].

2.4. Texture

Texture representation in CBIR is classified into statistical approaches and structural approaches. Statistical approaches analyze textural characteristics according to the statistical distribution of image intensity; they include gray-level and fractal models. If medical images are represented in gray level, texture becomes a crucial feature describing surface orientation and spatial distribution. It therefore specifies the structure and thereby enhances the resolution and sharpness of the image [5]. The fractal model is not crucial; it determines the overall image resolution.
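As a rough illustration of the statistical approach described in Section 2.4, the sketch below computes three first-order statistics (mean, variance and entropy) of the gray-level distribution of an image. The choice of these three statistics, the function name `gray_level_stats` and the toy images are assumptions made for illustration; the paper does not specify which statistical descriptors it uses.

```python
import math

def gray_level_stats(image, levels=256):
    """First-order texture statistics of the gray-level histogram.

    `image` is a list of rows of integer intensities in [0, levels).
    The three statistics are illustrative choices, not the paper's.
    """
    pixels = [p for row in image for p in row]
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    prob = [h / n for h in hist]  # normalized histogram
    mean = sum(i * p for i, p in enumerate(prob))
    variance = sum((i - mean) ** 2 * p for i, p in enumerate(prob))
    entropy = -sum(p * math.log2(p) for p in prob if p > 0)
    return {"mean": mean, "variance": variance, "entropy": entropy}

# Hypothetical toy inputs: a flat patch and a two-level checkerboard.
flat = [[100] * 4 for _ in range(4)]
checker = [[0, 255] * 2, [255, 0] * 2] * 2

print(gray_level_stats(flat))
print(gray_level_stats(checker))
```

A flat patch has zero variance and zero entropy, while the checkerboard spreads its pixels evenly over two gray levels, so the two patches are easy to tell apart from these numbers alone.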



2.5. Dimension reduction and indexing

To capture the useful contents of an image and to facilitate effective querying of an image database, a CBIR system extracts a large number of features from the content of an image. A feature set of high dimensionality causes the curse of dimensionality, in which the complexity and computational cost of a query increase exponentially with the number of dimensions. The most widely used technique for reducing the dimensionality of a large feature set in image retrieval is principal component analysis (PCA) [2]. The goal of PCA is to capture as much variance as possible with the smallest number of variables. PCA transforms the original data into a new coordinate system of lower dimension, thus creating a new set of data [10]. The new system removes redundant data, and the new data set represents the essential information. However, the accuracy obtained depends on how completely the relevant information is preserved. Data represented in suitably reduced dimensions increase the speed of retrieval.

2.6. Indexing

Retrieval of a medical image is usually based not only on the value of certain features, but also on the location of a feature vector in the multidimensional space. A retrieval query on a multimedia database with multidimensional feature vectors usually requires fast execution of search operations. An appropriate multi-dimensional access method therefore has to be used for indexing a feature space of 20 dimensions. Multivariate analysis techniques such as PCA are needed to reduce the dimensionality of the feature space. With regard to medical CBIR research, the system prunes the set of retrieved classes of medical images.

3. Proposed methodology: SVM classifier, a training phase

The Support Vector Machine is an emerging learning technology that has been used successfully in learning methods. It has been applied in various fields of study to the retrieval and classification of images. SVM is a powerful supervised classification technique for learning from and training on images. The proposed algorithm uses information about the existing relationship between the set of images used and the elements to be classified. The images in the database can be organized into a number of classes, one for each type of image.

The algorithm proceeds through two main phases: a training phase and a classification (testing) phase. The first phase uses the training set (x_i, y_i) to train on the images. Training can be carried out for any number of images with respect to their features, which are texture, size and color, and it produces the weights that are used in the next phase. The second phase tests the images under test and assigns a class to each of them; it analyzes the data and recognizes the patterns used for classification and regression analysis [1]. SVM forms a decision function and was originally designed for binary classification; medical image classification is handled here with a multiclass method [2,7].

SVMs can be generalized as linear and non-linear classifiers. The linear SVM can be expressed as

$$O = W \cdot x - b, \qquad W = \sum_{i} \alpha_i y_i x_i \tag{1}$$



Fig. 2. Proposed block diagram.

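where $\alpha_i$ are the coefficients found by the programming method, and $x_i$ and $y_i$ form the training set. The parameters $W$ and $b$ can be obtained by a standard programming method. Based on the linear classification of SVM, the decision function can be written as

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^{0} \, y_i \, x_i^{T} x - b^{0} \right) \tag{2}$$

where $n$ is the number of training samples and the superscript 0 marks the values obtained by the programming method. In the non-linear case the SVM function is given by

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i \, y_i \, k(x_i, x) - b \right) \tag{3}$$

where $k(x_i, x)$ is the kernel function.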
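The decision functions above can be sketched in a few lines of Python. This is only an illustration that follows the sign convention of Eq. (1), O = W·x − b. The function names, the RBF kernel with `gamma=0.5`, and the two-point training set with hand-picked coefficients are all assumptions; the paper does not state which kernel or solver it uses.

```python
import math

def linear_kernel(u, v):
    """Plain dot product, giving the linear decision function."""
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision(x, samples, labels, alphas, b, kernel=linear_kernel):
    """Return sgn(sum_i alpha_i * y_i * k(x_i, x) - b) as +1 or -1."""
    score = sum(a * y * kernel(xi, x)
                for a, y, xi in zip(alphas, labels, samples)) - b
    return 1 if score >= 0 else -1

# Hypothetical two-sample training set with hand-picked coefficients.
samples = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]
b = 0.0

print(decision((2.0, 0.0), samples, labels, alphas, b))
print(decision((-1.0, -3.0), samples, labels, alphas, b))
print(decision((2.0, 0.0), samples, labels, alphas, b, kernel=rbf_kernel))
```

Swapping the `kernel` argument is the only difference between the linear and the non-linear case, which mirrors how the kernel form generalizes the linear one.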

These functions are used here for multi-classification. In the proposed medical image retrieval system, the query image is normalized from the original image and its RGB color values are determined. The medical images for the proposed work are taken from the drive database. Here the SVM is evaluated as a multiclass SVM, and the data set of 100 images is grouped into 4 classes of medical images. First, we collect the medical images from the database in non-overlapping blocks of 8 × 8 pixels. Next, the images are down-sampled to 64 × 64 pixels to reduce their size and divided into subblocks of 8 × 8 pixels. MDCT is applied to each subimage, leaving out the first AC coefficient and taking 8 AC coefficients for the remaining subimages. The query image is generated in the same way: it is down-sampled, divided into blocks of 8 × 8 pixels and assigned a label (class).

The multi SVM takes the texture, color and size features of the images in the database and separates the images into classes. Here a label means a class: class 1, class 2, class 3 and so on. Classes can be assigned to all images in the database. For example, if the retina image is class 1, the brain image will be class 2 and the iris image class 3, so that the classes do not overlap or become confused during training and retrieval. If the input is class 1 (retina), the output should be retina, that is, class 1. The same holds for all images and their assigned classes. The input of the SVM consists of the database images with their assigned classes (labels) and features, together with the query image. Therefore, if an iris image is the query image, the SVM output is an image of the same assigned class. In this manner the proposed work has been carried out for the images in the database. The classes of all query images are displayed at the output.

3.1. Mechanism of SVM

The proposed SVM operates in a training phase and a testing phase on the medical images. The features of the database images are not used as support vectors; hence in this work the SVM is used as an actual training phase for the images in the database, in order to classify them into their assigned classes. Moreover, SVM is flexible in separating the classes of different images, which allowed a successful implementation in our work. The accuracy obtained is 97% and 98% for the images in the medical database.

Fig. 3. Subblock images and original image. Panels: iris subblock (8 × 8 pixels); brain subblock and query image.

3.2. Extraction

In the proposed medical image retrieval system, the query image is normalized from the original image and its RGB color values are determined. The medical images are taken from the drive database and extracted. The support vector machine is validated on the data set of 100 images, grouped into 4 classes of medical images. First, we collect the medical images from the database in non-overlapping blocks of 8 × 8 pixels. Next, the images are down-sampled to 64 × 64 pixels to reduce their size and divided into subblocks of 8 × 8 pixels [6]. The MDCT technique (algorithm) is applied to each subimage, leaving out the first AC coefficient and taking 8 AC coefficients for the remaining subimages. The query image is generated in the same way: it is down-sampled and divided into blocks of 8 × 8 pixels.

3.3. Determining the feature selection and class of medical image

Feature selection is determined by assigning classes (class 1, class 2, class 3, class 4) to all the images in the database. After the classes are assigned and the query image is generated, the sampled and retrieved simulation output is obtained. If the number of classes were increased beyond the four used in the experiments, we expect that similarly successful results could be achieved, because all the images are down-sampled and divided into 8 × 8 subblocks. Gray-level images and RGB images retain their own values after down-sampling. With this method it is easy to identify areas of color, black, white and so on. The same principle is used for all the different types of images in the database.
Hence we first train on the images and then test them after assigning classes. The training and classifying options allow the images to be run through both phases of the algorithm. With this technique, using more images means that more classes are assigned, approaching the number found in real life. The classes do not overlap, because each image has its own specific class during retrieval, so an image can easily be recalled by its assigned class once successful results are obtained. This analysis was described in detail by Brown et al. (2000). The best classifier is the one that works best for the particular application, and SVM is one of the strongest well-tuned classifiers. With a strong theoretical foundation, the support vector machine has been used for object recognition, text classification and learning-based image retrieval systems. The MDCT algorithm is therefore used for training the classes of images and for retrieving images. The modified discrete cosine transform is a very sensitive algorithm, and it relies on selection, computation, efficiency and accuracy.
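The extraction steps of Section 3.2 (down-sample to 64 × 64, split into 8 × 8 subblocks, transform each block, keep eight coefficients per block) can be sketched as follows. Several points are assumptions rather than the paper's method: the text does not define its MDCT, so a plain orthonormal 2-D DCT-II stands in for it; the JPEG-style zigzag ordering, the nearest-neighbour down-sampling and the function names are illustrative; and `skip` is left as a parameter because the wording "leaving the first AC coefficient" can be read in more than one way (`skip=1` drops only the DC term).

```python
import math

N = 8  # sub-block size used in the text

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (stand-in for the paper's MDCT)."""
    scale = [math.sqrt(1 / N)] + [math.sqrt(2 / N)] * (N - 1)
    cos = [[math.cos((2 * i + 1) * u * math.pi / (2 * N)) for i in range(N)]
           for u in range(N)]
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = sum(block[i][j] * cos[u][i] * cos[v][j]
                        for i in range(N) for j in range(N))
            out[u][v] = scale[u] * scale[v] * total
    return out

def zigzag_order(n=N):
    """(row, col) index pairs in JPEG-style zigzag order (an assumed ordering)."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def downsample(image, size=64):
    """Nearest-neighbour resize of a 2-D list of gray levels to size x size."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def image_features(image, skip=1, keep=8, size=64):
    """Per 8 x 8 block, skip `skip` zigzag coefficients and keep the next `keep`."""
    small = downsample(image, size)
    order = zigzag_order()
    features = []
    for r0 in range(0, size, N):
        for c0 in range(0, size, N):
            block = [row[c0:c0 + N] for row in small[r0:r0 + N]]
            coeffs = dct2(block)
            ordered = [coeffs[r][c] for r, c in order]
            features.extend(ordered[skip:skip + keep])
    return features

# Hypothetical input: a synthetic 128 x 96 gradient, not a medical image.
gradient = [[(2 * c) % 256 for c in range(128)] for _ in range(96)]
vector = image_features(gradient)
print(len(vector))  # 64 blocks x 8 coefficients = 512
```

Under these assumptions every image, whatever its original size, is reduced to a fixed-length vector of 512 values, which is the kind of input a multi-class SVM needs for both the database images and the query image.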



Fig. 4. Simulation result of brain image.

Fig. 5. Simulation result of abdomen image.

4. Results and discussion

The results in this section provide interesting anecdotal evidence in support of the SVM classifier combined with MDCT. The main objective of the technique is to retrieve images accurately. The method classifies all the medical images in the set according to their pixel values in a straightforward way, and the classes are grouped into four in the presence of the query image. The results demonstrate the effectiveness of the proposed method. The results obtained are compared with the assigned classes, and the classification accuracy is calculated. The values in Table 1 show the accuracy for the different test images. The simulated images in the figures show the retrieval of images by the system. The retina, iris, brain and abdomen images are taken from the database and presented using SVM.

The proposed method has been implemented on a database of 100 medical images, with 4 different classes of images. The retrieval results have been simulated, and the performance measure, accuracy, is calculated using the SVM classifier. The SVM ratio is the ratio of the number of relevant images retrieved to the total number of relevant images in the database, evaluated as

$$\text{SVM} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images in the database}} \times 100$$

We thus obtain 98% accuracy for the retrieved images of the corresponding classes. The proposed method shows promising results on medical images using the SVM classifier technique.
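The ratio defined above is straightforward to compute once the retrieved and relevant image sets are known. The sketch below is a minimal illustration; the function name `retrieval_ratio` and the image identifiers are hypothetical, and the toy counts are chosen only to show the arithmetic, not taken from the paper's experiments.

```python
def retrieval_ratio(retrieved, relevant):
    """Percentage of the relevant images that appear among the retrieved ones."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("no relevant images for this query")
    hits = len(relevant & set(retrieved))
    return 100.0 * hits / len(relevant)

# Hypothetical query: 50 relevant images, 49 of them retrieved plus one wrong image.
relevant_ids = [f"iris_{k:02d}" for k in range(50)]
retrieved_ids = relevant_ids[:49] + ["brain_07"]

print(retrieval_ratio(retrieved_ids, relevant_ids))  # 98.0
```

Images that are retrieved but not relevant (here `brain_07`) do not change this ratio, so it measures how much of the relevant class was recovered rather than how clean the result list is.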



Fig. 6. Simulation result of iris image. (Colours are visible in the online version of the article; http://dx.doi.org/10.3233/THC-130767)

Fig. 7. Simulation result of retinal image. (Colours are visible in the online version of the article; http://dx.doi.org/10.3233/THC-130767)

4.1. Comparison

Image retrieval usually starts from CBIR and subsequently extends to the SVM technique, in which the medical images are grouped into classes. Most medical images are available in the DICOM format, which identifies the technical parameters. This is significant for effective study in the field of imaging. The study involves comparing images from the database, which are grouped by their corresponding pixel values. Moreover, they are confused in any modeling. It is clear that CBIR has been highly developed for extracting several features related to the imaging technique. In the proposed study, low-level and high-level features are computed and compared using the medical image database. The features are extracted and indexed, and an appropriate uniformity is formed in the system. Higher classification accuracy is obtained for the different medical images, as shown in Table 1, which supports the proposed method. We have presented approaches for feature extraction and SVM classification directly in the MDCT domain using MDCT coefficients. The retrieved results are appropriate and more accurate, and the AC coefficients are the most important for the subblock division of images and for feature extraction. The SVM classifier classifies the retrieved images into four classes.



Fig. 8. Retrieved images of iris using MDCT and multi SVM. (Colours are visible in the online version of the article; http://dx.doi.org/10.3233/THC-130767)

Fig. 9. Retrieved images of retina using MDCT and multi SVM. (Colours are visible in the online version of the article; http://dx.doi.org/10.3233/THC-130767)

Table 1
Medical image dataset and their accuracy

Images   Type of images on dataset   Testing              Accuracy
Img(1)   Iris                        Medical test image   98%
Img(2)   Retina                      Medical test image   99%
Img(3)   Abdomen                     Medical test image   98%
Img(4)   Brain                       Medical test image   97%

Fig. 10. Retrieved images of brain using MDCT and multi SVM.

For each class a query image is generated and a sample of that image is displayed. The retrieval result based on this class gives 99% accuracy. The image data set is large, yet it is one of the known data sets for SVM classification. It contains four classes, and each class gives the desired retrieved output with MDCT. In this method we compare the results for different medical images (iris, brain, etc.) from the database and evaluate the performance. Moreover, the complexity and accuracy are increased. The output shows promising results. The performance graph plots the classification accuracy of the multi SVM against the image type. The four classes of medical images give better accuracy with the multi SVM using the MDCT technique. After experimenting with MDCT, the data set was found to be the best choice for these multi SVM class functions, yielding 98% accuracy. The retrieved images for the MDCT technique for all classes are shown in Figs 8 to 11.

Fig. 11. Retrieved images of abdomen using MDCT and multi SVM.

4.2. Performance evaluation

To evaluate the features of images from the database, the multi SVM classifier technique was analyzed in trials with four sets of classes, and images were retrieved using MDCT based on the CBIR model. Multiple similarity measures with respect to the classes of images were retrieved and validated against the classification accuracies. In this method the images are stored in JPEG format with their corresponding pixel values and are resized (down-sampled) to 64 × 64 pixels. These are again subdivided into blocks of 8 × 8 pixels. For each subimage we then generate a query image by applying MDCT coefficients, to focus on the retrieved image of all classes. In this work the trials achieved rates of 98% and 99%, in both training and testing of the images. Likewise, if all the images in the database are trained and tested using this SVM technique, we expect the same result. This can be done by assigning classes to all the images in the database, i.e. class 1, class 2, class 3, class 4, class 5, etc. Each class is trained and tested.

Suppose class 1 is the abdomen image. If it is tested and produces the same class, we call it the best trial, with all the features clearly observed. Such an image is correctly classified, with a successful classification rate, since the output class matches the assigned class. If instead class 20 (eye image) or class 25 (skull image) is tested and the output is a different class, such as class 11 or class 18, containing different images, we call this a misclassified case. The image is said to be misclassified, with a poor classification rate, since the output is not the assigned class and different image features are produced. The difference between a correctly classified image and a misclassified image can thus be observed, and the same holds for any number of images in the database. From this study we can differentiate correctly classified and misclassified images by the classification rate obtained at the output.

Hence the majority of our work has high potential to provide better simulation and extraction of images in this system. It is also recognized that SVM can serve as a standalone diagnostic tool for clinical trials in the medical field, as well as for assessing the quality of feature extraction and representation. The performance evaluation of all the medical images is reported in charts. The highest overall classification accuracy, 99%, was achieved for different sets of medical images. The graph shows the images in the data set against classification accuracy. The accuracies obtained for the four medical image data sets (iris, retina, abdomen, brain), 98%, 99%, 98% and 97% respectively, are depicted in Fig. 12.

Fig. 12. Graph representing classification accuracy and medical image types.
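The distinction drawn in Section 4.2 between correctly classified and misclassified images can be turned into a per-class classification rate by comparing each image's assigned class with the class produced at the output. The sketch below is illustrative only: the function name `classification_rates` and the label lists are hypothetical and do not reproduce the paper's data.

```python
from collections import Counter

def classification_rates(assigned, predicted):
    """Per-class percentage of images whose output class equals the assigned class."""
    total = Counter(assigned)
    correct = Counter(a for a, p in zip(assigned, predicted) if a == p)
    return {cls: 100.0 * correct[cls] / total[cls] for cls in total}

# Hypothetical labels: one iris image is misclassified as brain.
assigned = ["iris"] * 4 + ["brain"] * 4
predicted = ["iris", "iris", "iris", "brain"] + ["brain"] * 4

print(classification_rates(assigned, predicted))
# {'iris': 75.0, 'brain': 100.0}
```

Reporting the rate per class, as Table 1 does, makes it visible when one image type is harder to classify than the others, which a single overall figure would hide.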

5. Conclusion

SVM provides good out-of-sample generalization, which makes it suitable for assigning different classes to medical images, and it can be robust even in the training phase. In the proposed work SVM delivers a unique solution, since the optimality problem is convex. Hence SVM can produce solutions for different samples, which are used to compare the performance of different classes of images in our medical data sets. In this paper we have investigated a multiclassification technique based on SVM classification that is suitable for medical images. The image features were validated by the modified discrete cosine transform. Medical images were retrieved and extracted with this method, the accuracy was calculated, and good results were obtained: 99% for retina images and 98% for iris images. The texture features extracted using MDCT gave a good separation of all classes. Simulation results show the retrieval of different classes of image for the samples. This method therefore mainly improves the accuracy and influences the performance analysis and classification of medical images. SVM is a powerful classifier, but not necessarily always the best, as the classification process depends on other factors such as the quality of the features and the training data set. Future work based on this model will study and develop strategies for a larger number of medical databases, to enhance the training and evaluation of the method.



Authors' contributions

ARB carried out the literature survey, collected the image database related to the work and participated in sequence alignment of the manuscript. SER participated in the design of the study, performed the statistical analysis and simulation of the work, and helped in sequence alignment and drafting of the manuscript. All authors read and approved the final manuscript.

Acknowledgements

The authors would like to thank the management and Principal of Mepco Schlenk Engineering College, Sivakasi, for allowing us to carry out our research work in their research centre in collaboration with Anna University, Chennai.

List of abbreviations

MDCT: Modified Discrete Cosine Transform.
SVM: Support Vector Machine.
CBIR: Content Based Image Retrieval.
DCT: Discrete Cosine Transform.
PCA: Principal Component Analysis.

References

[1] Md. Mahmudur Rahman, Bipin C. Desai, Prabir Bhattacharya, "Medical image retrieval with probabilistic multi-class support", Computerized Medical Imaging and Graphics 32 (2008), Elsevier, 95-108.
[2] Ramesh Babu Durai C., Balaji V., "Improved Content Based Image Retrieval using SMO and SVM Classification Technique", European Journal of Scientific Research, ISSN 1450-216X, Vol. 69, No. 4 (2012), pp. 560-564.
[3] Yi Liu and Yuan F. Zheng, "One-Against-All Multi-Class SVM Classification Using Reliability Measures", IEEE Explore for Electrical and Computer Engineering, Ohio State University, USA, 2010, pp. 988-999.
[4] K. Rajakumar, S. Muttan, "Medical image retrieval using modified DCT", Procedia Computer Science, Vol. 2, Elsevier (2010), 298-302.
[5] Alexandre Lemieux and Marc Parizeau, "Flexible multi-classifier architecture for face recognition systems", Vision Interface, April 2003.
[6] Tomáš Pevný, Jessica Fridrich, "Merging Markov and DCT Features for Multi-Class JPEG Steganalysis", Binghamton University, State University of New York, July 2000/ICME.
[7] Michael E. Mavroforakis and Sergios Theodoridis, "A Geometric Approach to Support Vector Machine (SVM) Classification", IEEE Transactions on Neural Networks, vol. 17, no. 3, May 2006, pp. 671-682.
[8] Yan-Shi Dong, Ke-Song Han, "Boosting SVM Classifiers by Ensemble", ACM, 2005, May 10-14, pp. 1072-1073.
[9] Glenn M. Fung, O. L. Mangasarian, "Multicategory Proximal Support Vector Machine Classifiers", Springer Science + Business Media, 2005, 77-97.
[10] Bhoomika Panda, Debananda Padhi, Kshamamayee Dash, "Use of SVM Classifier & MFCC in Speech Emotion Recognition System", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2, Issue 3, March 2012.

An image retrieval framework for real-time endoscopic image retargeting.

Medical Image Retrieval Using Multi-graph Learning for MCI Diagnostic Assistance.

Medical Image Retrieval: A Multimodal Approach.

Dictionary Pruning with Visual Word Significance for Medical Image Retrieval.

Multiview locally linear embedding for effective medical image retrieval.

Evaluating performance of biomedical image retrieval systems--an overview of the medical image retrieval task at ImageCLEF 2004-2013.

Facilitating medical information search using Google Glass connected to a content-based medical image retrieval system.

FAST: framework for heterogeneous medical image computing and visualization.

Multimodal medical image fusion using improved multi-channel PCNN.

Towards an intelligent framework for multimodal affective data analysis.

An intelligent space for mobile robot localization using a multi-camera system.

Axial multi-image phase retrieval under tilt illumination.

Content-based histopathology image retrieval using CometCloud.

Multi-object segmentation framework using deformable models for medical imaging analysis.

Optimal query-based relevance feedback in medical image retrieval using score fusion-based classification.

Texture-based medical image retrieval in compressed domain using compressive sensing.

Endowing a Content-Based Medical Image Retrieval System with Perceptual Similarity Using Ensemble Strategy.

PDE based scheme for multi-modal medical image watermarking.

Eigenanatomy: sparse dimensionality reduction for multi-modal medical image analysis.

Software suite for image archiving and retrieval.

Efficient Multi-Atlas Registration using an Intermediate Template Image.

Design and development of a content-based medical image retrieval system for spine vertebrae irregularity.

An Intelligent Model for Pairs Trading Using Genetic Algorithms.

A Visual Analytics Approach Using the Exploration of Multidimensional Feature Spaces for Content-Based Medical Image Retrieval.