Chin J Integr Med 2015 May;21(5):323-331


FEATURE ARTICLE

Big Data Is Essential for Further Development of Integrative Medicine

LI Guo-zheng (李国正) and LIU Bao-yan (刘保延)

ABSTRACT To summarize the achievements, opportunities and challenges of big data in integrative medicine (IM), and to explore future work that could break the bottlenecks slowing the development of IM, this paper presents the growing field of big data in IM, describes the systems for data collection and the techniques of data analytics, introduces recent advances, and discusses future work, especially the challenges in this field. Big data is increasing dramatically over time, whether we face it or not. It is evolving into a promising way to gain deep insight into IM, the integration of ancient medicine with modern medicine. Great achievements have been made in data collection and data analysis, and existing results show that it is possible to discover the knowledge and rules behind clinical records. In its transition from experience-based medicine to evidence-based medicine, IM depends on big data technology in this great era.

KEYWORDS big data, analytics, integrative medicine, Chinese medicine, healthcare, methodology

In the era of big data, people are gradually gaining a deeper understanding of the valuable rules that can be extracted from large amounts of data. The greatest impact of big data comes from the sheer quantity of data around us, the abundant laws and principles it contains, and the value of these laws, which can change how we understand and make use of the world. Big data has attracted more and more attention in the medical and health field.(1) A representative case is the use of Google search data and historical information to predict the outbreak and activity grade of influenza. The successful prediction of the H1N1 outbreak in America in 2009(2) and the precise judgment of 'strong' for its activity grade in January 2013 were consistent with subsequent reports from the Centers for Disease Control and Prevention in the USA.(3)

Massive data processing had been developing for a long time before the term 'big data' appeared. Data generation has gone through three stages: a passive stage, in which applications were built on databases and data were entered into the system passively; an active stage, marked by new internet applications such as Web 2.0, in which users actively contribute data; and an automatic stage, in which all kinds of sensing systems are widely used and data are generated automatically. We cannot reverse this trend but can only follow it, and we must ask how to discover the characteristics of big data in the medical and health field, how to improve the curative effect by means of data, and how to promote the positive role of big data in medicine and health care.

The definition of big data is still far from uniform. Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time.(4) Following this definition, we consider the quantity of data to be ideally large enough that we can obtain knowledge of value that was not previously recognized. The soul of big data lies not in the sheer quantity of data but in the impact that quantity has on our thinking, which allows us to solve tasks previously considered impossible, for example, the influenza prediction by Google.(2)

With the rapid development of modern computing technology, the characteristics of medical and health data are embodied in the following points. (1) Diversification of data structure: electronic medical records contain varied data of different modes and sources, such as text, numbers, images, sound and signals. The medical literature also needs to be digitized and structured, and the data structure remains very complicated even after digitization. (2) Fast data generation: electronic medical records, with their huge quantity of data, are updated frequently and continuously in hospitals, especially in general hospitals with tens of thousands of daily visits. (3) Abundant value: medical data contain plenty of high-value information about human health and medical treatment. Such information is precious to a citizen wondering about changes in his body, a doctor hoping for in-depth analysis of diagnosis rules, a government that needs to monitor changes in the disease spectrum, or a drug merchant who wants to know the interactions between different drugs.

© The Chinese Journal of Integrated Traditional and Western Medicine Press and Springer-Verlag Berlin Heidelberg 2015
Supported by the Natural Science Foundation of China (No. 61273305)
Data Center of Chinese Medicine, China Academy of Chinese Medical Science, Beijing (100700), China
Correspondence to: Prof. LI Guo-zheng, Tel: 86-15021800128, E-mail: [email protected]
DOI: 10.1007/s11655-015-2169-3

Big Data Is Essential for Integrative Medicine

The real-world Chinese medicine (CM) clinical research paradigm, proposed in 2013,(5) is human-centered, data-oriented and problem-driven. With the idea of "from the clinic, to the clinic", this paradigm integrates scientific and clinical research, combining medical research with scientific computing. It inherits the basic mode of CM research and fuses concepts, theories and technologies from clinical epidemiology, evidence-based medicine, statistics and information science. The real-world clinical research paradigm may therefore be suitable for integrative medicine (IM).

According to this paradigm, we formulate the diagnosis and treatment procedure of a real-world clinical process as E = dt(S+F), where E is the effect of the treatment, a vector containing many indices of efficacy evaluation; S is the set of symptoms and health function parameters obtained by sensors and doctors; F represents the means of intervention; and dt is the function that models the process of diagnosis and treatment. If we were able to collect all the health function parameters of each person, including the symptoms when he is unwell, together with the means of intervention and the efficacy evaluation indices, for all human beings, we could carry out real-world clinical research and discover all the oracles in IM. Using big data, we believe we can find which subset of symptoms or health function parameters is the key to diagnosing a disease, and which means of intervention is more powerful. We will also learn which index of efficacy evaluation is the definitive one for a given disease.

Since thousands of sensors would be needed to collect the healthy and unhealthy parameters of one human body, and we have not yet obtained enough cases, i.e. data, to build the model, we may focus our attention on the symptoms gathered by the four traditional diagnostic techniques, i.e. observation, listening, interrogation, and palpation (including pulse-taking). We try to collect the symptoms using sensors, not only by the traditional methods. For example, the status of a person's sweat is traditionally assessed through observation and interrogation; now we can collect sweat parameters with sensors, in terms of quality and quantity as well as the time of day. What remains is to study how to collect all the parameters of E, S and F, and how to model the function dt. We believe that big data techniques, including sensors, the internet of things, data mining and computing, are necessary to advance the theory and techniques of IM.
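As a minimal sketch of the formulation E = dt(S+F), the Python below assumes that S and F are numeric vectors, reads "S+F" as their concatenation, and stands in for dt with a nearest-neighbour lookup over recorded cases. The `Case` class, the body of `dt` and the toy values are illustrative assumptions, not the authors' model.

```python
from dataclasses import dataclass
from math import dist
from typing import List


@dataclass
class Case:
    symptoms: List[float]      # S: symptoms / health function parameters
    intervention: List[float]  # F: encoded means of intervention
    effect: List[float]        # E: efficacy evaluation indices


def dt(symptoms, intervention, history):
    """Estimate E for a new (S, F) from the most similar recorded case."""
    query = symptoms + intervention  # "S+F" read as concatenation (assumption)
    nearest = min(
        history,
        key=lambda c: dist(query, c.symptoms + c.intervention),
    )
    return nearest.effect


# Hypothetical recorded cases, for illustration only.
history = [
    Case([1.0, 0.0], [1.0], [0.8, 0.1]),
    Case([0.0, 1.0], [0.0], [0.2, 0.6]),
]
estimated = dt([0.9, 0.1], [1.0], history)
```

With enough real cases, the lookup could be replaced by any regression or classification model; the point is only that dt maps the combined description of patient state and intervention to a vector of efficacy indices.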

Progress of Big Data in IM

Hot Research Directions of Big Data in IM

Analysis of large-scale scientific data can push forward the development of information science. Medical data analysis lies at the intersection of medical science and information science, and it is a challenging task for information science. With the rapid development of electronic medical record systems, clinical integrative medicine has accumulated an adequate amount of data. Data mining based on a large number of effectively treated cases contributes to an in-depth understanding of IM diagnosis theories and is conducive to the further development of CM diagnosis and treatment technology. Moreover, it can propel the modernization and informatization of integrative medicine.

In the past, how much valuable knowledge and how many rules could be obtained from clinical records depended on the talent of the apprentice. As is often the case in clinical diagnosis, rules and knowledge derived merely through personal understanding can hardly be accurate and compelling. Data mining technology is now developing very rapidly, and these new techniques make it far easier to uncover hidden and potentially valuable rules or knowledge, promoting the advancement of IM treatment technology.

IM clinical data mining covers a wide range of topics. It concentrates mainly on the relationships among symptoms, syndromes, diseases, drugs and prescriptions, and, more significantly, on their relation to treatment effect. Work in this area falls into the following aspects: (1) Pervasive medical equipment for symptom collection and the analysis of the resulting data.(5-13) (2) Structured knowledge systems for electronic health record collection(14,15) and semantic networks for the ancient and modern literature.(16,17) (3) Data analysis and knowledge discovery: symptom interaction analysis;(18,19) symptom and syndrome (disease) correlation analysis;(20-24) analysis of the core prescription and of the addition and subtraction of patent medicines;(25) analysis of the relationship between prescription and efficacy.(26,27) (4) Introduction of bioinformatics and biomedical techniques to IM.(26-28)

Pervasive Medical Equipment for Symptoms Collection

With the aid of pervasive computing devices, information can be acquired along multiple dimensions, such as time, location, environment, physiological signals and motion signals. Big data on the human body can be collected by continuously monitoring multi-dimensional signals over a long period. Such health-related information will deliver huge value and has a promising market outlook.

The four diagnostic instruments of CM correspond to the four basic acquisition methods: inspection, auscultation, inquiry and palpation. Inspection focuses on the analysis of the tongue surface, which is regarded as one of the main components of traditional diagnosis. In a tongue diagnosis information identification system, after the patient's tongue pictures are acquired with a digital instrument (digital camera or webcam), intelligent automatic segmentation of the target area must be performed first. Given the structural characteristics of the tongue body, Shi, et al(6,7) proposed a universal method of active contour initialization. They then introduced color space as control information into the active contour model and put forward an automatic tongue image segmentation algorithm, the color control-geometric and gradient flow snake (C2G2FSnake), which improves the speed of curve movement and reduces the complexity of the control parameters, resulting in a significant improvement in segmentation accuracy and practicability.

Facial complexion inspection, covering color and gloss, is one of the typical methods of CM diagnosis. Observing changes in facial gloss helps to assess the essence of the viscera and serves as an important reference for determining disease severity and prognosis. However, traditional interpretation of facial gloss relies mainly on the subjective evaluation of clinicians and lacks objective data to support it, which has become one of the important obstacles to the development of CM. About 3600 facial images of patients were collected, and an accuracy of 84.5% was obtained in the judgment of facial gloss.(8) Based on these images, lip color, facial color and luster were extracted using support vector machines and information fusion techniques;(9,10) the scheme consists of four steps: image preprocessing, image feature extraction, feature selection and classification. The best accuracy of 82% was achieved on a total of 257 lip images.(11)

For inspection techniques, color correction is very important: computer-aided facial image analysis depends on accurate rendering of color information, so poor correction seriously affects it. A novel color correction framework was proposed that uses undistorted facial images to demarcate the complexion gamut; several training sets based on the complexion gamut were compared experimentally to select the optimal training samples. This framework is characterized by task dependence and statistical reliability, and its trained model has low complexity and high accuracy. Experimental results show that it is effective for color correction of facial images.(12)

The MARS500 study was a psychological and physiological isolation experiment conducted by Russia, the European Space Agency and China in preparation for an unspecified future manned spaceflight to Mars. Its intention was to yield valuable psychological and medical data on the effects of the planned long-term deep space mission. Using feature selection on the MARS500 data, Li, et al screened out 10 key factors that are essential to syndrome differentiation in CM; the average precision of the multi-label classification model reached 80%.(13)

The design of new diagnostic collection instruments should reflect the trend toward portability, as in wearable diagnostic instruments. Further studies are considering the miniaturization and portability of diagnostic instruments and the conversion of preliminary scales to electronic and mobile form.
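To make the idea of color correction concrete, the following Python sketch fits one least-squares gain per RGB channel from pairs of captured and known colors, then applies the gains to a new pixel. This is a deliberately simple stand-in and not the complexion-gamut framework of reference 12; the function names and sample colors are hypothetical.

```python
def fit_channel_gains(observed, reference):
    """Least-squares gain per RGB channel so that gain * observed ~ reference."""
    gains = []
    for ch in range(3):
        num = sum(o[ch] * r[ch] for o, r in zip(observed, reference))
        den = sum(o[ch] * o[ch] for o in observed)
        gains.append(num / den)
    return gains


def correct(pixel, gains):
    """Apply the gains to one RGB pixel and clamp each channel to 0..255."""
    return tuple(min(255, max(0, round(p * g))) for p, g in zip(pixel, gains))


# Hypothetical training pairs: colors as captured vs. their known true values.
observed = [(100, 80, 60), (200, 160, 120)]
reference = [(110, 80, 54), (220, 160, 108)]

gains = fit_channel_gains(observed, reference)
corrected = correct((150, 120, 90), gains)
```

A per-channel gain cannot model cross-channel effects; a full 3x3 matrix or polynomial model would be the natural next step, and the choice of training samples matters as the text describes.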

Structured Knowledge System for Electronic Health Record Collection and Semantic Network for the Ancient and Modern Literature

IM data resources include clinical diagnosis and treatment data, monitoring data, biomedical clinical data, ancient CM books and all kinds of modern literature and technology.(14) Clinical data are the first-hand and vital evidence for clinical research in CM and integrative medicine. Plentiful empirical knowledge is embedded in the clinical data of highly experienced CM practitioners, whose treatments have shown remarkable therapeutic effects. Liu, et al.(4) constructed the "digital clinical terms of CM application system", "highly structured clinical research information integrated data collection system", "clinical data processing system", "CM clinical data warehouse system", "CM clinical data multidimensional search query and display system" and "IM clinical data mining system". These systems have been put into use and have collected nearly 200,000 items of medical resources in over 20 hospitals, including the national CM clinical research bases. In addition, hundreds of studies have been carried out based on the diagnosis and treatment information from clinical practice.

A novel system, the customized traditional Chinese medicine system (CCMS), was designed and implemented for the customized preservation and intelligent analysis of CM clinical cases. A customized template with structured symptoms, diseases, syndromes and herbal formulae is constructed according to the clinical characteristics of each CM specialist. With the template, clinical records are collected in a customized way and stored in structured form. Various data analysis and mining methods, grouped as basic analysis, association rules, feature reduction, clustering, pattern classification and pattern prediction, are implemented and improved in the system for clinical data analysis.(15)

A systematic and comprehensive study of the available ancient CM literature has been conducted. An intelligent computer information system for ancient CM literature (KBS) was eventually established, which further facilitated the development of internet-based network services and realized the industrial development of the knowledge base.(16) A CM science and technology literature sharing platform, centered at the China Academy of Chinese Medical Science, was built with web technology; it comprises more than 100 literature databases covering traditional medical science, IM, acupuncture and ancient literature.(17)

Data Analysis and Knowledge Discovery

Symptoms Interaction Analysis

Most existing machine-learning-based IM research does not consider the correlations between symptoms or the medical connotations behind the data. However, a large amount of IM symptom and syndrome data has a clear corresponding medical meaning. It is therefore important to study the interactions between symptoms and syndromes and the IM knowledge behind these associations. At present, traditional methods use only a single numerical value to describe the strength of such relations, which is an oversimplification. Li, et al(18) proposed the relative associated density (RAD) to describe the relationship between a pair of symptoms and obtained a symptom-symptom interaction (SSI) diagram showing the associations between symptoms. The diagram accords with CM theory, in which these symptoms are important and closely related to coronary heart disease.

Machine learning techniques are applied to medical record data to extract the information hidden in clinical experience, so that clinical experience can be summarized and passed on. This is an effective way to promote the informatization of IM. Feature selection can significantly improve the comprehensibility of a classification model and yield a better, more practically meaningful model for predicting unknown samples. Hybrid optimization based multi-label feature selection (HOML) was proposed and showed a great improvement in prediction accuracy: in an analysis of the prediction accuracy for 6 symptoms on the optimal subset selected by HOML with multi-label K nearest neighbor (ML-KNN), accuracy increased by 14% on average across four multi-label learning methods.(19)
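The text does not define RAD, so the sketch below does not attempt to reproduce it. As an assumed stand-in that illustrates pairwise symptom association in general, it computes the lift of each co-occurring symptom pair (observed co-occurrence rate divided by the rate expected under independence). The records and symptom names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_lift(records):
    """Return {(a, b): lift} for every symptom pair that co-occurs."""
    n = len(records)
    single = Counter(s for r in records for s in set(r))
    pair = Counter(
        p for r in records for p in combinations(sorted(set(r)), 2)
    )
    return {
        (a, b): (c / n) / ((single[a] / n) * (single[b] / n))
        for (a, b), c in pair.items()
    }


# Hypothetical records; each is the set of symptoms noted for one patient.
records = [
    {"chest pain", "palpitation"},
    {"chest pain", "palpitation", "fatigue"},
    {"fatigue"},
    {"chest pain", "palpitation"},
]
lift = pairwise_lift(records)
```

A lift above 1 suggests the two symptoms appear together more often than chance would predict. Lift is symmetric, whereas the source reports one-way associations from RAD, so a directed measure would be needed to mirror that property.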

Symptom and Syndrome (Disease) Interaction Analysis

In practical CM data mining tasks, each of numerous clinical cases may present one or more of various syndromes; this task can be regarded as a multi-label classification problem in machine learning. Existing solutions for multi-label classification pay little attention to the problems of unbalanced data and label inconsistency.(20) Many data mining algorithms are applicable only to two-class problems with uniformly distributed labels, rather than to the seriously label-imbalanced data mining tasks of CM. You, et al(21) extended the asymmetric partial least squares based classifier (APLSC) algorithm to imbalanced multi-label classification tasks and applied the new algorithm to CM clinical data mining. Multiple-APLSC (MAPLSC) was evaluated on a liver disease database. The results indicate that MAPLSC performs better on the macro-average index and achieves a good balance across classes.(22)

The existing research on syndrome standardization and objectification mainly addresses the diagnosis of a single syndrome, namely single-label learning, while ignoring the fact that in clinical practice syndromes seldom appear alone but are intertwined. For example, some research has shown that the syndromes of coronary heart disease (CHD) usually combine deficiency and excess, in four main forms: qi deficiency and blood stasis, qi deficiency and phlegm turbidity, yang deficiency and blood stasis, and qi stagnation and blood stasis. Liu, et al(23) proposed introducing multi-label mining algorithms into the study of the CHD syndrome model, to explore the feasibility of multi-label learning for symptom prediction in CM inquiry.

Syndrome differential treatment is one of the important characteristics of IM. One of its key research problems, the "fusion use of four classical diagnostic methods", combines the data or the diagnostic conclusions of inspection, auscultation, inquiry and palpation. How to use this fusion effectively for better treatment is a very important issue in CM data mining. Experiments with ML-KNN on 908 hypertension cases from Guangdong Province Hospital of Chinese Medicine compared model performance using the features of inspection, auscultation, inquiry and pulse-taking separately and using the synthesis of the four diagnostic methods. The fused information proved the best way to implement 'dialectical argumentation', in accord with classical CM theory.(24)

As noted above, the syndromes of CHD seldom appear alone and usually combine deficiency and excess in the four forms listed. Research on the correlation of these syndromes by means of data mining is of remarkable practical significance. On the one hand, it can verify existing knowledge in CM theory about the relationships between syndromes; on the other hand, it may automatically uncover syndrome relationships hidden in the data. Syndrome analysis was conducted on CHD syndrome datasets, and RAD was used for further analysis to obtain the one-way associations between syndromes.(18)
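The macro-average index mentioned for MAPLSC matters because it weights every label equally, so a rare syndrome that is never predicted drags the score down. The Python sketch below contrasts macro- and micro-averaged F1 on a small multi-label example; the data are hypothetical and the code is a generic illustration, not the evaluation code of the cited studies.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined here as 0.0 when all counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred, labels):
    """y_true / y_pred hold one set of labels per case. Returns (macro, micro)."""
    counts = []
    for lab in labels:
        tp = sum(lab in t and lab in p for t, p in zip(y_true, y_pred))
        fp = sum(lab not in t and lab in p for t, p in zip(y_true, y_pred))
        fn = sum(lab in t and lab not in p for t, p in zip(y_true, y_pred))
        counts.append((tp, fp, fn))
    macro = sum(f1(*c) for c in counts) / len(labels)
    micro = f1(*(sum(c[i] for c in counts) for i in range(3)))
    return macro, micro


# Hypothetical imbalanced data: the second syndrome is rare and never predicted.
labels = ["blood stasis", "qi stagnation"]
y_true = [{"blood stasis"}] * 8 + [{"blood stasis", "qi stagnation"}] * 2
y_pred = [{"blood stasis"}] * 10

macro, micro = macro_micro_f1(y_true, y_pred, labels)
```

Here the micro average stays high because the frequent label dominates the pooled counts, while the macro average exposes the complete failure on the rare label.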

Core Prescription and Herbs Addition & Subtraction

For the core prescription for fatty liver disease (FLD), 6 herbs are recorded in the monographs of the veteran CM doctor ZHANG Yun-peng: Salvia sinica Migo, Cassia tora Linn., Curcuma aromatica Salisb., Alisma plantago-aquatica Linn., Lemna trisulca Linn. and Tropaeolum majus Linn.(25) Through data mining of his clinical data from 2008–2010, Xu, et al(26) found 8 herbs: Salvia sinica Migo, Cassia tora Linn., Curcuma aromatica Salisb., Alisma plantago-aquatica Linn., Raphanus sativus Linn., Forsythia suspensa (Thunb.) Vahl, Lemna trisulca Linn. and Tropaeolum majus Linn. These include all 6 herbs summarized in the monograph; only Raphanus sativus Linn. and Forsythia suspensa (Thunb.) Vahl are not recorded there.
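The comparison above can be sketched in Python. The herb sets are taken from the text; the `core_herbs` function, its frequency threshold and the three toy prescriptions are assumptions for illustration, since the mining method of Xu, et al is not described here.

```python
from collections import Counter


def core_herbs(prescriptions, min_support):
    """Herbs present in at least `min_support` (a fraction) of prescriptions."""
    counts = Counter(h for p in prescriptions for h in set(p))
    total = len(prescriptions)
    return {h for h, c in counts.items() if c / total >= min_support}


# Herb sets as reported in the text (author abbreviations omitted).
monograph = {
    "Salvia sinica", "Cassia tora", "Curcuma aromatica",
    "Alisma plantago-aquatica", "Lemna trisulca", "Tropaeolum majus",
}
mined = monograph | {"Raphanus sativus", "Forsythia suspensa"}
not_in_monograph = mined - monograph

# Hypothetical prescriptions, only to exercise the frequency threshold.
toy_prescriptions = [
    ["Salvia sinica", "Cassia tora"],
    ["Salvia sinica", "Raphanus sativus"],
    ["Salvia sinica", "Cassia tora", "Forsythia suspensa"],
]
frequent = core_herbs(toy_prescriptions, min_support=0.6)
```

The set difference recovers the two herbs that data mining found but the monograph did not record; a frequency threshold is one simple way a candidate core prescription could be proposed from clinical records.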

Analysis of Prescription and Efficacy Relation

CM has proved itself through clinical efficacy. The effectiveness of different prescriptions is a hot topic in CM research and lays the groundwork for the progress of CM. Lu, et al(27) used metabolomics technology to study the curative effect of the medicine etretinate by comparing healthy subjects with psoriasis patients over their treatment cycles, and further found that metabolomics can be applied well to analyzing the effect of CM.

Introduction of Bioinformatics and Biomedical Techniques to IM

IM bioinformatics currently focuses on the analysis, classification and reorganization of the literature, with its massive underlying information, in the pharmacy, chemical, pharmaceutical and biomedical fields. It aims to extract valuable information and derive logical and rational knowledge from the comprehensive analysis of the original plants, chemical constituents, pharmacological actions and medicinal properties of herbs, prescription compatibility, indications and efficacy in IM experimental theories, as well as from the integration of numerous scattered biological experiments and chromatographic and spectroscopic data.

To gain insight into the molecular mechanisms of symptoms, Li, et al(28) developed a computational approach to identify candidate genes for symptoms: a network-based approach that integrates multiple phenotype-genotype data sources and predicts and prioritizes the genes associated with symptoms. Their method produces a reliable gene rank list, with an area under the curve (AUC) of 0.616 in classification, and discovers some novel genes that are not recorded in the benchmark data sets but have been reported in recent publications. For cancer diagnosis, it is important to identify gene biomarkers. Zeng and Li(29) proposed a supervised redundant feature detection method that removes redundant genes and selects the optimal gene biomarkers, including strongly relevant genes and weakly relevant but non-redundant genes. Experimental results on benchmark datasets show that the proposed supervised framework performs better than previous work.
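For readers unfamiliar with the AUC figure quoted above, the Python sketch below computes it from a ranked list as the probability that a randomly chosen positive item scores higher than a randomly chosen negative one, with ties counted as half. The scores and labels are hypothetical and unrelated to the data of reference 28.

```python
def auc(scores, labels):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical gene scores; label 1 marks a gene known to be associated.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
value = auc(scores, labels)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which puts a value such as 0.616 in context.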

Related Topics

Smart Medical System

Establishing an intelligent medical system for big data requires interdisciplinary, composite technology covering four steps: data acquisition, data storage, data transmission and data analysis.

The development of internet of things (IOT) technology should be promoted to further improve the standardization, correctness and universality of data collection, storage and transmission. At present, acquiring clinical data in hospitals is costly and requires dedicated manual operation. With the IOT, patient information and medical information can be collected automatically in hospitals, improving the completeness, convenience and accuracy of information acquisition at lower expense. IOT technology plays a vital part not only in information collection but also in intervention: it can remind patients to take their medication at times suited to the individual, for a better drug effect. The development of computer technology and equipment for integrative medicine can make symptom acquisition more objective and symptom information quantifiable, laying the foundation for subsequent consultation and data mining analysis. In settings such as aerospace and navigation, computerized diagnostic data collection enables remote and automatic collection in support of remote medical treatment. Deployed in families and communities, it can improve the efficiency of symptom collection from the public and make better use of large hospitals and outstanding experts.

The demands of IM data analysis should be identified in order to propel the development of intelligent information processing technology and refine the performance of IM information processing. Intelligent information processing technologies such as data preprocessing, data mining, pattern analysis, classification and clustering should be researched and developed with the multi-dimensional, nonlinear and dynamic characteristics of CM in mind, to explore the knowledge and rules that lie behind large amounts of clinical data and are not apparent to human observation. In addition, as experts' experience advances, their reasoning can be tracked automatically in real time by computers and summarized as the latest thinking on syndrome differentiation and prescription experience, which improves the efficiency of training novices. Only by exploiting intelligent information processing technology can we extract valuable information to establish an individualized CM differential treatment system and, in combination with the internet of things, accomplish real-time acquisition of IOT information and individualized medicine.

IM Computing

The pressing demand for big data research based on data science, information science, intelligence science, complexity science and computing science has become apparent. Solving problems such as data acquisition, analysis, management and validation requires combining these disciplines with the characteristics of IM. Rather than applying these techniques simply and separately, an interdisciplinary subject is required, for two reasons: first, to gain a clear understanding of the data characteristics of IM theory and its technical problems; second, to carry out deep research in data, information, intelligence and complexity science and technology. Applicable theories and technologies should be developed according to the requirements of data analysis, so that research into IM big data can be carried through. IM computing caters to these requirements. It can be viewed from two perspectives: one is the application of complexity science, intelligence science, data science and information science to clinical research; the other is the embedding of IM thinking, theory and knowledge, or more specific techniques, into the related computing disciplines, which in return improves the efficiency and quality of clinical research. The former is limited to the technological level but has been developing for some time, while the latter is nascent and more deeply interdisciplinary.

From the framework of "disease-symptom-syndrome-prescription-efficacy", both the objective collection and the structured entry of symptoms in IM require scientific computing as the basis for further data mining analysis. For inspection, image processing and pattern recognition are needed to acquire information such as tongue color and facial texture, color and luster; for auscultation and smelling, signal processing, voice analysis and pattern recognition are used to obtain information such as voice, cough, breath and body odor; the optimization of inquiry scales draws on machine learning techniques; and pulse-taking needs vibration signal analysis and pattern recognition to obtain pulse rate, pulse rhythm, pulse force, pulse pattern and even pulse type. Based on the various clinical symptoms, feature selection and classification modeling can optimize the symptoms to obtain the optimal subset and model the mapping from symptoms to syndromes. Complex networks and association rules can be applied to mining the core prescription and its additions and subtractions, and drug effects can be used to discover significant interactions in CM patient prescriptions. These studies concern the application of intelligent data processing technology to the analysis of IM data. As we can see, IM computing is the specialization of scientific computing within the real-world clinical research paradigm of IM.(30)
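As an illustration of how association rules could be applied to prescription mining, the Python sketch below computes the support and confidence of a single rule of the form "prescriptions containing herb A also contain herb B". The herb names and prescriptions are placeholders, and a real study would search over many candidate rules rather than evaluate one.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n_a = sum(a <= t for t in transactions)           # contain the antecedent
    n_both = sum((a | c) <= t for t in transactions)  # contain both sides
    support = n_both / len(transactions)
    confidence = n_both / n_a if n_a else 0.0
    return support, confidence


# Hypothetical prescriptions, each given as a set of herbs.
prescriptions = [
    {"herb A", "herb B", "herb C"},
    {"herb A", "herb B"},
    {"herb A", "herb D"},
    {"herb B", "herb C"},
]
support, confidence = rule_metrics(prescriptions, {"herb A"}, {"herb B"})
```

Support measures how common the combination is overall, while confidence measures how reliably the second herb accompanies the first; rules with high values of both are candidates for a core combination, and lower-support rules may point to additions or subtractions.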

International Workshop on Information Technology for Healthy Big Data and IM

To provide a platform for researchers in CM big data, we organized the International Workshop on Information Technology for Chinese Medicine (ICM) series in conjunction with the Institute of Electrical and Electronics Engineers (IEEE) International Conference on Bioinformatics and Biomedicine (IEEE-BIBM), which was held in Hong Kong (China), Atlanta (USA), Philadelphia (USA), Shanghai (China) and Belfast (UK) from 2010 to 2014. During the last decade, IM has met a great opportunity with the development of China, where information technology has become a critical technical support and has produced great achievements in designing information systems to mine the ancient and present literature and clinical data, and in making traditional diagnostics and formulae more objective. Under these circumstances, ICM aims to provide a common platform that bridges these important interdisciplinary areas in an interactive forum, bringing together top researchers, practitioners and students from various countries to promote scientific understanding and findings in information technologies and CM. To make big data the main topic of ICM, we will change the workshop title to International Workshop on Information Technology for Healthy Big Data and Integrative Medicine (ITHIM) from 2015.

Potential special research topics for big data in IM include: new standards for IM data; data acquisition, integration, cleaning, and best practices; data and information quality for big IM data; HCI challenges for big IM data security and privacy; data protection, integrity and privacy standards and policies; privacy preserving big IM data collection/analytics; new computational models for big IM data; information integration and heterogeneous and multi-structured data integration; visualization analytics for big IM data; semantic-based IM data mining and data pre-processing; multimedia and multi-structured data (big variety data); interfaces to database systems and analytics software systems; workflow optimization like clinical pathways; big IM data open platforms.

Conclusions

Big data is growing dramatically over time, whether we face it or not. Big data in IM is evolving into a promising field for gaining deep insight into the ancient medicine. We have made great achievements in data collection and data analysis, and existing results show that it is possible to discover the knowledge and rules behind clinical records. In moving from experience-based medicine to evidence-based medicine, IM depends on big data technology in this great era.

Acknowledgements

Thanks to Prof. ZHANG Qi-ming for his discussion and Miss ZUO Xue-wen for her translation and editing.

Conflict of Interests

The authors declare that they have no competing interests.

REFERENCES

1. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inform Sci System 2014;2:3.
2. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Brilliant L. Detecting influenza epidemics using search engine query data. Nature 2009;457:1012-1014.
3. Imai M, Watanabe T, Hatta M, Das SC, Ozawa M, Shinya K, et al. Experimental adaptation of an influenza H5 HA confers respiratory droplet transmission to a reassortant H5 HA/H1N1 virus in ferrets. Nature 2012;486:420-428.
4. Snijders C, Matzat U, Reips UD. Big data: big gaps of knowledge in the field of internet. Internat J Intern Sci 2012;7:1-5.
5. Liu BY. The real world of Chinese medicine clinical research paradigm. J Tradit Chin Med (Chin) 2013;54:451-455.
6. Shi M, Li G, Li F. C2G2FSnake: automatic tongue image segmentation utilizing prior knowledge. Sci China Inform Sci 2013;56:092114.
7. Shi MJ, Li GZ, Li F, Xu C. Computerized tongue image segmentation via the double geo-vector flow. Chin Med 2014;9:7.
8. Zhou R, Li F, Wang YQ, Zheng XY, Zhao RW, Li GZ. Application of PCA and LDA methods on gloss recognition research in CM complexion inspection. IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW). Hong Kong, China 2010:666-669.
9. Zhao CB, Li GZ, Li FF, Liu C. Qualitative and quantitative analysis for facial complexion in traditional Chinese medicine. BioMed Res Internat J 2014; Article ID 207589. doi:10.1155/2014/207589.
10. Li FF, Li GZ, Zhou R, Zhao RW, Wang YQ, Zheng XY. The Chinese medicine clinic luster identification based on LDA, PLS, the world of science and technology. Modern Tradit Chin Med (Chin) 2011;13:977-981.
11. Li F, Zhao C, Xia Z, Wang Y, Zhou X, Li GZ. Computer-assisted lip diagnosis on traditional Chinese medicine using multi-class support vector machines. BMC Complement Alternat Med 2012;12:127.
12. Niu JL, Zhao CB, Li GZ. A novel color correction framework for facial images. In: Proceedings of The 2014 International Conference on Medical Biometrics (ICMB'14) Shenzhen, China; 2014:47-54.
13. Li Y, Li GZ, Gao JY, Zhang Z, Fan Q, Xu J, et al. Syndrome differentiation analysis on MARS500 data of traditional Chinese medicine. Sci World J (in press).
14. Liu BY. Utilizing big data to build personalized technology and system of diagnosis and treatment in traditional Chinese medicine. Front Med 2014;8:272-278.
15. Li GZ, You M, Xu L, Huang S. Personalized experience sharing of Cai's CM gynecology. In: Athina L, Andriani D, eds. Quality assurance in healthcare service delivery, nursing and personalized medicine: technologies and processes. IGI-global: Med Info Sci Reference 2011:26-41.
16. Liu CH. Contemplation on the construction of knowledge base system of traditional Chinese medicine literature digitization project of CM ancient literature. Proceedings of Conference on Chinese CM Information Research, Second Session of Congress and Academic Exchange. 2003:1-5.
17. Yu Q, Cui M. The characteristics of the study of CM information. J Basic Med Tradit Chin Med (Chin) 2012;18:1137-1139.
18. Li GZ, Sun S, You M, Wang YL, Liu GP. Inquiry diagnosis of coronary heart disease in Chinese medicine based on symptom-syndrome interactions. Chin Med 2012;7:article 9.
19. Shao H, Li G, Liu G, Wang Y. Symptom selection for multi-label data of inquiry diagnosis in traditional Chinese medicine. Sci China Inform Sci 2013;56:13.
20. You M, Li GZ. Medical diagnosis by using machine learning techniques. In: Josiah P, Simon P, eds. Data analytics for traditional Chinese medicine research. Springer; 2014:39-80.
21. Qu HN, Li GZ, Xu WS. An asymmetric classifier based on partial least squares. Pattern Recognit 2010;43:3448-3457.
22. You M, Zhao RW, Li GZ, Hu X. MAPLSC: a novel multi-class classifier for medical diagnosis. Internat J Data Mining Bioinform 2011;5:383-401.
23. Liu GP, Li GZ, Wang YL, Wang YQ. Modelling of inquiry diagnosis for coronary heart disease in traditional Chinese medicine by using multi-label learning. BMC Complement Alternat Med 2010;10:article 37.
24. Li GZ, Yan SX, You M, Sun S, Ou A. Intelligent ZHENG classification of medicine hypertension depending on ML-kNN and information fusion. Evidence-Based Complement Alternat Med 2012;2012:837245. doi: 10.1155/2012/837245. Epub 2012 Jun 3.
25. Shanghai Institute of Chinese Medical Literature, ed. Academic experience on liver disease of Zhang Yun-peng. Shanghai: Shanghai Jiaotong University Press; 2008:10.
26. Huang SY, Fang SC, Liu H, Zhang R, Wang CY, Bi L, et al. The technology to carry out the old doctor of traditional Chinese medicine academic experience inheritance of global design examples of the application of data mining. J Shanghai Tradit Chin Med (Chin) 2011;45:1-3.
27. Lu C, Deng J, Li L, Li GZ. Application of metabolomics on diagnosis and treatment of patients with psoriasis in traditional Chinese medicine. Sys Biol Clin Impl 2014;1844:280-288.
28. Li X, Zhou X, Peng Y, Liu B, Zhang R, Hu J, et al. Network based integrated analysis of phenotype-genotype data for prioritization of candidate symptom genes. Biomed Res Int 2014;2014:435853.
29. Zeng XQ, Li GZ. Supervised redundant feature detection for tumor classification. BMC Med Genom 2014;7(S2):S5.
30. Li GZ, Zuo X, Liu B. Scientific computation of big data in real-world clinical research. Frontiers Med 2014;8:310-315.

(Received February 5, 2015) Edited by GUO Yan
