J Med Syst (2014) 38:37 DOI 10.1007/s10916-014-0037-x

SYSTEMS-LEVEL QUALITY IMPROVEMENT

MIRASS: Medical Informatics Research Activity Support System Using Information Mashup Network

M. L. M. Kiah & B. B. Zaidan & A. A. Zaidan & Mohamed Nabi & Rabiu Ibraheem

Received: 14 November 2013 / Accepted: 13 March 2014 / Published online: 4 April 2014
© Springer Science+Business Media New York 2014

Abstract The advancement of information technology has facilitated the automation and feasibility of online information sharing. The second generation of the World Wide Web (Web 2.0) enables the collaboration and sharing of online information through Web-serving applications. Data mashup, which is considered a Web 2.0 platform, plays an important role in information and communication technology applications. However, few ideas have been transformed into education and research domains, particularly in medical informatics. The creation of a friendly environment for medical informatics research requires the removal of certain obstacles in terms of search time, resource credibility, and search result accuracy. This paper considers three glitches that researchers encounter in medical informatics research; these glitches include the quality of papers obtained from scientific search engines (particularly, Web of Science and Science Direct), the quality of articles from the indices of these search engines, and the customizability and flexibility of these search engines. A customizable search engine for trusted resources of medical informatics was developed and implemented through data mashup. Results show that the proposed search engine improves the usability of scientific search engines for medical informatics. Pipe search engine was found to be more efficient than other engines.

This article is part of the Topical Collection on Systems-Level Quality Improvement

M. L. M. Kiah : M. Nabi
Faculty of Computer Science and Information Technology, University Malaya, 50603 Kuala Lumpur, Malaysia

B. B. Zaidan : A. A. Zaidan (*)
Faculty of Engineering, Multimedia University, Jalan Multimedia, Cyberjaya 63100, Selangor Darul Ehsan, Malaysia
e-mail: [email protected]

R. Ibraheem
Department of Computer and Information Sciences, University Teknologi PETRONAS, Tronoh 31750, Perak, Malaysia

Keywords Data mashup . Social networks . Medical informatics . Knowledge sharing . Search engines . ISI journals

Introduction

Web 2.0 is difficult to define; Giustini maintains that “what seems clear is that Web 2.0 brings people together in a more dynamic and interactive space” [1]. Hansen stated that “Web 2.0 is a term which refers to improved communication and collaboration between people via social-networking technologies” [1]. The increased popularity of Web 2.0 has introduced new technologies, such as Health 2.0 or Medicine 2.0 [2]. In a study of junior physicians who use Web 2.0 in their clinical practice, 70 to 80 % of the physicians use Google and Wikipedia regardless of the information credibility of the Web 2.0 content [3]. However, the use of Web 2.0 generates a number of problems [4]. For instance, a consumer's search for health information may yield unnecessary and unreliable information from extraneous and useless sites. Moreover, without the help of health professionals, a consumer might “get lost” and obtain wrong or misleading information [3, 5]. A number of Web applications have disappeared quickly while taking user data with them, thus creating a problem of availability of Web applications. Health 2.0 is used whenever the technologies related to Web 2.0 are involved in health care [6]. Therefore, computer science researchers collaborate with biomedical researchers in the new era of health care [7]. Health 2.0 or Medicine 2.0 does not have a specific definition; however, Eysenbach has stated that Health 2.0 refers to “services or applications for health care consumers, patients, professionals, and biomedical researchers who use Web 2.0 technologies.” Health 2.0 relies on Web 2.0 technologies and thus “inherits” several Web 2.0 problems, such as problems pertaining to the reliability of


Web-based health information [8]. Moreover, certain users who use Web-based technologies for self-care encounter problems concerning the user-friendliness of the self-care applications, the quality of care provided by the applications, and the implementation of the applications in clinical practice [9]. Health 2.0 must overcome the hierarchical and closed structures of current health systems and develop into a better health system that insists on collaboration, openness/transparency, and participation.

Mashup is a Web 2.0 application that combines data from two or more sources to form a new service by using Application Programming Interfaces (APIs), Web feeds, and Web services, among others. Mashup is generally used in business, marketing, and advertising. However, mashup has recently been introduced into education as a further application area. Using more sophisticated tools to achieve data mashup may reduce the complexity involved in extracting the required information in any medical informatics environment [10, 11]. Mashup can also be developed to solve library science application problems; such an application has been deployed in the university information resource center of University Teknologi Petronas.

A rapid way to look for information on the Web is to use a search engine such as Google. The results, however, are a list of suggested HTML pages devoid of context and semantics that require human interpretation. New terminology has emerged to accommodate this development, and Web of Things is one such term. Web of Things is about reusing Web standards to connect the quickly expanding ecosystem of everyday smart objects. Well-accepted and understood standards and blueprints (such as URI, HTTP, REST, and Atom) are used to access the functionality of the smart objects. An example of Web of Things is Web mashup.
Web mashups are Web applications generated by combining content, presentation, or application functionality from disparate Web sources. They aim to combine these sources to create useful new applications or services. Content and presentation elements typically come in the form of RSS or Atom feeds, various XML formats, or as HTML, Shockwave Flash (SWF), or other graphical elements [12]. Mashups are applications that reuse and combine data and services available on the Web. They are developed in a rapid, ad-hoc manner to automate processes and remix information. This enables users to explore information in new ways and can save valuable time that would otherwise be lost in laborious routine tasks [13]. Below are some of the characteristics behind the considerable attention given to mashup applications:

1- Mashup is a Web 2.0 application: it allows interaction among users and between users and the Web in different manners
2- Mashup is a situational application: the term situational application refers to an application that is created for a narrow group of users with unique needs [14]
3- Mashup reduces time cost: this is achieved by aggregating multiple sources into one application or one source
4- Mashup is end-user programming: in many cases, all of the data and services needed to accomplish a goal already exist but are not in a form amenable to an end user [15], and mashup editors make this feature visible
5- Mashup resolves the problem of interoperability and integration of different sources [16]
6- Mashup visualization: it can represent the data in a visual manner, such as a geo-map

Research objectives

This paper is designed to achieve the following objectives:

1. To examine the feasibility of optimizing the results of recent scientific search engines
2. To evaluate the quality of articles and so-called “trusted resources”
3. To ascertain the usefulness and usability of customizable scientific search engines
4. To implement and develop a customizable search engine for trusted resources of Medical Informatics based on data mashup
5. To provide a step-by-step guide for data mashup implementation toward further enhancement

Quality of search results

The review of relevant literature is always performed at an early stage of any scientific research. According to a survey conducted by [17], 83 % of the respondents use Google Scholar and 13 % do not but are inclined to try. The survey also noted that Google Scholar is favored for its user-friendliness and speed. Table 1 shows that respondents who prefer Web of Science (WOS) are more confident about the quality of the results than those who prefer Google Scholar.

The search was divided into direct input keyword search and customized search. In the direct input keyword search, entering the keyword “Electronic Medical Records Security” generated 298 articles from WOS as shown in Fig. 1a, 198,000 articles from Google Scholar as shown in Fig. 1b, and 9,307 articles from Science Direct as shown in Fig. 1c. Entering the keyword “Electronic medical record security” (no “s” in the word “records”) yielded 298 articles from WOS, 318,000 articles from Google Scholar, and 9,307 articles from Science Direct. When the keyword “EMR Security” was used, the results were 36 articles from WOS, 13,600 articles from Google Scholar, and 780 articles from Science Direct. When the keyword “Electronic Health Records Security” was used, the results were 376 articles from WOS, 260,000 articles from Google Scholar, and 10,781 articles from Science Direct.

In the second type of search, various customizations were applied to the search criteria. In the search by year 2012, when the keyword “Electronic medical records security” was entered, the results were 37 articles from WOS, 15,500 from Google Scholar, and 9,756 from Science Direct. When the “type of article” search was performed, the options available in WOS included articles, reviews, editorials, news, and proceedings papers; no options were available in Google Scholar; and journal, book, and reference work options were available in Science Direct. WOS had other search options, such as authors, group of authors, editors, source titles, book series titles, and conference titles. No other search options were available in Google Scholar. The options in Science Direct included journal/book title and topic. WOS requires subscription for the availability of full-text articles. A fair number of articles are available in Google Scholar, whereas all articles are available in Science Direct. With regard to availability to users, WOS and Science Direct require subscription, whereas Google Scholar offers free access.

In terms of overall performance, WOS yielded good search results, fair customization, and credible (high-quality) papers. However, the availability of a paper in WOS is subject to subscription, that is, a full-text article is not available without subscription. Google Scholar had low-quality results and allowed no customization. However, Google Scholar offered a fair number of full-text journals that were freely available. Finally, Science Direct yielded good search results, fair customization, and reliable (high-quality) papers. However, full-text articles in Science Direct are only available via subscription, and fewer search options are available in Science Direct than in WOS.

Table 1 Comparative findings of the three search engines

| Type of search | The keyword | Results in WOS | Results in Google Scholar | Results in Sciencedirect |
|---|---|---|---|---|
| Direct input keyword search | Electronic medical records security | 298 | 198,000 | 9,313 |
| | Electronic medical record security | 298 | 318,000 | 9,313 |
| | EMR security | 36 | 13,600 | 780 |
| | Electronic health records security | 376 | 260,000 | 10,781 |
| Customized search: by year 2012 | Electronic medical records security | 37 | 15,500 | 9,756 |
| Customized search: by type of article | Options available | Articles; Review; Editorial; News; Proceeding papers | Null | Journal; Book; Reference work |
| Customized search: other options | Options available | Authors; Group of authors; Editors; Source titles; Book series titles; Conference titles | Null | Journal/book title; Topic |
| Availability of full text articles | | Need for subscription to the journal | Fair number of articles are available | Available |
| Availability to users | | Need for subscription to the journals and WOS search engine | Free access | Need for subscription to Sciencedirect |
| Overall | | Better search result; Fair customization; Reliable (high quality) papers; Availability subject to subscription; Full text not available; More search options | Low quality search result; No customization; Questionable (low) quality papers; Available freely; Fair full text is available | Better search result; Fair of customization; High quality papers; Availability subject to subscription; Full text available; Search options less than WOS |

Fig. 1 Results obtained from searching for the keyword “Electronic medical records security” in a Web of Science, b Google Scholar and c Science Direct

Quality of articles

The only advantage of Google Scholar over ISI and Scopus is that Google Scholar is free. According to [17], in a special session on the occasion of 50 years of citation indexing, Google Scholar displays many confusing results. In addition, the Google Scholar system does not review articles; that is, Google Scholar considers any document file that has a bold title followed by a list of names and references at the end as an article. Availability and accessibility do not exonerate Google Scholar from being an “untrusted resource” in scientific research. ISI-WOS and Scopus require subscription, making them less accessible, whereas Google Scholar is free. In conclusion, ISI-WOS follows the highest standards in maintaining the quality of their journals. The following issues have been reported in the literature:

1. Ease of use (the easier the engine is to use, the more frequently it is used)
2. Search speed
3. Quality of resources (Do you feel confident in your research resources?)
4. Customizability of the search engine (Can you optimize the search engine results? Can you search for a specific journal?)

Introduction of data mashup in medical informatics

Web 2.0 is considered capable of bringing “people together in a more dynamic and interactive space.” Therefore, we believe that Web 2.0 could help overcome the problems described above [1, 6, 9]. In the next section, data mashup is introduced as the key solution to the described problems.


Need for customizing scientific search engines

Customization and personalization are the (future) keys to user computing, which is related to the user interface (UI). The success of an application depends on its usability, which relies on the UI design. Therefore, the customization of a trusted resources application must ensure its usability. This paper investigates a number of search engines and presents the results. The results show that mashup is the solution to the current customization problem.

Data mashup

Data mashup allows users to compile new information for purposes that the data were not originally meant for. This capability shows that mashup development is a systematic way of using and reusing the data available on the Internet. Mashup is rapidly being adopted in an ad-hoc fashion.

Early work on mashup search engine

A few mashup search engines in the area of medical informatics and bioinformatics have been described in the literature. One of the earlier works is [18], which implemented Bio2RDF, a system built from rdfizer programs written in JSP. Via Bio2RDF, documents from public bioinformatics databases such as KEGG, PDB, MGI, HGNC, and several of NCBI’s databases can be made available in RDF format. The Bio2RDF project has successfully applied semantic web technology to publicly available databases by creating a knowledge space of RDF documents linked together with normalized URIs and sharing a common ontology. Bio2RDF is based on a three-step approach to building mashups of bioinformatics data. Other applications of mashup in bioinformatics address gene search; however, none of these articles has reported any integration of selected resources (trusted resources) that combines medical informatics journals from different indexing databases and makes them searchable.
In this research, a configurable and programmable search engine helps researchers to identify resources and to filter, process, and search within the selected resources in a more dynamic way. This search engine can further accommodate additional requirements (identified by the user), such as adding more resources, implementing alerts, embedding results into one pool, and identifying other processes. Few studies, particularly in higher education, have investigated mashup. Nevertheless, some authors have defined the concept of mashup in higher education [19]. We understand that most of the authors who refer to mashup in higher education are referring to mashup for social networking, mostly for university Web sites and for connecting those sites with other services, such as YouTube, Flickr, Picasa, Facebook, and other social platforms. The definition of mashup in [19] elaborates on how mashup could be used in each area of higher education.

Deployment of data mashup-based search engine for medical informatics

This section consists of two parts. The first part covers the rationale of the study and the methods used in evaluating resources. The second part covers the technical aspects, including a few sections on search engine deployment.

Resource selection

WOS or ISI journals are evaluated in a conscientious and comprehensive manner. In particular, ISI conducts an extensive evaluation process, which generally takes a year or longer, before accepting a journal as credible. ISI evaluates the general quality of articles (well-written abstracts, accurate reference lists, and dependable data/figures). The integrity of the editorial board and the credentials of the authors who publish in the journal comprise other evaluation criteria. The caliber of the editorial board members and authors is judged mainly on the basis of the number of articles that the members and authors have already published and how well cited these articles are. The timeliness of publication is another major issue in the evaluation. If a journal is not published on time, this journal would most likely be rejected and the author/s would have to wait for 2 years before being eligible for reevaluation [20]. The evaluation of Scopus journals is based on nearly the same criteria; however, in contrast to ISI, a long list of arguments follows in Scopus. ISI-WOS has been used in different university ranking systems, and it is therefore regarded here as the most accurate resource. The performance (in terms of quality, inclusion, monitoring, indexing, analysis, and search engine) of both indexing systems is not in question. In Google Scholar, publishers are required to design their systems to fulfill certain technical criteria to have their journals indexed [21]. A tip for having an article indexed by Google Scholar is to set up the journal with “Open Journal System,” D-space, or E-print prior to submission.

Articles can be indexed manually in Google Scholar by using the direct input page. As indexing in Google Scholar is a matter of system configuration, a programmer can design a Web site with metadata to be indexed in Google Scholar. In this paper, 23 highly ranked journals were selected to feed the pipeline of the search engine, as shown in Table 2. Any search placed using this pipeline obtains its results from these journals. The selection of these journals was based on the area they cover, mainly medical informatics, and on their impact factor.
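The journal selection described above can be pictured as a small registry of trusted sources that feeds the pipeline. The following is a minimal sketch in Python rather than the Yahoo Pipes configuration used in the study; the `Journal` type, the `select_sources` helper, and the impact-factor threshold are hypothetical, whereas the three entries reuse titles, impact factors, and URLs listed in Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Journal:
    title: str            # abbreviated journal title as listed in Table 2
    impact_factor: float  # 2011 impact factor as listed in Table 2
    url: str              # journal home page as listed in Table 2


# Three of the 23 journals, copied from Table 2 for illustration.
TRUSTED_JOURNALS = [
    Journal("J AM MED INFORM ASSN", 3.609, "http://jamia.bmj.com"),
    Journal("J MED INTERNET RES", 4.409, "http://www.jmir.org/"),
    Journal("STAT METHODS MED RES", 2.443, "http://smm.sagepub.com/"),
]


def select_sources(journals, min_impact_factor=0.0):
    """Return journals at or above the threshold, highest impact factor first.

    The threshold is a hypothetical knob; the paper does not state a cut-off.
    """
    kept = [j for j in journals if j.impact_factor >= min_impact_factor]
    return sorted(kept, key=lambda j: j.impact_factor, reverse=True)
```

Keeping the sources in one list of this kind means that extending the coverage amounts to appending another entry, which is the flexibility the study asks of the search engine.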


Search engine deployment

The second part consists of several sections and steps for mashup development, which includes several activities. These activities include creating the piped result, building the application, and generating content to be used as a guide. Videos of the pipe development were recorded to ensure the usability of the created application and to support future enhancements. A few applications for creating mashup exist, including Yahoo Pipes, Sarena, IBM Mashup Center, and Microsoft Popfly. Ibrahim and Oxley [23] used Yahoo Pipes in their study. Yahoo Pipes is also recommended for mashup development in a higher education environment. The study by [19] focused on mashup in higher education and library science. These studies highlighted a few areas of potential for mashup application in higher education. The authors suggested the following five areas with mashup potential:

• Teaching and learning
• Research activity
• Library
• Administration
• Security

Considering these areas, we focused our study on research activity. The current study adopted the methodology used by [19] named “The proposed development methodology for higher education and library.” This development methodology consists of all the components necessary for mashup development in higher education circles. Figure 2 shows a step-by-step guide of this methodology. After conducting studies on mashup development with the use of pipes, we decided to utilize the methods created by [19] for mashup development in higher education, specifically for the library, which is one of the focus areas of their research, as shown in Fig. 3.

Table 2 Selected journals based on the citation report of 2011 [22]

| Rank | Abbreviated journal title | IF | Journal URL |
|---|---|---|---|
| 1 | J MED INTERNET RES | 4.409 | http://www.jmir.org/ |
| 2 | J AM MED INFORM ASSN | 3.609 | http://jamia.bmj.com |
| 3 | STAT METHODS MED RES | 2.443 | http://smm.sagepub.com/ |
| 4 | INT J MED INFORM | 2.414 | http://www.ijmijournal.com/ |
| 5 | MED DECIS MAKING | 2.329 | http://mdm.sagepub.com/ |
| 6 | IEEE ENG MED BIOL | 2.057 | http://www.embs.org/ |
| 7 | MED BIOL ENG COMPUT | 1.878 | http://www.ieedl.org/MBEC |
| 8 | STAT MED | 1.877 | http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1097-0258 |
| 9 | J BIOMED INFORM | 1.792 | http://www.journals.elsevier.com/journal-of-biomedical-informatics/ |
| 10 | IEEE T INF TECHNOL B | 1.676 | http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=4358869 |
| 11 | METHOD INFORM MED | 1.532 | http://www.schattauer.de/en/magazine/subject-areas/journals-a-z/methods.html |
| 12 | COMPUT METH PROG BIO | 1.516 | http://www.sciencedirect.com/science/journal/01692607 |
| 13 | BMC MED INFORM DECIS | 1.477 | http://www.biomedcentral.com/bmcmedinformdecismak/ |
| 14 | INT J TECHNOL ASSESS | 1.365 | http://journals.cambridge.org/action/displayJournal?jid=THC |
| 15 | ARTIF INTELL MED | 1.345 | http://www.aiimjournal.com/ |
| 16 | J EVAL CLIN PRACT | 1.229 | http://www.blackwellpublishing.com/journal.asp?ref=1356-1294 |
| 17 | J MED SYST | 1.132 | http://www.springer.com/statistics/life+sciences,+medicine+%26+health/journal/10916 |
| 18 | HEALTH INFORM J | 1 | http://jhi.sagepub.com/ |
| 19 | INFORM HEALTH SOC CA | 0.872 | http://www.researchgate.net/journal/1753-8157_Informatics_for_Health_and_Social_Care |
| 20 | BIOMED TECH | 0.855 | http://www.degruyter.com/view/j/bmte |
| 21 | CIN-COMPUT INFORM NU | 0.831 | http://journals.lww.com/cinjournal/pages/currenttoc.aspx |
| 22 | HEALTH INF MANAG J | 0.824 | http://himaa2.org.au/HIMJ/ |
| 23 | J CANCER EDUC | 0.762 | http://link.springer.com/journal/13187 |

The Quartile in category column of the table groups the ranked journals into Tier 1 through Tier 4.

Fig. 2 Steps of adaptive methodology

Fig. 3 Adopted development methodology for mashup (flowchart nodes: Start; Get the requirements; Can Mashup do it?; Needs additional Requirements; Discard Mashup Development; Investigate similar Mashup; Any Mash-up that does the same thing?; Adopt the Mashup; Modify if Needed; Determine the potential Mashup editor; Decide which Mashup editor to use; Define Data Source; Categorize Data Source; Define APIs; Register for APIs; Aggregate Sources; Define Presentation Layer; Mash-up Test Pass; Fix problem; Deploy the Mashup; End)

Fig. 4 Yahoo Pipe interface

The steps proposed by [24] and adopted in the current research are as follows:

1) Accessing or Obtaining the Requirements
In this phase, we clarified whether mashup could be the solution to the problem [1, 6, 9]. Ibrahim [23, 24] have stated the need for additional features and/or requirements. Thus, the requirements have to be clearly obtained. Accordingly, the proposed search engine should meet two main requirements before deployment to achieve the objective.

First Requirement: Trusted Resources
The first requirement is to search within trusted resources only. The definition of “trusted resources” is yet to mature in view of the number of articles published almost daily. Therefore, we investigated the definition of “trusted resources” in the area of medical informatics. As a result of our comparison, ISI journals, which are reviewed and indexed by Thomson Reuters, were found to be the most reliable journals that provide high-quality articles [25]. Thus, we selected ISI journals as the “trusted resources in medical informatics.”

Second Requirement: Flexibility of Further Development in the Coverage
In [25], the authors mentioned other databases, such as PubMed and Scopus, which index further medical informatics journals, thereby increasing the number of expected trusted resources. In some studies, the authors narrowed their focus to high-impact journals to acquire more reliable information [26]. In this case, more customizations in the search engine are necessary. Therefore, our search engine should be sufficiently customizable to meet these requirements.

2) Investigation on the Related Mashup Application
We have highlighted that cloning an existing application can be used to develop a mashup system or application easily and rapidly. Users are sometimes required to modify the cloned application to meet their requirements before the deployment of their application. Our investigation shows that a mashup deployed with search engines for trusted resources in medical informatics does not exist.


3) Mashup Type Identification
If the requirements are demanding, a similar application may not exist. In such a situation, a new application should be developed from the beginning [27]. The author also recommends that users identify the type of mashup. The analyses of [23, 24, 27] are supportive and encouraging. Thus, we expect that the development of a data mashup-based search engine can be achieved.

4) Mashup Editor Recommendation
This part deals with the selection of the mashup editor after the identification of the mashup type. Users must decide which mashup editor to use. The author of [27] recommended Yahoo Pipes as the editor for mashup in higher education.

5) Data Source Identification
As stated in [27], the identification of the data source is a necessary part of the activities in this section.

6) Data Source Categorization
Users should also categorize the data source.

7) API Definition
Defining the API is the next step after categorizing the data source, as required by [27].

8) API Registration
API registration follows API definition.

9) Source Aggregation
The process of source aggregation defined by [27] is adopted.

10) Definition of the Presentation Layer
The presentation layer, which is defined in this step, is used to present the application to users.

11) Testing and Debugging
Testing and debugging are necessary to ensure that the application meets the requirements. All pipes were tested individually before they were combined into one mashup.

12) Application Deployment
Deployment of the created application, which is the last stage in this section, provides access to other users, so that the application is available to users interested in similar topics or related fields. The new search engine is currently deployed locally. After the publication of the current article, the search engine will be published online.

Implementation and deployment

The system was built using the Yahoo Pipes mashup editor, wherein all the listed trusted resources are used and combined to meet user expectations. In contrast to other search engines that generate unrelated results, the developed search engine allows users to make a search and then generates results that are displayed on the basis of title, description, author, and publisher, among others.

Editor interface

The resources for the development of the proposed mashup system were identified and analyzed. When a source Web site did not contain feeds, third-party services were used to generate the missing feeds; otherwise, the development was straightforward. The process used in this study is described below. First, Yahoo Pipes is accessed from http://pipes.yahoo.com. Figure 4 shows the Yahoo Pipes page.

Pipe creation

Users are required to have a valid Yahoo user account to create a pipe by clicking on “Create a Pipe,” as shown in Fig. 4.

Component configuration

The next step is to “drag and drop” the needed component onto the canvas area provided and create the desired configuration. The fetch feed component was used to retrieve all the RSS feeds from the defined location, as shown in Fig. 5.

Fig. 5 Fetch feed component

RSS feed identification

Identification of the resource location and RSS feed has two different implementations. First, if a resource Web site contains an RSS feed, then using the “Find First Site Feed (was Fetch Site Feed)” module in Yahoo Pipes allows users to fetch the main site feed but not the feeds of sub-directories. However, if a Web site does not contain an RSS feed, then this feature would not work. Figure 5 illustrates a case of a resource location containing RSS feeds at the URL “www.jmir.org.” Figure 6 shows the RSS location for the resources in www.jmir.org, which is accessible from http://feeds.feedburner.com/JMedInternetRes. After obtaining the feed URL(s) from the resource Web site location, the feed was linked to the fetch feed component already added to the development area. After adding the URL, the fetch feed component was connected to the output through the pipe wire; this step can also be achieved by the “drag and drop” method, as shown in Fig. 7. Second, Feedity, which is an RSS generator, was used when a resource Web site does not have an RSS feed. Feedity can be accessed from “www.feedity.com.”

Fig. 6 RSS feed identification

Fig. 7 Connecting the fetch feed component to the output through the pipe wire

Up to this stage, all the pipes of the 23 journals (“ISI journals of the Medical Informatics Field according to the citation 2011”) have been created. Figure 8 shows the “My Pipes” feature of Yahoo Pipes, which displays all the developed and cloned pipes. We simply “dragged and dropped” all the needed pipes in the development area.
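The role of the fetch feed component can be approximated outside Yahoo Pipes with a few lines of standard-library Python. This is a minimal sketch, not the pipe built in the study: the `fetch_feed` function and the sample feed text are hypothetical, and the sketch assumes a plain RSS 2.0 document whose entries are `item` elements with `title`, `link`, and `description` children.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real pipe would obtain this text from a journal feed URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal</title>
    <item>
      <title>Security of electronic medical records</title>
      <link>http://example.org/a1</link>
      <description>A study of EMR security.</description>
    </item>
    <item>
      <title>Usability of patient portals</title>
      <link>http://example.org/a2</link>
      <description>A usability evaluation.</description>
    </item>
  </channel>
</rss>"""


def fetch_feed(rss_text, source):
    """Parse RSS 2.0 text into a list of item dictionaries tagged with their source."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            # findtext returns None for a missing child, so fall back to "".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items
```

The download step is left out so that the sketch stays self-contained; in practice the text would come from a feed address such as the FeedBurner URL mentioned above.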

Connecting the pipes to complete the configuration

Fig. 8 The created pipes for trusted resources in medical informatics

Fig. 9 Connecting the pipes

At this stage, we connected all the pipes together with pipe wires by “dragging and dropping.” Each fetch feed has one output; thus, we needed another component, “union,” which accepts multiple inputs and produces a single output, to connect the pipes, as shown in Fig. 9.
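The many-inputs, one-output behavior of the union component can be sketched in Python as a simple concatenation of item lists. The `union` function and the sample items below are hypothetical stand-ins for the outputs of individual journal pipes, and the sketch assumes that no de-duplication is wanted at this step.

```python
from itertools import chain


def union(*feeds):
    """Concatenate any number of item lists into one list, preserving order."""
    return list(chain.from_iterable(feeds))


# Hypothetical items standing in for the output of two journal pipes.
journal_a = [{"title": "EMR security audit", "link": "http://example.org/a1"}]
journal_b = [
    {"title": "Telehealth adoption", "link": "http://example.org/b1"},
    {"title": "Clinical decision support", "link": "http://example.org/b2"},
]

combined = union(journal_a, journal_b)
```

Because the function accepts a variable number of arguments, adding a further journal pipe only requires passing one more list.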

Search engine configuration

Two other components, namely, the filter and text-box components, are necessary at the final stage. The filter goes between the union component and the output. Figure 10 shows the rule set for filtering. We added a text box to accept user input for the search in accordance with the filtering rules. Figure 11 shows the combination of the pipes into one pipe; the pipes in this figure are at the final pipe application stage, which is used as the trusted resource search engine. At this point, the trusted resource search engine was ready for use. A number of pipe applications were also developed to ensure that all the resources were collected. The results were then combined and presented in one pipe application that was configured to allow users to input a keyword in the search parameter provided, as shown in Fig. 12.
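The combination of a text box and a filter can be illustrated with a short Python sketch. The `search` function, its field names, and the matching rule (case-insensitive substring match on title or description) are assumptions made for illustration; they are not the rule set shown in Fig. 10.

```python
def search(items, query, fields=("title", "description")):
    """Keep items whose title or description contains the query text.

    Matching is a case-insensitive substring test; an empty query keeps
    every item. These rules are assumptions, not the rule set in Fig. 10.
    """
    needle = query.strip().lower()
    if not needle:
        return list(items)
    return [
        item for item in items
        if any(needle in item.get(field, "").lower() for field in fields)
    ]


# Hypothetical items standing in for the output of the union step.
ITEMS = [
    {"title": "Security of electronic medical records", "description": "Access control."},
    {"title": "Patient portals", "description": "Usability and EMR security concerns."},
    {"title": "Gene expression analysis", "description": "Microarray study."},
]
```

Since the filter only ever sees items that came from the selected journals, every hit is by construction drawn from the trusted resources.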


Fig. 10 The search function

Fig. 11 Configuration of the search parameters

Discussion

We encountered a number of issues related to the terms used by authors who defined mashup for higher education. We selected two trusted search engines, namely, WOS and Science Direct, to evaluate the proposed pipe search engine in terms of the criteria described below.

Customized search: This criterion concerns whether developers can programmatically create custom search engines and control how results appear on their pages. Custom search engines can thus be created to search across a specified collection of sites or pages by author name, title, or publication date.

Creation of RSS feeds: RSS feeds deliver rapidly changing content to Web-based applications. Adding an RSS feed of content from a journal Web site/database to other external Web sites is an efficient way of promoting the journal content (e.g., research articles and posts). In addition, feeding all the resources selected by the researcher into one Web site via RSS makes the researcher’s job easier: the researcher needs to check only one Web site to read all updates in the journals.

Extension of journals out of database coverage: This criterion concerns whether the search engine can include journals outside the database coverage (e.g., adding one or more journals from Science Direct to the set of medical informatics journals from WOS). This feature is required because several journals in medical informatics are not ISI-indexed. For example, journals indexed by PubMed and some ACM journals are not ISI-indexed but are trusted resources in computer science. Therefore, the researcher must identify the database and the title of each journal s/he wants to search. Repeating this action throughout a project is time-consuming. The new search engine allows researchers to easily configure their resources and feed them into one pool, which can be a Web site, blog, social networking site, or a pipe Web site.

Accessibility: Accessibility refers to subscription. Most journals are completely digitized and available online, provided that users (or rather university libraries) hold the appropriate subscriptions, that is, they pay in advance and arrange access to an electronic mailing list or online service. In addition, users can contribute a certain amount of money to a particular fund, project, or charitable cause on a regular basis. WOS and Science Direct require subscription from users to access their content. According to [17], Google Scholar is favored for its accessibility and speed; 83 % of the surveyed participants preferred Google Scholar over WOS and Science Direct. Our mashup-based search engine, which is currently at the prototype stage, is promising relative to other search engines because it promotes accessibility and provides quality results in addition to other features.

Source editing: The source code of established search engines and databases is not available, whereas the pipe API is editable and cloneable. This flexibility would help researchers develop their own search portal/engine.

The reliability of resources: Five elements must be considered to understand the reliability of the resources. The first is the publishing source. If the report or research is funded by an organization that has a vested interest in the results of the study, the report’s reliability may not be as great as otherwise.
For example, a study on the health effects of tobacco smoking that is funded by the American Tobacco Institute, which is a research organization funded by the tobacco industry, may not be as reliable as a similar study funded by the non-partisan National Institutes of Health. Another aspect to consider is the potential for conflict of interest between the report and the source of support or the publisher of the material. For example, all impact assessment studies that monitor the release of pollutants are conducted by the potential polluters themselves. Therefore, a conflict of interest exists between the information reported and those reporting it. Such conflicts call the reliability of a study into question. The second is peer review, which entails that a study and its results must have been reviewed by a group of people with the necessary expertise to assess the merits of the work. Peer review is the system of “checks and balances” in science. Peer review occurs in two contexts.


Fig. 12 The integration/combination of pipes

First, “factual” material may be reviewed when a Web site presents what is known about a medical condition. The second context is the publication of original research or a review of original research. Before the results of scientific research are published in a peer-reviewed journal, persons trained and active in the same areas of research review the manuscript. Reviewers must ensure that the methods and statistical tests used in the study are appropriate and that the conclusions of the study are justifiable with respect to the data presented in the study. If a problem is found within the study or the conclusions, the paper is not published. This review process is intended to ensure that only properly conducted research is presented in scientific journals. Peer review is subverted, however, if the reviewers have a vested interest in the results of the study they are reviewing. For example, “creation scientists” review each other’s work but do not have it reviewed by biologists or earth scientists, who may not share their preconceived conclusions. Peer review is crucial to establishing the reliability of information. Thus, knowing whether a source has been reviewed and by whom are important aspects to consider

Table 3 A comparison of Web of Science, Science Direct, and Pipes Search in terms of different criteria

| Criteria | Web of Science | Science Direct | Pipes Search |
| --- | --- | --- | --- |
| Customization of search | ✓ | ✓ | ✓ |
| Creation of RSS | ✓ | ✓ | ✓ |
| Extension of journals out of the database coverage | N/A | N/A | ✓ |
| Search coverage | Only within the indexed articles (a) | The search includes indexed articles (b) | Adding journals and books from any database or journal is applicable |
| Accessibility | Subscription is required | Subscription is required | Available for free |
| Source editing | N/A | N/A | Cloneable and editable |
| The reliability of the resources | Reliable | Fair reliability | The resources are selective and thus it depends (c) |
| Embedding the result | Applicable to embed the RSS of customized search at social networking sites, blogs, and others. In the case of another customized search, another RSS will be generated | Same as in WOS | The result of the pipe can be embedded in social networking sites, blogs, and others |

(a) ISI journals take time (up to 6 months) to be indexed. During this period, users will not be able to find non-indexed articles in WOS although these articles are ISI-cited publications and published

(b) Science Direct has a faster indexing process, whereby an article with status “In Press, Corrected Proof” is available for search as well

(c) In this research, the selected resources are reliable (ISI journals in the area of medical informatics). However, any researcher can add other resources that may be reliable or unreliable


Fig. 13 Science Direct database indexing example for an article with status “In Press, Corrected Proof”

Fig. 14 The results of the search using the keyword “Security”

when evaluating a source. The third element in establishing reliability is information about the author, that is, the means by which we can assess the work of someone we do not know. One aspect we can judge to some degree is the credentials of an author. Does the author have expertise on the subject? Is the author affiliated with institutions that represent experts on the subject and that may not have a vested interest in the results? The fourth is timeliness, which involves knowing when the information was obtained or reported. New scientific findings appear all the time, and thus, some information can become “yesterday’s news” very quickly. For example, a TV program presented a documentary on the Shroud of Turin, which is a religious relic believed by some to be Christ’s burial shroud. However, the program failed to mention, presumably because it was made several years earlier, that with the Church’s permission, the shroud has been radiocarbon dated and determined to be from the 13th century and therefore cannot be the relic it was thought to be. The fifth is additional references: reliable sources of information provide references to the sources of the information they present. These references allow users to verify whether the original information is accurately conveyed. Overall, the selected resources meet the requirements of ISI-WOS, which is reliable as reported in [17]. Researchers themselves can add and/or remove one or more resources when the need arises. In addition, trusted resources depend on the area, that is, ISI-WOS, ACM journals, Science Direct, and IEEE journals are great resources for computer science, whereas PubMed, WOS, and Science Direct are the selected resources for medical research.

Embedding the result: Embedding the result is the ability of a system to support Web 2.0 and reach social networking sites and blogs, among others, by generating RSS from a Web site, search result, blog, and social


Fig. 15 Extracting the RSS from the result of PIPE

Fig. 16 Extracting the RSS from the result of Web of Science

media. RSS delivers live Web feeds directly to the computers of users: it collects the latest headlines from different Web sites and pushes these headlines to the computers of users for quick scanning.

Table 3 presents a comparison of WOS, Science Direct, and Pipes searches in terms of different criteria. All three search engines support the customization of search and creation of RSS criteria. For the extension of journals out of the database coverage and source editing criteria, WOS and Science Direct offer no support, whereas the pipe engine supports extension and is cloneable and editable. For the search coverage, accessibility, and reliability of the resources criteria, WOS searched within indexed articles only, required subscription, and provided reliable resources; Science Direct included indexed articles in the search, required subscription, and provided resources with fair reliability; the Pipes search engine could add journals, books, and other materials from any database, was available for free, and provided selective resources, as shown in Figs. 13 and 14. For the embedding the result criterion, WOS and Science Direct can embed the RSS of a customized search in social networking sites, blogs, and others, but a new customized search generates another RSS, whereas the results of the Pipes can be embedded in social networking sites, blogs, and others, as shown in Figs. 15, 16, and 17.
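The embedding criterion rests on turning a result list back into RSS so that blogs or social networking sites can consume it. The following is a minimal sketch of that idea, assuming RSS 2.0 with only the title, link, and description elements; it is not the output format of Yahoo Pipes, WOS, or Science Direct. The function name `build_rss`, the channel details, and the sample results are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, channel_description, items):
    """Serialize a list of {title, link} dicts as an RSS 2.0 string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    # encoding="unicode" returns a str; special characters are escaped.
    return ET.tostring(rss, encoding="unicode")


# Hypothetical search results to be embedded elsewhere.
results = [
    {"title": "Paper A", "link": "http://example.org/a"},
    {"title": "Paper B & C", "link": "http://example.org/b"},
]

rss_text = build_rss(
    "Trusted resources search",
    "http://example.org/search",
    "Hypothetical merged results",
    results,
)

# Parse the generated text again to confirm that it is well-formed XML.
round_trip = ET.fromstring(rss_text)
```

Any feed reader or site widget that accepts RSS 2.0 could then point at a URL serving `rss_text`; hosting that URL is outside the scope of this sketch.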

Future direction, recommendation, and open research issues

Each researcher can now design his/her own list of journals (depending on their research) and connect them by using the presented framework. Mashup can be used in education, security, business, marketing, finance, and Web customization. Therefore, we will further investigate the potential of different types of mashup for medical informatics, such as Web customization mashup and process mashup. Other related topics that can be explored are as follows:

1- Advance Data Mashup for Medical Informatics Research
The area of medical informatics includes several research categories. Thus, the development of different mashups is essential and recommended. A mashup network can also be implemented for various health and medical areas.

2- Apply Data Mining in Medical Informatics Research
The availability of RSS in journal Web sites is another advantage: it allows categorized text to be gathered for the development of data mining, text mining, and Web analysis approaches with data mashup.


Fig. 17 Extracting the RSS from the result of Science Direct

3- Develop Research Portals for Medical Informatics Research
Research portals based on data mashup can help users focus more on their research by reducing search time. Searches in WOS or Science Direct yield thousands of journals, making the review of and follow-up on newly published articles time-consuming. Most researchers also search for different topics (even researchers conducting studies in similar areas). Therefore, the implementation of a mashup network for each group, down to the level of individuals, is recommended to ensure the quality of the observed articles, that is, that the articles are kept up to date with the latest publications in the area, and to reduce search time.

4- Text Analysis and Data Mining
The use of data mining and text analysis to categorize articles and extract knowledge from the available text is an interesting research area that involves data mashup. Another aspect is to perform statistical analysis of the available text to obtain insight into the hot topics and the focus of current research.

5- Mashup search engine adoption
IT acceptance and/or adoption has been the subject of numerous research articles lately. Several theories have emerged that offer new insights into acceptance and use at both the individual and organizational levels. One of these theories, the technology acceptance model (TAM), has received particular attention. Future studies could evaluate the acceptance of the mashup search engine by its users (students, researchers, lecturers, and librarians).

6- Investigate different Mashup editors
Several mashup editors are available, and each editor provides different services. A comprehensive investigation of the features of each editor might help create a guideline for developers and users toward developing more usable search engines.
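The statistical analysis suggested in item 4 could start from something as simple as counting terms in the article titles delivered by the feeds. The sketch below is only a starting point under stated assumptions: the titles, the small stopword list, and the function name `top_terms` are hypothetical, and plain word counts stand in for the richer text mining methods that the item envisions.

```python
import re
from collections import Counter

# Assumption: a tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def top_terms(titles, n=3):
    """Return the n most frequent non-stopword terms across the titles."""
    counts = Counter()
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(n)


# Hypothetical titles gathered from the merged journal feeds.
titles = [
    "Security of electronic health records",
    "Mobile health security and privacy",
    "Privacy in health information exchange",
]

ranking = top_terms(titles, n=3)
```

In this sample, "health" appears in all three titles, whereas "security" and "privacy" appear in two each, so those three terms lead the ranking.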

Limitation of the study

This study suggested creating an alternative to scientific search engines based on data mashup technology by reusing the available journal resources. The available mashup editors can help researchers create their own search engines depending on their individual needs. However, some journal Web sites need improvement, namely, direct RSS services that would help researchers create error-free mashup-based search engines. Another limitation of this research is the definition of reliable resources in a particular domain. In our research, Web of Science ISI-cited journals in the area of medical informatics were identified as the reliable resources for this domain, although journals from other indexes might also serve as reliable resources for this domain. In the future, other studies can suggest a more comprehensive definition of the term “reliable resource” for any particular domain.

Summary points

What was already known?
• Data mashup is a web application that uses resources from more than one source to create a single new service.
• Data mashup plays an important role in information and communication technology (ICT) applications and is used in business, marketing, and advertising.
• Quality papers can be obtained from only a few scientific search engines, such as Web of Science and Science Direct.

What this study added?
• Data mashup can be used in the education and research domains, particularly in medical informatics.
• A customizable search engine for trusted resources of medical informatics based on data mashup was developed and implemented.
• A proper search engine can improve the usability of scientific search engines for medical informatics.
• The result evaluation showed that the pipe search engine is more efficient than the other search engines.

Conclusions

Data mashup has seen increased usage in different areas of life. Recent studies have provided evidence on the development of information mashup for the academic and research life cycles. This study developed an information mashup network for medical informatics journals to achieve certain objectives. The first objective was to create a search portal for medical informatics by using the data mashup concept. The selected resources were all ISI journals, which were “considered trusted resources for academic research.” In contrast to WOS or Science Direct, the developed search engine is extendable, that is, users can add any journal or resource outside the coverage. The second objective was to develop guidelines for researchers on how to develop their own mashup. A step-by-step guide has been reported in the “Implementation and deployment” section to make future developments easier. This research reported concise and systematic comparisons of different academic search engines. Several challenges and recommendations have been reported in the “Discussion” section. We believe that data mashup is important to the medical health society and other related sciences. Data mashup engages and motivates others to use the ideas generated for solutions to existing problems.

Acknowledgments This research has been funded by the High Impact Research unit (HIR), University of Malaya, under grant number UM.C/HIR/MOHE/FCSIT/12.


References

1. Giustini, D., How Web 2.0 is changing medicine. Br. Med. J. 333:1283–1284, 2006.
2. Van De Belt, T. H., Engelen, L. J., Berben, S. A. A., and Schoonhoven, L., Definition of Health 2.0 and Medicine 2.0: A systematic review. J. Med. Internet Res. 12(2):e18, 2010.
3. Hughes, B., Joshi, I., Lemonde, H., and Wareham, J., Junior physician’s use of Web 2.0 for information seeking and medical education: A qualitative study. Int. J. Med. Inform. 78:645–655, 2009.
4. Isabel, T. D., López, C., and Joel, R., How to measure the QoS of a web-based EHRs system: Development of an instrument. J. Med. Syst. 36(6):3725–3731, 2012.
5. Eysenbach, G., Medicine 2.0: Social networking, collaboration, participation, apomediation, and openness. J. Med. Internet Res. 10(3):e22, 2008.
6. Hughes, B., Joshi, I., and Wareham, J., Health 2.0 and Medicine 2.0: Tensions and controversies in the field. J. Med. Internet Res. 10(3):e23, 2008.
7. Hesse, B. W., Hansen, D., Finholt, T., Munson, S., Kellogg, W., and Thomas, J. C., Social participation in health 2.0. Computer 43:45–52, 2010.
8. Adams, S. A., Revisiting the online health information reliability debate in the wake of “web 2.0”: An inter-disciplinary literature and website review. Int. J. Med. Inform. 79:391–400, 2010.
9. Nijland, N., Van Gemert-Pijnen, J., Boer, H., Steehouder, M. F., and Seydel, E. R., Evaluation of internet-based technology for supporting self-care: Problems encountered by patients and caregivers when using self-care applications. J. Med. Internet Res. 10(2):e13, 2008.
10. Karla, P., and Gurupur, V., C-PHIS: A concept map-based knowledge base framework to develop personal health information systems. J. Med. Syst. 37(5):1–16, 2013.
11. Steurbaut, K., Colpaert, K., Gadeyne, B., Depuydt, P., Vosters, P., Danneels, C., Benoit, D., Decruyenaere, J., and De Turck, F., COSARA: Integrated service platform for infection surveillance and antibiotic management in the ICU. J. Med. Syst. 36(6):3765–3775, 2012.
12. Jin, Y., Benatallah, B., Casati, F., and Daniel, F., Understanding mashup development. Internet Comput. IEEE 12(5):44–52, 2008.
13. Grammel, L., and Storey, M.-A., A survey of mashup development environments. In: Chignell, M., Cordy, J., Ng, J., and Yesha, Y. (Eds.), The Smart Internet, vol. 6400. Springer, Berlin, pp. 137–151, 2010.
14. Salminen, A., and Mikkonen, T., Mashups software ecosystems for the web era. IWSECO 2012, pp. 18–32, 5th International Workshop on Software Ecosystems, 2012, http://ceur-ws.org/Vol-879/paper2.pdf.
15. Wong, J., and Hong, J. I., Making mashups with marmite: towards end-user programming for the web. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. San Jose, California, USA, ACM: 1435–1444, 2007.
16. Wong, J., and Hong, J., What do we “mashup” when we make mashups? Proceedings of the 4th international workshop on End-user software engineering. Leipzig, Germany, ACM: 35–39, 2008.
17. Hightower, C., and Caldwell, C., Shifting sands: Science researchers on Google Scholar, Web of Science, and PubMed, with implications for library collections budgets. Issues Sci. Technol. Librariansh. 4, 2010.
18. Belleau, F., Nolin, M.-A., Tourigny, N., Rigault, P., and Morissette, J., Bio2RDF: Towards a mashup to build bioinformatics knowledge systems. J. Biomed. Inform. 41(5):706–716, 2008.
19. Ibrahim, R., and Oxley, A., Assessing the use of mash-ups in higher education. In: Mohamad Zain, J., Wan Mohd, W., and El-Qawasmeh, E. (Eds.), Software Engineering and Computer Systems, vol. 179. Springer, Berlin, pp. 278–291, 2011.
20. Testa, J., The Thomson Reuters Journal Selection Process, 2012. Available: http://thomsonreuters.com/products_services/science/free/essays/journal_selection_process/.
21. Scholar, G., Inclusion Guidelines for Webmasters, 2012. Available: http://scholar.google.com/intl/en/scholar/inclusion.html.
22. Journal Citation Report, (2011), http://admin-apps.webofknowledge.com/JCR/JCR?PointOfEntry=Home&SID=Y1knEtv7eIuQeVlDcsR.
23. Ibrahim, R., Framework and model design for higher education mash-ups. In: International Conference on Computer Science & Information Sciences. Kuala Lumpur Convention Centre, Malaysia, 2012.
24. Ibrahim, R., The potential for using mash-ups at a higher education. Res. J. Inf. Technol. 4:56–70, 2012.
25. Miri, S. M., and Bahmani, P., Indexing in ISI Web of Sciences: The opportunities and threats. Jundishapur J. Microbiol. 5:381–383, 2012.
26. El Emam, K., Arbuckle, L., Jonker, E., and Anderson, K., Two h-index benchmarks for evaluating the publication performance of medical informatics researchers. J. Med. Internet Res. 14:e144, 2012.
27. Ibrahim, R., and Oxley, A., Proposed development methodology for higher education and library mash-ups. In: Information Technology (ITSim), 2010 International Symposium in, 2010, pp. 1–6.
