Medical Reference Services Quarterly, 34(2):190–201, 2015
Published with license by Taylor & Francis
ISSN: 0276-3869 print/1540-9597 online
DOI: 10.1080/02763869.2015.1019747

Capturing Citation Activity in Three Health Sciences Departments: A Comparison Study of Scopus and Web of Science

ALEXANDRA SARKOZY
Purdy Kresge Library, Wayne State University, Detroit, Michigan, USA

ALISON SLYMAN and WENDY WU
Shiffman Medical Library, Wayne State University, Detroit, Michigan, USA

Scopus and Web of Science are the two major citation databases that collect and disseminate bibliometric statistics about research articles, journals, institutions, and individual authors. Liaison librarians are now regularly called upon to use these databases to help faculty find citation activity on their published works for tenure and promotion, grant applications, and more. But questions about the accuracy, scope, and coverage of these tools deserve closer scrutiny. Discrepancies in citation capture led to a systematic study of how Scopus and Web of Science perform in a real-life situation encountered by liaisons: profiling three different disciplines at a medical school and nursing program. How many articles would each database retrieve for each faculty member using the author-searching tools provided? How many cited references for each faculty member would each tool generate? Results demonstrated troubling differences in publication and citation activity capture between Scopus and Web of Science. Implications for librarians are discussed.

© Alexandra Sarkozy, Alison Slyman, and Wendy Wu
Received: December 1, 2014; Revised: February 5, 2015; Accepted: February 6, 2015.
This article was based on a poster presented at the Annual Meeting of the Medical Library Association, Chicago, May 20, 2014.
Address correspondence to Alexandra Sarkozy, Purdy Kresge Library, Wayne State University, 5265 Cass Avenue, Detroit, MI 48202. E-mail: [email protected]
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/wmrs.


KEYWORDS Author identifiers, author name disambiguation, bibliometrics, comparison study, library liaisons, ORCID, Scopus, Web of Science

INTRODUCTION

Publication and citation activity has become a metric by which governments, research institutions, and individual faculty seek to understand the impact of research funding and academic projects. Governments are interested in understanding the impact of their research investments on the public good and in benchmarking their scientific accomplishments against those of other countries. Research institutions and academic departments want to understand their research impact and their corresponding rankings against peers. Individual investigators seeking tenure, promotion, and grants need to know where their research is being cited and by whom. Two databases, Elsevier's Scopus and Thomson Reuters' Web of Science, are the bibliometric databases most commonly used to track citations on scholarship.

Faculty at a research university and medical school have increasingly asked liaison librarians in several different health sciences departments to track and analyze publication and citation activity for individuals and departments as part of their job duties. Having anecdotally noticed substantial discrepancies between the results from Scopus and Web of Science, the liaisons were troubled by faculty and administrators' reliance on citation activity as a measure of productivity. Comparisons of automatically generated results with the liaisons' own searches showed that database limitations, primarily name ambiguities and the lack of a centralized name authority for researchers, were generating search results that required substantial correction by hand to ensure accurate publication and cited reference counts. This led the liaisons to design their own systematic study, using methods and tasks that a liaison librarian might realistically encounter, to compare results and analyze differences between the two tools. The intent was to quantify differences encountered anecdotally in daily liaison work and to share this research with the wider health sciences library community, whose members face similar difficulties at their own institutions. Google Scholar was specifically excluded from the study because of its lack of publicly stated inclusion criteria and its inclusion of nonscholarly content.

Do Scopus and Web of Science track citation activity swiftly and accurately enough that automated cited reference reports can be used without substantial correction and manual intervention? To answer this question, Scopus and Web of Science were compared on their capture of both the total publications of individual researchers in three different health sciences departments of a research university and the cited references on those publications. Total publications and cited references were then aggregated to see how Scopus and Web of Science compared in capturing total departmental output of these indicators. The analysis shows where differences occur between the two tools and offers reasons why the discrepancies occur. The study of these tools allows liaison librarians to make informed decisions about library acquisitions and instruction, to help researchers reach their tenure goals and obtain grants, and to better understand the strengths and limitations of both databases in providing citation information.

LITERATURE REVIEW

A search of the library and information science literature revealed much relevant work on this topic. Previous comparisons of Scopus and Web of Science have shown differences in both article capture and cited reference counts between the two databases.1 Meho and Yang examined the performance of Scopus, Web of Science, and Google Scholar in capturing the research impact of 25 library and information science researchers, and found that Scopus and Google Scholar captured substantially more citation activity for researchers in the middle of the rankings than did Web of Science.2 Meho and Sugimoto extended this research to compare Scopus and Web of Science in measuring impact at the level of country, institution, journal, and research domain, and found significant differences between the two, mostly because of the inclusion of conference proceedings in Web of Science.3

Bibliometrics researchers have also analyzed Scopus, Web of Science, and Google Scholar for differences in research impact as measured by the h-index. Bar-Ilan found considerable differences among the three databases in a 2008 study of Israeli researchers.4 De Groote and Raszewski compared the h-indexes generated by the databases for nursing faculty at a large urban university and concluded that no single tool could be relied on to provide a thorough assessment of a researcher's impact; moreover, comparisons between researchers should be made only within a single database.5

The search methodology employed can also influence the citation capture of both Scopus and Web of Science, as Markpin et al. showed in their examination of the research performance of 33 Asian universities. Querying Scopus and Web of Science for the research papers of Asian countries, they contended that the choice of database, data retrieval method, and bibliometric indicator all strongly affected the ranking order of the universities studied.6 Another study found that discipline matters when choosing a source of bibliographic citation information: Torres-Salinas et al. found significant differences between the two databases in the capture of both total publications and citations in clinical science fields; works received 14.7% more citations in Scopus than in Web of Science.7


METHOD

Publication and citation activity were analyzed for three research departments at Wayne State University: the College of Pharmacy and Pharmaceutical Sciences (Pharmacy), the Department of Obstetrics and Gynecology (OB-GYN), and the College of Nursing (Nursing). Web of Science and Scopus were searched for total publications, and for cited references on those publications, using the author profiles generated by each database. The accuracy of the results captured by each database was not verified (i.e., whether a publication was written by a given researcher or by someone else with the same name).

Google Scholar was not included in this study. While Google Scholar contains more conference proceedings, nonscholarly citations, and international citation activity, it was excluded because of its lack of publicly stated inclusion criteria, its lack of transparency about calculations of impact, and the fact that many scholarly journals are not indexed.8–10

The searches were performed in fall 2013. The user interface of Web of Science has changed since then, so this study's strategy reflects the search options available when the results were collected rather than those currently offered.

Departments were selected based on liaison areas, and three departments in total were sampled. Scopus and Web of Science were searched by author and affiliation to create citation profiles for each faculty member of Pharmacy, OB-GYN, and Nursing at Wayne State University. Twenty-five researchers each from Nursing and OB-GYN and 28 from Pharmacy were searched, for a total of 78 researchers included in the study.

The search strategy in Scopus used the author search function. The researcher's last name and first initial were entered. Additionally, since many institutions may have a "Smith, R," for example, "Wayne State" was searched as the affiliation. The appropriate author profile was selected from those retrieved. A similar strategy was employed in Web of Science: the author tab was searched by entering the researcher's last name and first initial, and Wayne State University-affiliated researchers were selected from the choices. From the search results in each database, the total number of articles returned was gathered, as well as the total number of cited references for each researcher. The overall accuracy of the search results was not evaluated, nor were author names disambiguated by hand; the goal was to compare the results of the two databases. A sketch of this author-plus-affiliation strategy appears below.

The major limitation of this study concerns accuracy. Accuracy of the results from each database compared with the gold standard of researcher CVs was assessed for only a small sample of faculty. Due to privacy concerns in several departments, most CVs could not be obtained, and the study focused on comparing the two databases against each other.
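To make the workflow concrete, the following sketch (not part of the original study, which used the databases' web interfaces) shows how the same last-name, first-initial, and affiliation query could be issued programmatically against Elsevier's public Author Search API. The endpoint, query syntax, and response field names follow Elsevier's public developer documentation but should be treated as assumptions to verify; the API key is a placeholder.

```python
import requests

# Illustrative only: the study used the Scopus web interface, not the API.
# Endpoint and field syntax follow Elsevier's public Author Search API;
# verify against current documentation. API_KEY is a placeholder.
API_KEY = "YOUR-ELSEVIER-API-KEY"
AUTHOR_SEARCH_URL = "https://api.elsevier.com/content/search/author"

def author_candidates(last_name: str, first_initial: str, affiliation: str) -> list:
    """Return candidate author profiles for a last name and first initial,
    narrowed by affiliation, mirroring the search strategy described above."""
    query = (
        f"authlast({last_name}) and authfirst({first_initial}) "
        f"and affil({affiliation})"
    )
    resp = requests.get(
        AUTHOR_SEARCH_URL,
        params={"query": query},
        headers={"X-ELS-APIKey": API_KEY, "Accept": "application/json"},
    )
    resp.raise_for_status()
    return resp.json().get("search-results", {}).get("entry", [])

# Many institutions may have a "Smith, R," so affiliation narrows the list;
# a human still has to pick the correct profile from the candidates returned.
for entry in author_candidates("Smith", "R", "Wayne State"):
    print(entry.get("preferred-name"), entry.get("document-count"))
```

Even with the affiliation filter, selecting the right profile remains a manual judgment, which is the step the rest of this study probes.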


RESULTS

The total numbers of articles retrieved, in aggregate by database, for Nursing, Pharmacy, and OB-GYN are shown in Figure 1. Scopus retrieved more articles than Web of Science for OB-GYN (2,912 and 2,677, respectively) and Pharmacy (1,516 and 746) researchers, but fewer for Nursing (935 and 1,020) faculty. Scopus retrieved over twice the number of articles for Pharmacy faculty as did Web of Science. A similar picture emerged for cited reference capture (see Figure 2). Scopus retrieved 58,284 cited references for OB-GYN faculty compared with 37,345 for Web of Science. For Nursing, Web of Science retrieved 18,293 cited references; Scopus, 13,193. Scopus retrieved nearly triple the number of cited references for Pharmacy: 25,862 to Web of Science's 9,335.

While the aggregate numbers of both articles and cited references captured by the databases show substantial differences, the capture for each individual researcher reveals distinct differences at a more granular level. Important differences in counts are examined below for individual researchers in all three departments, and larger trends are addressed in the discussion.

FIGURE 1 Total articles retrieved by database.

FIGURE 2 Total cited references retrieved by database.
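As a quick check on the aggregate figures above, a few lines of Python reproduce the differences and ratios reported in this section from the counts in Figures 1 and 2. The percent difference is computed relative to the smaller count, which matches the 9% figure quoted for OB-GYN articles below; the data literals simply restate the reported results.

```python
# Aggregate counts from Figures 1 and 2, keyed as (Scopus, Web of Science).
articles = {"OB-GYN": (2912, 2677), "Pharmacy": (1516, 746), "Nursing": (935, 1020)}
cited_refs = {"OB-GYN": (58284, 37345), "Pharmacy": (25862, 9335), "Nursing": (13193, 18293)}

def pct_difference(a: int, b: int) -> float:
    """Percent difference relative to the smaller of the two counts."""
    return 100 * abs(a - b) / min(a, b)

for label, counts in (("articles", articles), ("cited references", cited_refs)):
    for dept, (scopus, wos) in counts.items():
        print(f"{dept} {label}: Scopus {scopus:,}, WoS {wos:,}, "
              f"difference {abs(scopus - wos):,} ({pct_difference(scopus, wos):.0f}%)")
```

Running this confirms, for example, the 235-article (9%) OB-GYN gap and the nearly threefold Pharmacy cited reference gap discussed below.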

OB-GYN

The difference in articles retrieved by the two databases for OB-GYN was 235, a 9% difference. Looking at differences by individual researcher in Figure 3 reveals that Scopus did not retrieve more articles for every researcher: Web of Science captured more articles for Researchers #3, 4, 8, 12, 16, 18, 20, 23, and 25. However, there were instances of dramatic differences in capture. For Researcher #2, Scopus found 173 articles while Web of Science returned only 5. For Researcher #13, Scopus retrieved 91 articles and Web of Science, 12. Similar discrepancies exist for Researchers #6 and 11, for whom Scopus retrieved more than twice as many articles as Web of Science.

FIGURE 3 Articles per researcher, OB-GYN.


These differences in article capture are magnified in the cited reference counts (see Figure 4). While Scopus retrieved 171 more articles than Web of Science for Researcher #2, it retrieved 1,574 more cited references. Interestingly, for Researcher #3, for whom Scopus and Web of Science retrieved roughly the same number of articles (108 and 115, respectively), Scopus found 4,425 cited references and Web of Science only 425, a tenfold difference.

FIGURE 4 Cited references per researcher, OB-GYN.

Pharmacy

For Pharmacy faculty, Scopus retrieved many more articles for Researchers #5, 7, 8, 10, 11, 12, 15, 16, 17, 23, and 25 (see Figure 5). This is likely because Scopus incorporates the pharmaceutical science-heavy Embase in its corpus, which is likely to include publications not available in Web of Science. The discrepancy in Pharmacy publications retrieved by the two databases led to substantial distortions in reported citation activity. For Researchers #5, 7, 8, 12, 15, 18, 23, 26, and 28, cited reference capture (see Figure 6) varies between the two databases, sometimes by thousands of cited references per researcher.

FIGURE 5 Articles per researcher, pharmacy.

FIGURE 6 Cited references per researcher, pharmacy.

Nursing

For the large majority of Nursing faculty, Scopus retrieved more publications than Web of Science (see Figure 7). So why is the aggregate number of articles retrieved by Web of Science (1,020) larger than Scopus's 935? Web of Science retrieved more than 300 more articles for Researcher #5 than Scopus, skewing the total count. A similar picture emerged for cited references. For most Nursing faculty, Scopus retrieved more cited references (see Figure 8). For Researcher #5, however, Web of Science retrieved 9,288 more cited references than Scopus, skewing the aggregate totals.


FIGURE 7 Articles per researcher, nursing.

FIGURE 8 Cited references per researcher, nursing.

DISCUSSION

Name ambiguity was responsible for a great deal of the inaccuracy in retrieval for both databases. Researchers may have a common name, may have changed their name, may have used different initials in different publications, or may have changed institutions. Because of this, for both Scopus and Web of Science, constructing reliable and complete lists of publications and citations for researchers based on name and institutional affiliation is problematic.

Due to privacy concerns, a full complement of CVs could not be obtained for a comparison of the accuracy of database retrieval. However, a sample comparison of ten CVs volunteered from Pharmacy and Nursing showed that neither Scopus nor Web of Science picked up all CV publications in eight out of the ten samples. The uneven database indexing of conference proceedings and symposia listed on CVs was also problematic. Both databases misattributed articles to authors because of name ambiguity, and the mistakes were not the same in both databases. Scopus captured the complete publication count for three out of five sample CV comparisons and captured most faculty publications in Pharmacy. As a result, Pharmacy faculty may be more likely to embrace Scopus as a tool to capture their scholarly output. Further investigation at the article level would generate insights into the reasons for database reporting discrepancies, whether name ambiguity, incomplete literature coverage, or other causes.

The hours of labor that must be spent disambiguating results for each individual researcher to get an accurate picture of citation activity are precisely the problem this study points out. Closer investigation of the publication retrieval results for Nursing faculty showed that name ambiguity for Researcher #5, to whom Web of Science attributed hundreds more publications than the researcher actually wrote, was the reason a higher total publication count was returned. This misattribution demonstrates in concrete form the problems that the lack of a researcher name authority in scientific databases can cause. Accurate citation metrics require accurate, disambiguated author names in their source data, or errors like this will occur.
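A toy example makes the mechanism concrete. The records, names, and titles below are hypothetical; the point is that a "last name plus first initial" key, which is essentially what an automated author search resolves when no richer identifier is available, both merges distinct people and cannot separate one person's name variants:

```python
from collections import defaultdict

# Hypothetical records: two different people and one possible name variant.
records = [
    {"author": "Smith, Robert", "title": "Placental gene expression"},
    {"author": "Smith, Rachel", "title": "Nursing workforce trends"},  # different person
    {"author": "Smith, R. J.", "title": "Drug metabolism in vitro"},   # variant of Robert? unknowable
]

def profile_key(author: str) -> str:
    """Collapse an author string to 'surname, first initial'."""
    surname, _, given = author.partition(",")
    return f"{surname.strip().lower()}, {given.strip()[:1].lower()}"

profiles = defaultdict(list)
for rec in records:
    profiles[profile_key(rec["author"])].append(rec["title"])

# All three records land in a single profile under the key 'smith, r'.
print(dict(profiles))
```

Whether the merge or the split is the error cannot be decided from the key alone, which is why hand disambiguation, or a persistent identifier, is required.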

CONCLUSION

Attempts to gauge research output and impact accurately from author searches in Scopus and Web of Science will require many hours of human intervention to get a complete, accurate picture of citation activity. While Scopus and Web of Science offer easy "author search" capabilities and create citation profiles at the push of a button, caution should be exercised when interpreting the results of any such search as a measure of research impact. Institutional tools such as Elsevier's SciVal and Thomson Reuters' InCites and Converis, which perform further analyses of research impact, assume the accuracy of the author profiles in Web of Science and Scopus. This preliminary study showed that author profiles in these tools are problematic, with name ambiguity causing the bulk of the problems, and that institutions and researchers will need to manually assess and correct the author profiles reported by these databases to ensure accurate measurements of research impact.


This study has several implications for health sciences librarians. First, health sciences librarians need to educate faculty and administrators about the limitations of these tools. Reliance on citation reporting data, and on systems built around it, is growing, and any reports used for decision making need to rest on accurate underlying information; this study shows that this is not yet the case. Second, health sciences librarians can press for the adoption of universal researcher identifiers such as ORCID. ORCID adoption at the institutional level would clear up much of the name ambiguity, and the problems tracking institutional affiliation, that lead to reporting discrepancies.11 Lastly, this research raises awareness of the need to critically evaluate the methods and tools used to assess research impact in liaisons' own environments, and of the role of the library in that process.
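As a concrete illustration of how a persistent identifier sidesteps name matching entirely, the short sketch below pulls a publication list from ORCID's public API, keyed on the iD rather than on any name string. The endpoint follows ORCID's public v3.0 API and the iD shown is ORCID's published example record; treat the exact response field names as assumptions to verify against current documentation.

```python
import requests

# ORCID's published example iD (a test record); any valid iD works here.
ORCID_ID = "0000-0002-1825-0097"

resp = requests.get(
    f"https://pub.orcid.org/v3.0/{ORCID_ID}/works",
    headers={"Accept": "application/json"},  # public API, no key required
)
resp.raise_for_status()

# Works are grouped by external identifier; print each group's first summary title.
for group in resp.json().get("group", []):
    summary = group["work-summary"][0]
    print(summary["title"]["title"]["value"])
```

Because the lookup is by identifier, common surnames, name changes, and initial variants never enter into it, which is exactly the ambiguity this study found corrupting name-based author profiles.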

REFERENCES

1. Powell, K. "Measuring Nursing Faculty Impact: Web of Science versus Scopus." Contributed poster presented at the Annual Meeting of the Medical Library Association, Chicago, IL, May 2014.
2. Meho, Lokman I., and Kiduk Yang. "Impact of Data Sources on Citation Counts and Rankings of LIS Faculty: Web of Science Versus Scopus and Google Scholar." Journal of the American Society for Information Science and Technology 58, no. 13 (2007): 2105–2125. doi:10.1002/asi.20677
3. Meho, Lokman I., and Cassidy R. Sugimoto. "Assessing the Scholarly Impact of Information Studies: A Tale of Two Citation Databases – Scopus and Web of Science." Journal of the American Society for Information Science and Technology 60, no. 12 (2009): 2499–2508. doi:10.1002/asi.21165
4. Bar-Ilan, Judit. "Which h-index?—A Comparison of WoS, Scopus and Google Scholar." Scientometrics 74, no. 2 (2008): 257–271. doi:10.1007/s11192-008-0216-y
5. De Groote, Sandra L., and Rebecca Raszewski. "Coverage of Google Scholar, Scopus, and Web of Science: A Case Study of the h-index in Nursing." Nursing Outlook 60, no. 6 (November–December 2012): 391–400. doi:10.1016/j.outlook.2012.04.007
6. Markpin, Teerasak, Nongyao Premkamolnetr, Santi Ittiritmeechai et al. "The Effects of Choice of Database and Data Retrieval Methods on Research Performance Evaluations of Asian Universities." Online Information Review 37, no. 4 (2013): 538–563. doi:10.1108/OIR-04-2012-0050
7. Torres-Salinas, Daniel, Emilio Delgado López-Cózar, and Evaristo Jiménez-Contreras. "Ranking of Departments and Researchers within a University Using Two Different Databases: Web of Science versus Scopus." Scientometrics 80, no. 3 (2009): 761–774. doi:10.1007/s11192-008-2113-9
8. Jacso, Peter. "Pragmatic Issues in Calculating and Comparing the Quantity and Quality of Research through Rating and Ranking of Researchers Based on Peer Reviews and Bibliometric Indicators from Web of Science, Scopus and Google Scholar." Online Information Review 34, no. 6 (2010): 972–982. doi:10.1108/14684521011099432


9. Jacso, Peter. "Google Scholar Metrics for Publications: The Software and Content Features of a New Open Access Bibliometric Service." Online Information Review 36, no. 4 (2012): 604–619. doi:10.1108/14684521211254121
10. Levine-Clark, Michael, and Esther L. Gil. "A Comparative Citation Analysis of Web of Science, Scopus, and Google Scholar." Journal of Business & Finance Librarianship 14, no. 1 (2009): 32–46. doi:10.1080/08963560802176348
11. Gasparyan, Armen Yuri, Nurbek A. Akazhanov, Alexander A. Voronov, and George D. Kitas. "Systematic and Open Identification of Researchers and Authors: Focus on Open Researcher and Contributor ID." Journal of Korean Medical Science 29, no. 11 (November 2014): 1453–1456. doi:10.3346/jkms.2014.29.11.1453

ABOUT THE AUTHORS

Alexandra Sarkozy, MSI ([email protected]) is Learning and Research Support Librarian, Purdy Kresge Library, Wayne State University, 5265 Cass Avenue, Detroit, MI 48202.

Alison Slyman, MLIS ([email protected]) is Learning and Research Support Librarian and Wendy Wu, MS, AHIP ([email protected]) is Learning and Research Support Librarian; both at Shiffman Medical Library, Wayne State University, 320 E. Canfield, Detroit, MI 48201.
