Journal of Telemedicine and Telecare http://jtt.sagepub.com/

How to improve your PubMed/MEDLINE searches: 1. background and basic searching Farhad Fatehi, Leonard C Gray and Richard Wootton J Telemed Telecare 2013 19: 479 originally published online 6 November 2013 DOI: 10.1177/1357633X13512061 The online version of this article can be found at: http://jtt.sagepub.com/content/19/8/479

Published by: http://www.sagepublications.com

Version of Record: Nov 29, 2013. OnlineFirst Version of Record: Nov 6, 2013.

Downloaded from jtt.sagepub.com at UNIVERSITAETBIBLIOTHEK on May 11, 2014

EDUCATION & PRACTICE/Praxis

How to improve your PubMed/MEDLINE searches: 1. background and basic searching

Journal of Telemedicine and Telecare 19(8) 479–486 © The Author(s) 2013 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/1357633X13512061 jtt.sagepub.com

Farhad Fatehi1,2, Leonard C Gray2,3 and Richard Wootton4,5

Summary

PubMed provides free access via the Internet to more than 23 million records, of which over 19 million are from the MEDLINE database of journal articles. PubMed also provides access to other databases, such as the NCBI Bookshelf. To perform a basic search, you can simply enter the search terms or the concept that you are looking for in the search box. However, taking care to clarify your key concepts may save much time later on, because a non-specific search is likely to produce an overwhelming number of result hits. One way to make your search more specific is to specify which field you want to search using field tags. By default, the results of a search are sorted by the date added to PubMed and displayed in summary format with 20 result hits (records) on each page. In summary format, the title of the article, list of authors, source of information (e.g. journal name followed by date of publication, volume, issue, pages) and the unique PubMed record number called the PubMed identifier (PMID) are shown. Although information is stored about the articles, PubMed/MEDLINE does not store the full text of the papers themselves. However, PubMed Central (PMC) stores more than 2.8 million articles (roughly 10% of the articles in PubMed) and provides free access to them.

Accepted: 1 October 2013

Introduction

PubMed is a web-based information retrieval system which was developed by the National Center for Biotechnology Information (NCBI) in the US. It provides free access to the widely used biomedical and life science database, MEDLINE. Because MEDLINE is the principal data source for PubMed, the two names (PubMed and MEDLINE) are commonly used interchangeably. However, PubMed also provides access to other databases (see Figure 1).

PubMed provides free access to more than 23 million records, of which over 19 million are from MEDLINE. However, the size of this dataset brings its own problems. As the number of records in a database increases, it becomes more and more difficult for its users to find relevant information quickly and accurately. Indeed, a large number of search results can overwhelm the user. A study on PubMed users in 2008 showed that the average number of results produced was 13,798 per search.1 Poor formulation of search queries is an obstacle in searching electronic databases.2,3

The Google search engine undertakes more than half of all web queries. It is popular because of its simplicity, speed and convenience. However, the ordinary Google web search is unsatisfactory for scientific purposes because the results may include every piece of digital information that is available online, regardless of the scientific reliability of the website that has published it.

Google Scholar was therefore developed to meet the needs of the research community. It inherited Google’s main searching features, but was targeted at information from scientifically sound sources. It has useful features such as citation analysis (previously available only from Web of Science and Scopus at considerable cost) and the “Related articles” feature, which finds documents similar to a selected article. A link to the full text is provided for most of the records through the institutional library that the user or the workstation is affiliated to, and various versions of each search result (from different online sources) are grouped together. Google Scholar can be viewed as complementary to PubMed. It provides access to a wide range of scientific information, much of which is non-peer reviewed.

1 School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran, Iran 2 Centre for Online Health, University of Queensland, Brisbane, Australia 3 Centre for Research in Geriatric Medicine, University of Queensland, Brisbane, Australia 4 Norwegian Centre for Integrated Care and Telemedicine, University Hospital of North Norway, Tromsø, Norway 5 Faculty of Health Sciences, University of Tromsø, Tromsø, Norway

Corresponding author: Farhad Fatehi, Centre for Online Health, Level 2, Building 33, Princess Alexandra Hospital, Brisbane, Australia. Email: [email protected]


Google Scholar can often be useful during a literature review to identify articles from the “grey” literature. However, it is not recommended for systematic reviews because the sources indexed by Google Scholar change over time, so it is not possible to repeat a search at a later date and obtain the same results. Perhaps the major advantage of Google Scholar over PubMed is the way that the search results are ranked in the results list.

In order to meet user expectations and improve the efficiency of literature searching, the retrieval system and the interface of PubMed have been enhanced.4 Most of the changes concern the interface and do not affect the functionalities or search processing.5 An understanding of these changes and enhancements will ease the task of obtaining relevant information. A recent study on the search patterns of PubMed users showed that most users do not employ the advanced features of PubMed in performing their search tasks, which may indicate a lack of knowledge about the capabilities of PubMed.6

The aim of this series of articles is to provide an overview of PubMed/MEDLINE, to introduce some of the recent features and to provide readers with hints about more efficient searching. Part 1 covers the background and basic searching.

Where did PubMed come from?

The history of PubMed/MEDLINE can be traced back to the end of the American Civil War when Dr John Billings, a field surgeon in the Union Army, was offered a job at the Surgeon General’s Office in 1865. In addition to other duties, he took charge of the office library which then comprised about 1800 books. Recalling a time-consuming and disappointing search for literature on epilepsy for his graduation thesis in 1860, he decided to establish a comprehensive medical library for American physicians and to prepare a comprehensive catalogue and index of the medical literature.7 The leadership of Billings was supplemented by the organisational skills of Dr Robert Fletcher, who joined the library in 1876 to pursue the idea of indexing the medical literature.

By 1876 the Library of the Surgeon General’s Office was the largest medical library in the US, containing more than 50,000 books, journals and pamphlets. Dr Billings started to prepare an index of the entire holdings of the library. This was the “Index Catalogue of the Library of the Surgeon General’s Office”, commonly referred to as the Index-Catalogue. The process of preparing the Index-Catalogue, typesetting and proof reading it, was entirely manual and thus quite time-consuming. That was why the first series of the Index-Catalogue, which required 16 volumes, took more than 15 years to complete: the first volume was published in 1880 and the last one in 1895. In total, five series of the Index-Catalogue were published from 1880 until 1961.

When it became obvious that publishing the whole Index-Catalogue would take several years, Dr Billings decided to publish an index of recent journals, as they arrived in the library, on a monthly basis. This was called “Index Medicus: a Monthly Classified Record of the Current Medical Literature of the World”, commonly referred to as Index Medicus. The first issue of Index Medicus was published in 1879 by a New York publisher, in the absence of funding from the US Congress. This index helped clinicians and researchers to find recent articles concerning a specific subject. To locate the journal articles related to cholera, for example, a researcher could search for cholera in the relevant volumes of Index Medicus. The same search in the Index-Catalogue would produce all the books, dissertations, pamphlets, reports and articles that could be found in the Library, from the oldest to the most recent.

Due to financial and logistical constraints, Index Medicus could not be published regularly, and in 1927 it was merged with a similar quarterly publication of the American Medical Association (AMA) and named the Quarterly Cumulative Index Medicus. It was then published quarterly until 1960. Another publication, the Cumulated Index Medicus, was published at the end of each year and included the contents of all the monthly issues of Index Medicus for that year. So a comprehensive search undertaken in, say, July of a particular year required the researcher to look up the topic in the six volumes of Index Medicus for that year (January – June), and also in the volumes of the Cumulated Index Medicus for the years before that. Starting in January 1960, Index Medicus was published each month using a mechanised system (based on IBM punched card processing machines and the Eastman Kodak Listomatic camera). This was replaced by a computerised system called MEDLARS in 1964. Index Medicus then continued until 2004, when publication ceased.8 The timeline is summarised in Table 1.

Figure 1. MEDLINE is the principal database of PubMed.

Table 1. Timeline of main events.

Date  Event
1836  Library of the Surgeon General’s Office (LSGO) established
1879  First issue of Index Medicus published (until 1926)
1916  Quarterly Cumulative Index (QCI) begins to be published by the American Medical Association (AMA)
1927  Index Medicus (published by LSGO) merged with QCI (published by AMA) and Quarterly Cumulative Index Medicus published jointly by AMA and LSGO
1956  LSGO transformed into National Library of Medicine (NLM)
1958  MEDLARS established
1960  Index Medicus resumed monthly publication by NLM
1960  First edition of Medical Subject Headings (MeSH) published by NLM
1964  First MEDLARS search centre outside the NLM established at UCLA
1966  British MEDLARS started (the first MEDLARS centre outside the US)
1971  MEDLINE introduced
1997  PubMed launched
2004  Publication of Index Medicus ceased

MEDLARS

In 1956, the US Congress transferred the control of the library from the Army to the Public Health Service and established the National Library of Medicine (NLM). Taking advantage of developments in electronic data processing, the newly formed NLM introduced mechanised indexing methods in a project called the Medical Literature Analysis and Retrieval System (MEDLARS).7 This meant that from 1964, the large database of medical literature could be searched by a computer. However, searching the bibliographic database of MEDLARS was not a simple and straightforward task. Specially trained librarians at the NLM were needed to translate each search request into machine code, punch the necessary cards, compile a batch of searches and then feed them into the computer system. The computer then searched the bibliographic database, which was stored on magnetic tapes, and retrieved the relevant records to be printed and delivered to the requester via parcel post. The turnaround time between submitting a search request and receiving the results was 4–6 weeks. Refining a search therefore required additional weeks of waiting time. While many users were disappointed with the results of MEDLARS searching in terms of low precision (irrelevant citations) and poor recall (missing citations), it was certainly less laborious than hand searching the bound volumes of Index Medicus. A typical machine search using MEDLARS cost £14 in the initial years of its operation in the UK, equivalent to about £200 at 2013 prices.9 This made manual searching uneconomic.
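The two complaints about MEDLARS searching, low precision and poor recall, can be made concrete with a small calculation. The following is a minimal sketch in Python and is not part of MEDLARS or PubMed; the function name and the record numbers in the usage note are hypothetical. Precision is taken as the share of retrieved citations that are relevant, and recall as the share of relevant citations that were retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: identifiers of the citations the search returned
    relevant:  identifiers of the citations that actually answer the question
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant citations that were found

    # Guard against empty sets so the sketch never divides by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, a search that returns four citations, two of which are among six relevant ones, has a precision of one half (two irrelevant citations) and a recall of one third (four missing citations).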

MEDLINE

To make it possible for individual libraries to perform bibliographic searches in real time, the NLM introduced MEDLARS onLINE (MEDLINE) in 1971. The telecommunication network used for this purpose was the commercial Tymshare network, which provided backbone communications to cities in the US and Europe at that time. The retrieval software, named ELHILL, was capable of searching different fields of the information records, including title, author, medical subject heading and publication date. The database that was used by MEDLARS comprised over 1.5 million records of medical literature, which made it one of the largest machine-readable databases in the world.

However, except in rare instances, the search was still mediated by a third party. Researchers needed to explain what they were looking for to a trained librarian, who translated the request into a proper search strategy (using a combination of subject headings and text words) and performed the search using terminals connected to the NLM computer. Connect time was too expensive to waste, so the searchers consulted the printed volumes of search terms first, to prepare the search strategy before going online. For the first time, it was possible to use the immediate feedback from the system to refine the search (Figure 2). A maximum of 25 result hits with full bibliographic details could be typed out. More results were available on request via the off-line print service at the NLM and were sent out by post.10

After two years of providing the service for free, the NLM introduced charges for using MEDLINE with the aim of controlling the rapid growth in its usage.11 The initial cost of using MEDLINE was $6 per terminal connect hour and 10¢ per page for off-line printing (equivalent to $30 per terminal connect hour and 50¢ per page for off-line prints in 2012). The rate was increased to $12/hour in 1976.

Figure 2. A computer terminal with built in acoustic coupler (Texas Instruments Silent 700) that was used for MEDLINE searches in 1982. The search results were printed onto a roll of heat-sensitive paper. The connection speed was typically 900 bit/s. (Photo credit: Cornell University Library).

MEDLINE on CD

When Compact Disks (CDs) became available for computer data storage, it became possible to make MEDLINE accessible locally. The first software for MEDLINE using CDs was developed by Cambridge Scientific Abstracts (now a division of ProQuest).12 Similar products were then introduced by other companies, including Silverplatter (using the SPIRS interface) and CD Plus Technologies (using the OVID interface). To keep up to date with the new records added to the database, institutions received monthly or quarterly updates of MEDLINE on CD. Although the underlying database was identical, different interfaces offered different functionalities and features, thus producing different sets of results for a given search query.13 By subscribing to one of these MEDLINE-on-CD products, universities and institutions eliminated the costs and technical difficulties of using MEDLINE remotely.

PubMed

Following the development of the Internet and the World Wide Web, the NLM made access to MEDLINE freely available to the public in June 1997. The service was called PubMed (Public MEDLINE) and grew out of the Entrez project at the National Center for Biotechnology Information (NCBI), which is part of the NLM. In fact the PubMed website offers the option of searching any of the 43 databases which are maintained by the NCBI, most of which are related to molecular biology.14 MEDLINE, which is the principal database of PubMed, currently indexes approximately 5600 journals in 39 languages.15 In addition to MEDLINE, PubMed searches records from OLDMEDLINE, some additional life science journals, NIH-funded research manuscripts and a series of online books available on the NCBI Bookshelf. While MEDLINE generally covers material published from 1946 onwards, the oldest PubMed records date back to 1920. Other online search engines like OVID SP and Embase provide access to MEDLINE as well. However, they will not produce identical search results because they employ different search algorithms.

How do I perform a simple search?

The PubMed homepage (http://www.ncbi.nlm.nih.gov/pubmed/; also accessible from http://www.pubmed.com and http://www.pubmed.gov) provides a simple search box and some hyperlinks to tutorials, tools and other resources (Figure 3). To perform a basic search, you can simply enter the search term(s) or the concept that you are looking for in the search box. Then press the Enter key or click on the Search button. While you are typing the search term into the search box, an autocomplete feature will suggest popular terms: these appear as a list so that you can select one of them instead of typing the whole phrase (Figure 4). This feature can be turned off if not required.

Before performing a search, you should think carefully about the key concepts and terms pertaining to your research question. A small amount of time spent on better identification of your key concepts may save much time later on. There are millions of articles in PubMed, and its search algorithm has been designed to retrieve as many relevant articles as possible, so you should aim to be as specific as possible when choosing your search terms. For example, a search for ‘telemedicine’ will produce over 16,000 hits spread over 800 pages of results, and the most relevant results will not necessarily appear at the top of the list. This is not likely to be very helpful.

Searching with PubMed, like other online search engines, is an iterative process. You develop your search strategy, launch the search, check the results, and if needed, modify the search strategy and launch another search to retrieve the results that answer your research question.

What does the results page show?

Once the search terms have been entered and the search command has been issued (by pressing the Enter key or clicking on the Search button), PubMed will search the underlying databases, retrieve the information and display the resulting records. By default, the results are sorted by the date added to PubMed and displayed in summary format with 20 result hits (records) on each page. In summary format, the title of the article, list of authors, source of information (e.g. journal name followed by date of publication, volume, issue, pages) and the unique PubMed record number called the PubMed identifier (PMID) are shown. The search terms, as well as the terms that PubMed may add to the search query to enhance the search result, are shown in bold to help users locate them in various fields of each record (Figure 5-A).
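The paging arithmetic behind a results list (for example, 16,000 hits at 20 records per page giving 800 pages) can be sketched in a few lines of Python. This is an illustration only and not the way the PubMed website itself is described in this article: it builds a request address for NCBI's E-utilities ESearch service, which is not covered here, and the function names are hypothetical. The parameters db, term, retmax and retstart are assumed to select the database, the query, the page size and the offset of the first record. No request is actually sent.

```python
import math
from urllib.parse import urlencode

# Base address of the ESearch service (an assumption of this sketch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pages_needed(hit_count, page_size=20):
    """Number of result pages needed to show hit_count records."""
    return math.ceil(hit_count / page_size)


def build_search_url(term, page=1, page_size=20):
    """Build the address that would request one page of results for a term."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    params = {
        "db": "pubmed",                        # search the PubMed database
        "term": term,                          # the search term(s) as typed
        "retmax": page_size,                   # records per page
        "retstart": (page - 1) * page_size,    # offset of the first record
    }
    return ESEARCH_URL + "?" + urlencode(params)
```

Calling `pages_needed(16000)` reproduces the 800 pages quoted for the ‘telemedicine’ example, and `build_search_url("telemedicine", page=2)` asks for the records that would appear on the second page. Making the term more specific, rather than paging further, is usually the better response to such a long list.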


Figure 3. PubMed home page.

Figure 4. The Autocomplete feature.


Figure 5. The results screen. Note that the example searches in this article were carried out in September 2013 and the number of results produced will change with time as more records are added to the databases. There may also be further developments of the PubMed interface which will alter the results produced by the example searches.

For articles in a language other than English, the title is translated into English and placed within brackets. In that case, the original language is indicated as an additional notation. Additional notations such as Publication type or “No abstract available” will be displayed if applicable. The journal name for each record is displayed in abbreviated form. This abbreviation is from the standard abbreviations of journal names that have been adopted by most biomedical journals. By placing the mouse over the abbreviated journal name, the full name of the journal is shown.

The results page also provides the user with tools and options to limit the results, modify the search, or launch a modified search based on the current search. On the left-hand side of the page are the most popular filters that can be used to limit the results based on various criteria such as Article types and Publication date (Figure 5-B). On the right-hand side of the results page is the Discovery Column (Figure 5-C). If the number of results is large enough, a histogram showing the trend of articles during the past years is shown at the top right of the page. The number of articles in each year is shown when the mouse goes over each bar in the histogram and the data can be downloaded in CSV format. Other sections that may appear in the Discovery Column include PMC Image search, articles with the search term in their title, Articles with free full text available, and suggested searches. Each section in this column can be collapsed by clicking on the Show/hide content button (a small square with a triangle in it). You can find information related to your search in other NCBI databases by using the dropdown list Find related in the Discovery Column.

On the right-hand side of the screen, there is also a section called Search details (Figure 5-D). This shows how PubMed has translated the search query. This is a very important piece of information for debugging a search query if it produces an unexpected number of results or non-relevant results.

By clicking on the title of a record, the full record information is displayed in Abstract format (Figure 6). This shows the abstract of the article, along with the affiliation and address of the first author, link to full-text, links to Related Citations, and Links to external sources. You can easily search for additional publications by each author just by clicking on the author names (Figure 6-A). For articles whose full-text is freely available through PubMed, a preview of images in the paper is also shown. If a PubMed search produces a single result, that result will be shown in Abstract format by default, instead of Summary. Links to related information for each record are shown, if available, in the right side bar in Abstract view (Figure 6-B).

A study on the search pattern of PubMed users showed that 80% of clicks for viewing the abstract occurred in the first 20 results that are shown on the first page.1 Also it has been shown that most people rarely examine the results beyond the first 20 records.16

Figure 6. A PubMed record in Abstract view. Author names are hyperlinked and links to related information in other NCBI databases are provided.

How do I obtain the full text of an article?

Although information is stored about the articles, PubMed/MEDLINE does not store the full text of the papers themselves. It can sometimes provide a link to the full text, but this may require a subscription to the journal concerned, either individual or institutional. Alternatively, the full text of a paper may be available for free, e.g. if it has been published under an Open Access agreement. Other arrangements that affect the availability of full-text articles can be invisible to users. For example, an article that is available in full text from a University office may not be available to the same user when connected from home. This will be determined by the Internet address of the device from which the user is visiting the PubMed website.

Currently, PubMed Central (PMC), a service of the NCBI, stores more than 2.8 million articles (roughly 10% of the articles indexed in the MEDLINE database) and provides free access to them. If an article is stored in PMC, a link to the full text will appear when its abstract is displayed in PubMed. Each journal or publisher decides whether or not to deposit its contents in PMC and thus provide free access to its articles from the PMC website. At present, more than 1300 journals are full participants of PMC and more than 2200 other journals deposit selected articles in PMC. Access to PMC is available on the Internet via the following address: http://www.ncbi.nlm.nih.gov/pmc/.

Table 2. The 10 most commonly-used PubMed search field tags and their abbreviations.

Field tag               Abbreviation
[Author]                [AU]
[Date of Publication]   [DP]
[Journal]               [JO]
[Pagination]            [PG]
[Volume]                [VI]
[Issue]                 [IP]
[MeSH Terms]            [MH]
[Language]              [LA]
[Title]                 [TI]
[Title/Abstract]        [TIAB]
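The field tags listed in Table 2 are attached to a search term by writing the tag in square brackets directly after the term, as in Jackson[Author]. The following minimal Python sketch shows how such tagged terms could be assembled before being typed or pasted into the search box. The function names are hypothetical, and the use of the Boolean operator AND to join terms is an assumption of this sketch, since combining terms is not covered in this part of the series.

```python
# Field tags and abbreviations as listed in Table 2.
FIELD_TAGS = {
    "Author": "AU",
    "Date of Publication": "DP",
    "Journal": "JO",
    "Pagination": "PG",
    "Volume": "VI",
    "Issue": "IP",
    "MeSH Terms": "MH",
    "Language": "LA",
    "Title": "TI",
    "Title/Abstract": "TIAB",
}


def tag_term(term, field, abbreviate=False):
    """Return the term followed by its field tag, e.g. Jackson[Author]."""
    if field not in FIELD_TAGS:
        raise ValueError(f"unknown field: {field}")
    label = FIELD_TAGS[field] if abbreviate else field
    return f"{term}[{label}]"


def combine(tagged_terms, operator="AND"):
    """Join tagged terms with a Boolean operator (an assumption, see above)."""
    return f" {operator} ".join(tagged_terms)


query = combine([
    tag_term("Jackson", "Author"),
    tag_term("telemedicine", "Title"),
])
print(query)  # Jackson[Author] AND telemedicine[Title]
```

Restricting each term to one field in this way is one route to the more specific search recommended earlier, because a name such as Jackson is then matched only against authors and not against every field of the record.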

How do I search in specific fields?

MEDLINE is a structured database, which means that information about the articles is stored in a systematic form. Each database record contains data elements or fields such as title, author and publication date. There are more than 65 fields in the database. Some of the fields are optional and therefore not displayed for every record (e.g. grant number). A detailed description of PubMed/MEDLINE fields is available from http://www.nlm.nih.gov/bsd/mms/medlineelements.html. Using PubMed you can search in 38 fields using the advanced search page (see part 2). Alternatively, you can specify which field you want to search using field tags (see Table 2). Each search term can be tagged with the field name or its abbreviation enclosed in square brackets (e.g. Jackson[Author]). Although using field tags in a search query is important for optimum information retrieval in PubMed, in practice very few users include field tags in their search query.6

References

1. Islamaj Dogan R, Murray GC, Névéol A, Lu Z. Understanding PubMed user search behavior through log analysis. Database (Oxford) 2009;2009:bap018.
2. Ely JW, Osheroff JA, Chambliss ML, Ebell MH, Rosenbaum ME. Answering physicians’ clinical questions: obstacles and potential solutions. J Am Med Inform Assoc 2005;12:217–24.
3. Hoogendam A, Stalenhoef AF, Robbé PF, Overbeke AJ. Analysis of queries sent to PubMed at the point of care: Observation of search behaviour in a medical teaching hospital. BMC Med Inform Decis Mak 2008;8:42.
4. Lu Z. PubMed and beyond: a survey of web tools for searching biomedical literature. Database (Oxford) 2011;2011:baq036.
5. Giglia E, Spinelli O. PubMed reloaded: new interface, enhanced discovery. Eur J Phys Rehabil Med 2009;45:631–36.
6. Mosa AS, Yoo I. A study on PubMed search tag usage pattern: association rule mining of a full-day PubMed query log. BMC Med Inform Decis Mak 2013;13:8.
7. Kunz J. Index Medicus. A century of medical citation. JAMA 1979;241:387–90.
8. NLM. FAQ: Index Medicus Chronology. See http://www.nlm.nih.gov/services/indexmedicus.html (last checked 30 September 2013).
9. [No authors listed] First thoughts on MEDLARS. Lancet 1969;7599:818–19.
10. [No authors listed] MEDLINE. Lancet 1973;7804:650.
11. McCarn DB. MEDLINE users, usage and economics. Med Inform (Lond) 1978;3:177–83.
12. Capodagli JA, Mardikian J, Uva PA. MEDLINE on compact disc: end-user searching on Compact Cambridge. Bull Med Libr Assoc 1988;76:181–3.
13. Schoonbaert D. SPIRS, WinSPIRS, and OVID: a comparison of three MEDLINE-on-CD-ROM interfaces. Bull Med Libr Assoc 1996;84:63–70.
14. NLM. NCBI News August 1997. See http://www.ncbi.nlm.nih.gov/Web/Newsltr/aug97.html (last checked 1 October 2013).
15. NLM. Fact sheet MEDLINE. See http://www.nlm.nih.gov/pubs/factsheets/medline.html (last checked 30 September 2013).
16. Eysenbach G, Powell J, Kuss O, Sa ER. Empirical studies assessing the quality of health information for consumers on the world wide web: a systematic review. JAMA 2002;287:2691–700.
