TOXICOLOGICAL SCIENCES, 143(1), 2015, 3–5 doi: 10.1093/toxsci/kfu237 EDITORIAL

EDITORIAL

Data Sharing in Toxicology: Beyond Show and Tell

Gary W. Miller

For correspondence via fax: (404) 727-9853. E-mail: [email protected]

If authors can’t figure out how to share data in meaningful ways then journals shouldn’t publish their work.

BACK TO SCHOOL

In the past several months I have participated in numerous discussions about data sharing. It reminded me of kindergarten. The teacher would ask the students to bring in objects for “show and tell.” I recall proudly showing some artifact of childhood to my peers. I would beam as my friends acknowledged the treasure I held with “oohs” and “ahhs.” Yet, minutes later, when a friend asked to hold it or play with it, I would reflexively wince as a wave of unease washed over me. What happened? I had something very exciting to share with my friends. Why did I bristle at the logical step of letting my friends experience what I had? I suspect that the visceral resistance to sharing may not be such a bad thing. Those feelings emanate from the strong connection with the object. When one has cared for and nurtured an item, one feels an obligation to protect it. In kindergarten, we were encouraged to overcome these feelings. In fact, when possessions are shared with like-minded friends, the experience around that object can grow and lead to new avenues of play and interaction (hobbies, clubs, etc.). My teacher was correct. It is good to share.

MODERN DAY SHOW AND TELL

As scientists we continue to participate in a form of “show and tell.” We do this in several different venues. From group laboratory meetings, to regional or national conferences, to grant applications, to manuscripts, we show the results of our studies and tell the story to others. Of course the published manuscript is the standard that is deposited into the archives of science. Nobel prizes are not awarded for presenting posters or discussing good ideas in the break room. Nor do major advances occur in the field of toxicology without publishing in respected outlets. This is why the Society of Toxicology sponsors Toxicological Sciences. It is essential to provide a high quality forum for the dissemination, that is, sharing, of our research findings in the form of peer-reviewed papers.

Publishing work in scientific journals is certainly one way to share data, but there is more to it. Authors are expected to share reagents and tools to allow for replication of work and to allow others to pursue new lines of research. With the ever-expanding data-intensive approaches, the data themselves become the reagent. What does it mean to share the reagent we call data? Merely depositing massive data sets in an online archive is analogous to sending a cDNA construct without a map or restriction sites. Yes, you shared the material, but you made it really difficult for the person to actually use it. Appropriate sharing of a cDNA construct involves providing the validated material with a detailed map that includes information on antibiotic resistance, promoters, and the multiple cloning sites. Similarly, appropriate sharing of an antibody includes information on the material used for immunization, working titers, any special conditions needed, and a blot showing the specificity. If we take such steps to share simple reagents, why then do we not demand a similar, if not greater, level of detail for our data?
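As a rough illustration of the analogy above, a data set could travel with a small machine-readable descriptor, much as a construct travels with its map. The sketch below is only an assumption about what such a descriptor might contain: the field names, the `missing_fields` and `build_record` helpers, and the example file name are hypothetical and do not reflect any requirement of this journal or of an existing repository.

```python
import json

# Hypothetical descriptors a shared data set might carry. These names are
# illustrative assumptions, not a published metadata standard.
REQUIRED_FIELDS = (
    "title",
    "organism",
    "chemical",
    "dose_units",
    "assay",
    "file_format",
    "contact",
)


def missing_fields(descriptor):
    """Return the required fields that are absent or blank in a descriptor."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = descriptor.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def build_record(descriptor, data_file):
    """Bundle a data file name with its descriptor as JSON.

    Raises ValueError if the descriptor is incomplete, so that a data set
    cannot be packaged without its "map".
    """
    missing = missing_fields(descriptor)
    if missing:
        raise ValueError("descriptor is missing: " + ", ".join(missing))
    record = {"data_file": data_file, "descriptor": descriptor}
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Entirely made-up example values, for illustration only.
    example = {
        "title": "Example expression study",
        "organism": "Mus musculus",
        "chemical": "example compound",
        "dose_units": "mg/kg",
        "assay": "RNA-seq",
        "file_format": "CSV",
        "contact": "corresponding author",
    }
    print(build_record(example, "expression_matrix.csv"))
```

The design choice here is simply that packaging fails when a descriptor is incomplete; which descriptors are actually needed would have to be settled by the field.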

SHARING DATA VERSUS DATA SHARING

When authors submit manuscripts to a scientific journal they are all doing the same thing: sharing the data from their scientific pursuits. If sharing data is at the heart of the scientific process, why does switching the order of the two words to make data sharing make it such a contentious topic? It comes down to a matter of degree. No scientist would expect a scientific paper to be published without providing data. They willingly provide their data as part of the submitted manuscript, but it is shared in a very controlled manner. The authors conduct the experiments, analyze the data, and present only the final figures or tables. The raw images, smeared blots, and failed attempts are almost never seen outside of the laboratory. I want to see the beautiful figures that depict the results of an expertly designed experiment, but I also want to see the rigorous experimental controls and appropriate statistical analysis that I described in Improving Reproducibility in Toxicology (Miller, 2014). When appropriate, I also want to have access to the data and reagents so that experiments can be reproduced. In the modern day vernacular, the term data sharing refers to the practice in which original data are made publicly available and usable to others, typically without specific permission from or consultation with the originator. The complete relegation of the data feels like giving up that toy in kindergarten.

© The Author 2014. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved. For Permissions, please e-mail: [email protected]

Downloaded from http://toxsci.oxfordjournals.org/ at University of Otago Science Library on July 2, 2015

Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA 30322.

TIME TO GROW UP

The kindergarten days are long gone. We have had to move on in adulthood with more responsibilities and fewer naps and recesses. Toxicology has also grown up. Our tools have changed. We now have transcriptomics, proteomics, lipidomics, two-photon microscopy, computational toxicology, and systems biology. Just because these new approaches generate massive amounts of data doesn’t absolve the investigator from providing access to their data. In fact, when we rely on extraordinarily large data sets it becomes even more important to provide the field with access to the data so that the work can be checked. We have developed an intuitive sense of a figure that does not look just right and requires examination of the original blot, but the computations that underlie the -omic type data defy intuition. A reviewer cannot sense inappropriate analysis or even impropriety as readily as can be done with traditional laboratory approaches. The early studies in toxicogenomics were met with great excitement. They were also met with incomplete expectation of reproducibility and sharing, and much of it ended up being rubbish. Now that appropriate platforms for analyzing, storing, and sharing these data are available, this subdiscipline can make meaningful contributions. The Comparative Toxicogenomics Database (http://ctdbase.org), funded by the National Institute of Environmental Health Sciences, and the Data Infrastructure for Chemical Safety (DiXa, http://www.dixa-fp7.eu), hosted by the European Bioinformatics Institute at the European Molecular Biology Laboratory, are examples of efforts to provide platforms for data exchange and analysis. The goal of these initiatives is to archive the data in a fashion that allows users to query and reanalyze data. The results of toxicological studies can significantly affect human and economic health by helping craft policies and regulations that impact major industries.

Our data must be of the highest quality, which requires us to submit our data to a high level of scrutiny from the general scientific community. This requires some level of data sharing. As we have in the past, Toxicological Sciences will continue to require access to the data necessary to determine whether or not the experiments and conclusions are sound. Currently, this is performed on an “as needed” basis. The editorial team at Toxicological Sciences is not ready to institute a mandatory system for data sharing and may never do so. The needed infrastructure for comprehensive data sharing is lagging far behind the actual data. What I would like to see in the upcoming years is that funders require investigators to deposit data into a preapproved data archive; the journal could then help enforce such requirements. For the time being, Toxicological Sciences will rely on supplemental data to provide critical data needed to fully interpret the conclusions in the paper.

GROWING PAINS

One of the current challenges is that many of the database structures used for data sharing are funded by grants, which have an end date. The concept of data evaporation is somewhat frightening. A significant amount of data just disappears into the ether as these programs and systems lose funding or are retired due to obsolescence. Thus, while the information may be available at the time of publication, there is a significant risk of it evaporating over time. This is a serious issue that needs to be addressed. In the meantime, even with its fixed nature, supplemental data should be as robust and permanent as the published manuscript itself. If authors are not prepared to help develop and maintain data sharing repositories and systems, then they should not dabble in these sophisticated approaches.

I have long argued that the growth industry in toxicology is informatics (not unlike all of biological and biomedical sciences). This is not to say that the mechanistic and regulatory toxicologists are going to be phased out, but that the data generated and interpreted by these stalwarts of the field will require assistance from bioinformaticists. One significant challenge we face with data sharing in toxicology is that we do not agree on how to describe chemicals and exposures. Although such ontologies are in place in genomics and many other biological domains, environmental health sciences and toxicology lack such a system. Efforts are underway to develop an environmental health sciences language standard that addresses this concern. Such a system would facilitate meta-tagging (the process of adding behind-the-scenes descriptors or codes that allow systematic and automated organization of data on web platforms or other computation-based systems) and sharing of data in toxicology. Creating a pile of data that cannot be interrogated and exchanged is not helpful.

I still feel that pang of kindergarten angst when I am asked to share data. I do not hesitate to share the data in the form of a manuscript, in which I control the analysis and presentation of the data. In fact, I love this form of show and tell. The idea of having other investigators reuse or manipulate my data can be slightly disconcerting. However, with proper systems that include quality control, transparency, and appropriate attribution of credit, these concerns dissipate. It would be great for authors of papers in Toxicological Sciences to know that their data would be systematically meta-tagged and incorporated into a large sharable database once the work was published. Knowing that data will be part of a grander scheme may make authors focus more on the quality and integrity that I discussed in Improving Reproducibility in Toxicology (Miller, 2014). Providing more detail about the experimental design and analysis makes reproducibility more likely, and this is especially true for big data-based approaches.

Publishing in Toxicological Sciences requires more than “show and tell.” We expect authors to make their prized data available in a manner not unlike what we have been encouraged to do since childhood. In addition to developing the right tools, we must also commit to being transparent with our data. Ultimately, this may require certain data standards that permit meta-tagging and archiving. I challenge our readers to think about what we can do to improve our capacity and willingness to share data. The journal fully supports efforts by funding agencies to move towards systems that allow this type of access and sharing, and we will help enforce the rules put forth by funding agencies. The editorial staff at Toxicological Sciences will continue to examine our policies and those of other publishers and funders. We plan to work with our readers and other stakeholders to assure that our journal is taking steps in the right direction. These efforts will help strengthen our field by providing a higher degree of integrity to our findings and extracting more information from the work we have completed through merging of data sets and data mining. For now, it is time for me to head back to the playground that is my laboratory. Some things never change.


ACKNOWLEDGMENTS

The author would like to thank the following individuals for reading the manuscript and providing feedback: Dr Norbert Kaminski, Dr Peter Goering, Dr Douglas Keller, Dr Patricia Ganey, Dr Matthew Campen, Dr Rory Conolly, and Dr Patti Miller.

REFERENCE

Miller, G. W. (2014). Improving reproducibility in toxicology. Toxicol. Sci. 139, 1–3.

