INSIGHTS Downloaded from http://science.sciencemag.org/ on August 21, 2017

LETTERS

Artificial intelligence in research

We asked young scientists to describe an example of artificial intelligence or machine learning in research, its broader implications in the field, and the challenges scientists face when using such technologies. Our survey’s responses reflected a variety of countries and fields, but only 6% came from women (compared to the typical 30 to 40%). Excerpts of some of the responses we received are printed below.

Jennifer Sills

Surgical robots are gradually being adopted to perform complicated surgical interventions, including minimally invasive and surgeon-less surgeries. The implementation of robotic surgery amplifies the effects of automation, allowing work around the clock with higher productivity, accuracy, and efficiency as well as shorter hospital stays and faster recovery. Because surgeons no longer have to perform the whole surgery, they do not get tired as easily and may perform multiple procedures, thus decreasing patient waiting periods. These novel techniques and machines may enhance patient outcomes. The biggest concern is how to address (or even prevent) technical difficulties in the midst of a surgery. Ethical concerns and loss of relevance of surgeons need to be addressed as well. As such machines perform surgeries that require cognitive and decision-making abilities that have social, moral, and clinical consequences, programmers are faced with an added layer of quandary: how to equip artificial intelligence with tools to handle the inherent moral responsibility associated with such tasks. Moreover, can robots ever be as proficient as humans in performing surgeries? Who will take responsibility if a surgery fails due to poor judgment? These questions need to be addressed before we allow machines to perform the role of clinicians. Mrinal Musib Department of Biomedical Engineering, National University of Singapore, Singapore 129800, Singapore. Email: [email protected]

Machine-learning techniques and lightweight unmanned aerial vehicles are revolutionizing environment monitoring. In the past, investigating vegetation and wildlife status required extensive surveys of the area. Now, unmanned aerial vehicles can access hard-to-reach places and quickly capture vegetation types, area and wildlife count, and activity based on high-resolution images. The results derived from these new techniques are even more precise than estimates made by the traditional ground-based method, although comparing data derived from new techniques with historic data sets often presents a challenge. Feng Wang Institute of Desertification Studies, Chinese Academy of Forestry, Beijing, 100091, China. Email: [email protected]

7 JULY 2017 • VOL 357 ISSUE 6346

Published by AAAS

PHOTO: MEDIA FOR MEDICAL/UIG VIA GETTY IMAGES

NEXTGEN VOICES

A 2016 collaboration taught computers how to recognize experimental procedures, allowing the extraction of 1 million unique chemistry reactions from 150,000 patent documents spanning 40 years. A scientist could then observe how trends in reactions and properties of synthesized products changed. Generally, molecules in this data set became bigger, greasier, and flatter over time. Previously, studies were limited to far fewer documents, and reactions had to be extracted by eye and redrawn by hand. The biggest challenge of machine learning is cleaning the data. Most lab data today aren’t structured to permit straightforward data mining or machine learning. Most academic (and some industrial) labs still depend on paper lab notebooks, which then require translation into digital media. Scientists offer incomplete pictures of experiments, leaving out critical components such as temperature, time, or yield, and struggle when trying to retroactively include such information. Strict ontologies aren’t always observed; do “warmed” and “heated” mean the same thing? Although it can feel oppressive and creativity-killing, standardization to templated reports and experiments will greatly improve our ability to learn from large data sets. Michael A. Tarselli NIBR Informatics (NX), Novartis Institutes for BioMedical Research, Cambridge, MA 02139, USA. Email: [email protected]

Rachel Yoho CREATE for STEM Institute, Michigan State University, East Lansing, MI 48824, USA. Email: [email protected]

Well-designed machine-learning algorithms are useful for predicting cancer patients’ prognoses. However, to make clinical prediction models work well, researchers need to fine-tune the algorithms’ parameters. In the past, choosing the optimal parameters required trial and error, biomedical domain knowledge, and ingenuity in model design. With the recent advancement of automated

SCIENCE sciencemag.org

Kun-Hsing Yu Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA. Email: [email protected]

Plant biologists use software based on machine learning to predict protein-protein interactions and to find out how proteins interact, knowledge we can use to improve crop yield during drought conditions. One potential problem with machine learning is the possibility that everything will be automated, meaning that some lab positions will no longer be needed. Rigoberto Medina Andrés Instituto de Investigación en Ciencias Básicas y Aplicadas, Universidad Autónoma del Estado de Morelos, Cuernavaca, Morelos 62209, Mexico. Email: [email protected]

A typical next-generation sequencing experiment to catalog the mutations present in a group of patient cancer samples produces terabytes of data. The goal of such studies is to identify those mutations that play an important role in tumor biology. However, separating out the signal from the noise is extremely challenging with such large data sets. Machine-learning techniques have allowed researchers to identify patterns within these data in order to systematically remove the noise. This increased accuracy means we can be more confident in our findings. The black-box nature of many machine-learning packages makes errors more challenging to identify. Scientists must be especially vigilant that they are using high-quality, well-annotated data when training machine-learning models. Noah F. Greenwald Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA. Email: [email protected]


Surgical robots could improve patient outcomes, but they also raise a host of ethical issues.

Machine learning has recently been applied in undergraduate science education to analyze student writing. In “constructed response” questions, students respond in their own words to a prompt. A test data set of student responses is evaluated by a group of experts using rubrics designed to characterize student understanding and reveal potential misunderstandings. The expert evaluations are then used to train machine-learning models to predict how experts would rate new student responses. Using machine learning in the classroom allows instructors to understand how their students are thinking about scientific concepts based on their written thoughts and ideas. Importantly, these educational techniques can be scaled up to large classrooms. Instead of spending hours reading through student writing, instructors can gain information about their students in a matter of minutes. Other than calibrating rubrics to effectively predict expert evaluations, one of the main challenges is the integration into the classroom for educational and research purposes. Often, individuals are skeptical about the abilities of computers to characterize student thinking about complex topics. However, agreement between the computer and expert scoring is similar to the agreement between human experts. Increased research and communication of the method will likely increase the uptake and use of this scalable educational resource to gather previously time-prohibitive information on student thinking in the classroom.
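A claim that computer-expert agreement matches expert-expert agreement is usually checked with a chance-corrected statistic. The sketch below assumes Cohen's kappa as that statistic (the letter does not name one), and the rubric scores are invented purely for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Fraction of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's score frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always used the same single score
    return (observed - expected) / (1.0 - expected)


# Hypothetical rubric scores (0, 1, or 2) for eight student responses.
expert_1 = [2, 1, 0, 2, 1, 1, 0, 2]
expert_2 = [2, 1, 0, 1, 1, 1, 0, 2]
model = [2, 1, 0, 2, 1, 0, 0, 2]

human_human = cohen_kappa(expert_1, expert_2)
human_model = cohen_kappa(expert_1, model)
print(f"expert vs expert: {human_human:.2f}")
print(f"expert vs model:  {human_model:.2f}")
```

If the two kappa values are close, the model disagrees with an expert about as often as a second expert would, which is the comparison the letter describes.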

parameter optimization, we can delegate the model fine-tuning process to another set of machine-learning algorithms, which has facilitated the development of precision medicine. Machine learning is not immune from the “garbage in, garbage out” rule. The generalizability of the machine-learning models depends on the representativeness of the training data. Models built on distorted training data can result in inaccurate assessments and perpetuate biases. One recent report showed that a beauty contest judged by an experimental machine-learning system discriminated against people with dark skin, simply because the developers did not include enough minorities when building the system. Similarly, even if certain demographic variables yield effective clinical predictions, we need to be careful when evaluating their real-world applicability and implications.
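Automated parameter optimization, as mentioned in Kun-Hsing Yu's letter, can take many forms. The sketch below uses plain random search, one of the simplest, and is not the specific method the letter refers to. The parameter names and the scoring function are hypothetical stand-ins for "train a model, then score it on held-out data".

```python
import random


def random_search(score_fn, space, n_trials=50, seed=0):
    """Sample parameter settings at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw each parameter uniformly from its (low, high) range.
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(params):
    """Hypothetical validation score that peaks at learning_rate=0.1, penalty=1.0."""
    return -(params["learning_rate"] - 0.1) ** 2 - (params["penalty"] - 1.0) ** 2


# Assumed search ranges, chosen only for this illustration.
space = {"learning_rate": (0.001, 1.0), "penalty": (0.0, 5.0)}
best_params, best_score = random_search(toy_validation_score, space, n_trials=200)
print(best_params, best_score)
```

In practice the score should come from data the model was not trained on. Otherwise the search rewards settings that fit the training set's quirks, which is the representativeness problem the letter warns about.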


Xubin Pan Institute of Plant Quarantine, Chinese Academy of Inspection and Quarantine, Chaoyang, Beijing 100029, China. Email: [email protected]

Chien-Hsiu Lee Subaru Telescope, National Astronomical Observatory of Japan, Hilo, HI 96720, USA. Email: [email protected]

Building on intelligent unmanned aerial vehicle technology, fire safety researchers have developed unmanned aerial vehicles to obtain real-time images and monitoring data of fire scenes and then make timely decisions about emergency rescue. With the help of unmanned aerial vehicles, firefighters can overcome bad weather conditions, and researchers can observe the dynamic changes of fire behavior to help protect firefighters’ lives. The stakes of malfunction in these machines are high. Inaccurate data could prompt scientists to put firefighters in danger, and hackers could create privacy risks. Jian Zhang School of Safety and Environmental Engineering, Hunan Institute of Technology, Hengyang, Hunan 421002, China. Email: [email protected]

Determining the difference between benign and malignant lesions is a constant struggle for medical doctors worldwide. Earlier this year, Stanford University researchers developed a deep-learning algorithm that could classify skin cancers as accurately as a panel of board-certified dermatologists. Such automated classification has the potential to improve the consistency, sensitivity, and specificity of precise lesion categorization. Moreover, smartphone applications using this algorithm could result in low-cost, universal access to vital diagnostic care. If the algorithm is developed into a smartphone application, a looming challenge for the field will be the accurate use of the technology by the end user. Insufficient photography or tracking lesions at infrequent intervals could result in the misdiagnosis of malignant skin cancers. Dealing with liability issues and the potential public health outcomes will need to be closely considered moving forward. Ken Dutton-Regester QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia. Email: [email protected]

The ability of artificial intelligence to interpret clinical data has the potential to revolutionize the clinical trial recruitment process. There are many considerations that factor into clinical trial recruitment, and matching an ill patient to the optimal trial is a formidable challenge. It is equally challenging for clinical studies to recruit their desired number of trial participants. Artificial intelligence can bridge this gap, analyzing mass data from medical records and clinical trials to facilitate participant recruitment. An important risk to consider when using artificial intelligence in clinical research is the shift of influence, power, and accountability from physician to machine. As clinical artificial intelligence is developed further, patients may begin to heed advice from machines over physicians. This could have negative implications: Artificial intelligence technology is based on mathematical algorithms that do not have a physician’s ability to see the big picture or take into consideration less quantifiable factors that affect a patient’s health, such as lifestyle or diet. Jake Wyatt Johnston Department of Surgery, Vancouver General Hospital, Vancouver, BC V5Z 1M9, Canada. Email: [email protected]

Developing decision-tree algorithms can help us detect a more complex mixture of gases more efficiently, an ability that could be applied to environmental air monitoring and oil well investigations. One risk when using decision trees is that a miscalculation at the beginning of the tree will affect any outcome built on the first decision. Decision trees require a lot of parameters to be effective, and each of these parameters needs to be optimized. However, overcomplicating the decision-tree algorithm will lead to overfitting that will not allow the determination of general trends. Icell Mahmoud Sharafeldin Environmental Engineering, The American University in Cairo, Cairo 11835, Egypt. Email: [email protected]

Unmanned aerial vehicles are helping scientists access remote locations and monitor fires.


In astronomy, machine learning is pivotal in time-domain surveys. A good example is the Palomar Transient Factory (PTF), which uses a wide-field camera on a small telescope for sky surveys, feeding intriguing objects to another dedicated telescope to follow up. In time-domain studies, every second counts, but it is difficult to maintain dedicated researchers every hour of every day to examine all intriguing objects and make real-time follow-up decisions. Using machine learning, PTF can identify intriguing transients in the big data and trigger follow-up on the same night, leading to tremendous scientific returns. The main challenge of machine learning is obtaining a good training data set. As survey designs differ substantially, machine learning needs to be fine-tuned to suit each individual survey. Therefore, human classifications are indispensable in the beginning of a survey. Luckily, citizen scientists provide great help in tackling such issues and delivering a good training data set in a timely manner. Some fear that machine learning might replace their jobs. Quite the contrary: Machine learning frees us from mundane routines, helping us concentrate on conducting research, writing papers, and presenting results.

PHOTO: MATT CHRISTENSON/BLM/2017

Accurate and quick species identification is important in ecology, ecosystem conservation, and public education. However, most people, including citizen scientists, do not have enough knowledge about species morphology to make a determination. With support from artificial intelligence and other data-collecting tools, people can use their computer or mobile device to accurately identify a species. Artificial intelligence systems can also serve as a real-time discovery tool for invasive or endangered species. Such applications are constrained by the quality of the input image or sound, and usually do not work when species need to be determined through genetic analysis.

Artificial intelligence in research Mrinal Musib, Feng Wang, Michael A. Tarselli, Rachel Yoho, Kun-Hsing Yu, Rigoberto Medina Andrés, Noah F. Greenwald, Xubin Pan, Chien-Hsiu Lee, Jian Zhang, Ken Dutton-Regester, Jake Wyatt Johnston and Icell Mahmoud Sharafeldin

Science 357 (6346), 28-30. DOI: 10.1126/science.357.6346.28

http://science.sciencemag.org/content/357/6346/28

PERMISSIONS

http://www.sciencemag.org/help/reprints-and-permissions

Use of this article is subject to the Terms of Service Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. 2017 © The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. The title Science is a registered trademark of AAAS.
