Available online at www.sciencedirect.com

Viral metagenomics: are we missing the giants?
S Halary, S Temmam, D Raoult and C Desnues

Amoeba-infecting giant viruses are recently discovered viruses that have been isolated from diverse environments all around the world. In parallel to isolation efforts, metagenomics confirmed their worldwide distribution from a broad range of environmental and host-associated samples, including humans, depicting them as a major component of eukaryotic viruses in nature and a possible resident of the human/animal virome whose role is still unclear. Nevertheless, metagenomics data about amoeba-infecting giant viruses still remain scarce, mainly because of methodological limitations. Efforts should be pursued both at the metagenomic sample preparation level and on in silico analyses to better understand their roles in the environment and in human/animal health and disease.

Address
Unité de Recherche sur les Maladies Infectieuses Tropicales Emergentes (URMITE) UM63, CNRS 7278, IRD 198, INSERM 1095, Aix-Marseille Université, Marseille, France

Corresponding author: Desnues, C ([email protected])

Current Opinion in Microbiology 2016, 31:34–43
This review comes from a themed issue on Special section: megaviromes
Edited by Didier Raoult and Jônatas Abrahão
For a complete overview see the Issue and the Editorial
Available online 3rd February 2016
http://dx.doi.org/10.1016/j.mib.2016.01.005
1369-5274/© 2016 Elsevier Ltd. All rights reserved.

Metagenomics, the culture-independent study of genomic sequences contained in a particular sample, has dramatically changed our knowledge of microbial communities since its beginnings in the late 1990s. Previously hampered by the fact that more than 99% of microorganisms evade cultivation by readily accessible procedures, the so-called ‘great plate count anomaly’, microbiologists now have access to a much more reliable and precise view of the functional genetic diversity, taxonomical composition and dynamics of microbial populations. The rise of next generation sequencing and the constant reduction of sequencing costs have made metagenomics a gold standard in microbial ecology. Adapted for the first time to viral communities in 2002 [1], metagenomics has also proven to be valuable for the study of viral diversity in environments, for the discovery of novel viral lineages, and as a diagnostic tool in plant, animal and human pathologies.

Giant viruses (GV) are large dsDNA viruses belonging to Nucleo-Cytoplasmic Large DNA viruses (NCLDV) and were recently grouped into the proposed Megavirales order [2]. Of the GV, amoeba-infecting giant viruses (AIGV) were first discovered in 2003 with the isolation of Acanthamoeba polyphaga Mimivirus in water samples from a hospital cooling tower. AIGVs’ size (>250 nm) and genomic content (>350 kb coding for up to 2556 ORFs) made scientists severely question the virus paradigm that has prevailed since the 18th century. So far, almost 150 AIGVs have been isolated in amoeba and characterized (Figure 1), forming six distinct, though phylogenetically related, families or genera: Mimiviridae, Marseilleviridae, Pandoraviridae, Pithovirus, Mollivirus and Faustovirus. Although suspected to play a key role in nature and in some animal and human pathologies, still little is known about these giants. In this review, we aim to emphasize metagenomics’ contributions to the emerging contemporary field of AIGV research, and discuss the methodological challenges that AIGV studies continue to raise in metagenomics.

Amoeba-infecting giant viruses are ubiquitous in the environment

It was not until 2008 and the development of high throughput techniques of culture and co-cultivation on new hosts that a second strain of AIGV (Acanthamoeba castellanii Mamavirus) was isolated from another cooling tower in Paris, France [3]. Since then, a growing number of AIGVs, displaying a wide range of particle morphologies and genome sizes, have been isolated (Figure 1, Table 1). AIGVs, like their amoebal hosts, have a worldwide distribution: they have been isolated in Europe (France, UK, Germany and Russia), Africa (Tunisia and Senegal), the Middle East (Lebanon and Saudi Arabia), North and South America (Brazil, Chile and USA) and Australia (Table 1). In addition, isolation of AIGVs was reported from various environmental samples, including cooling and fresh water, seawater and soil, where amoebas are natural inhabitants. For example, several new Mimiviridae strains and Marseilleviridae viruses, which were recently partitioned into three lineages, as well as the newly described Pandoraviruses and Pithovirus, were detected in environmental water (cooling towers, rivers and lakes, sewage, seawater, urban water, marine sediment and freshwater ponds) and extreme environments (Siberian permafrost). AIGVs have also been isolated from non-amoebal eukaryotic hosts including leeches, oysters and arthropods, and in human-associated samples (stools, blood and bronchoalveolar lavage) (Table 1). Whether AIGVs directly infect eukaryotic host cells or, on the contrary, amoebas associated with the eukaryotic organism, is currently unknown. However, it has been shown that AIGV members of the Mimiviridae and Marseilleviridae families, and also Phycodnaviridae, which are giant viruses infecting algae, can successfully infect mammalian cells [4,5,6]. These results reinforce the hypothesis that AIGVs and other NCLDVs such as Phycodnaviridae could represent opportunistic pathogens, mediated or not by amoebas (or by other unicellular protists).

Abundance and diversity of AIGV in metagenomic datasets

Given the difficulty of amoeba co-culture isolation techniques, early surveys of AIGV diversity took advantage of large metagenomic projects to get an idea of GV presence and abundance in environments (see Table 2 for examples). The search for AIGV sequences, such as DNA polymerase B or other NCLDV core genes, in the Global Ocean Sampling (GOS) expedition and Sargasso Sea metagenomic project revealed an unexpected diversity of ‘Mimivirus-related’ sequences in high numbers in many samples of surface water from East Pacific and West Atlantic oceans [7–10]. Later surveys of oceanographic expedition data (GOS and Tara Oceans) confirmed the ubiquity of Mimiviridae in seas and oceans around the world [11,12]. This global oceanic distribution likely indicates that the Mimiviridae family members have a broader host range than previously thought. Indeed, it has been

Figure 1 [plot not reproduced]
Y-axis: number of published items or cumulative number of isolated AIGV strains. X-axis: years (2002 to 2015).
Series: number of viral metagenomic studies by year; number of viral metagenomic studies citing AIGV by year; cumulative number of AIGV isolated from 2003 to 2015 (Mimiviridae, Marseilleviridae, Pandoraviridae, Faustovirus, Pithovirus, Mollivirus).
Culture milestones: 2003: isolation of Acanthamoeba polyphaga Mimivirus, the first AIGV; 2009: first Marseillevirus; 2013: first Pandoraviruses, largest viral genomes identified to date; 2014: first Pithovirus, largest virion size; 2015: first Faustoviruses and Mollivirus.

Number of viral metagenomic studies per year compared to those citing AIGV and cumulative number of AIGV isolates from 2002 to 2015. Number of studies of viral metagenomics (dashed line) per year from 2002 to November 2015 published in the PubMed database and searched by using (virus OR viral OR virome) AND (metagenom*) as keywords. Number of published metagenomic studies per year citing (NCLDV OR giant virus* OR megavirales OR mimivir*) in the full text and metagenom* in the Title/Abstract as keywords (solid line). The results have been manually curated to remove articles focusing on GVs other than AIGV and to include studies presenting results about AIGV in Figures/Tables or supplementary data. Cumulative number of isolated viruses from the Mimiviridae (red plot), Marseilleviridae (green plot), Pandoraviridae (pink plot), Faustovirus (purple plot), Pithovirus (light blue plot) and Mollivirus (dark blue plot) families and genera.

Table 1 Examples of amoeba-infecting giant viruses isolated from various environments and locations Viral group

Lineage

Isolate name

Location

Isolated from

Environmental sample Mimiviridae

A

B

C

Unassigned Marseilleviridae

A

B C Faustovirus

Mimivirus Samba virus Kroon virus OY RN27 BZ16 Mamavirus Terra2 Lentillevirus

UK Brazil Brazil Brazil Brazil France France France

Cooling tower River Urban lake — Sewage Cooling tower Soil —

Hirudovirus Moumouvirus Istres virus BZ49 Goulette virus Shan virus

France France France Brazil Tunisia Tunisia

— Cooling tower Soil Sewage Sea —

LBA111

Tunisia



Boug1

Tunisia

BZ23 Terra1 Courdo11 Megavirus chiliensis Cafeteria roenbergensis virus

Brazil France France Chile USA

Chott (hypersaline soil) Sewage Soil Rivers and lakes Sea Sea

Marseillevirus

France

Cooling tower

Cannes8 virus Melbourne virus Senegalvirus BZ1 Lausannevirus Tunisvirus Insectomime virus

France Australia Senegal Brazil France Tunisia Tunisia



Unassigned

Host-associated sample — — — Oysters — — — Keratitis-associated contact lens liquid Hirudo medicinalis leech — — — — Pneumonia-associated human stool Pneumonia-associated human BAL —

[1,2]* [3]* [4]* [5]* [6]* [7]* [8]* [9]* [10]* [11]* [12]* [6]* [12]* [13]* [14]* [12]*

— — — — —

[6]* [8]* [15]* [16]* [17,18]* [19–21]*, [5]

Cooling tower Freshwater pond — Sewage River Fountain —

human blood, human stool, adenitis-associated human lymph node — — Human stool — — — Eristalis tenax larvae

Lebanon Senegal France

Sewage Sewage Sewage

— Culicoides sp. —

[27]* [27], [28]* [27]*

Pandoravirus salinus

Chile



[29]*

Pandoravirus dulcis BZ81

Australia Brazil

— —

[29]* [6]*

Pandoravirus inopinatum

Germany

Superficial marine sediment layer Freshwater pond Organic-enriched lagoon —

Keratitis-associated contact lens liquid

[30]*

Unassigned Pandoraviridae

References (* in Supp. Mat. 1)

[22]* [23]* [25]* [6]* [24]* [25]* [26]*

Pithovirus

Unassigned

Pithovirus sibericum

Russia

Permafrost



[31]*

Mollivirus

Unassigned

Mollivirus sibericum

Russia

Permafrost



[18]

shown that Mimivirus relatives are probably able to infect eukaryotic unicellular algae from very distant phyla (Haptophyta, Chlorophyta, Dinoflagellata, Rhizaria) [8,13,14]. These findings reinforced the notion of Mimiviridae as a major component of the marine viral world, one able to play a role similar to that of bacteriophages at the eukaryotic level by regulating planktonic populations and releasing nutrients in the environment, with similar impacts on global geochemical cycles [15]. Metagenomic surveys on GV are much scarcer for continental environments, but again tend to confirm the

Table 2 Examples of amoeba-infecting giant viruses detected by metagenomics

Environment | Sample type | Location | Pre-treatment | Sequencing technology | Raw reads | Abundance of giant virus reads | Viral groups | References (* in Supp. Mat. 1)
Freshwater environments | Temperate lakes | Canada | High-speed clarification, PEG precipitation, filtration (0.22 µm), ultracentrifugation | Illumina HiSeq | 43 495 558 | 0.01–0.23% | Marseilleviridae | [32]*
 | Seawater | Sargasso sea | Filtration (0.8 µm), TFF (0.1 µm) | Sanger | ND | ND | Mimiviridae | [9]
 | Seawater | Northwest Atlantic to Eastern Tropical Pacific | Filtration (20 µm, 3 µm, 0.8 µm, 0.1 µm) | Sanger | 1947 (polB fragments) | 5.90% | Mimiviridae | [7]
 | Seawater | Atlantic Ocean, Mediterranean sea, Red sea, Arabian sea, Indian Ocean | Filtration (200 µm, 20 µm, 1.6 µm, 0.22 µm), TFF | Roche 454 | 8 162 564 | 0.2–5.6% | Unassigned Mimiviridae (CroV-like viruses) | [12]
 | Seawater | Patagonian Shelf | ND | Roche 454 | 600 427 | 0.005–0.13% | Mimiviridae, Marseilleviridae + CroV virophage | [37]
 | Seawater | Indian Ocean | Filtration (20 µm, 3 µm, 0.8 µm, 0.1 µm), TFF, nucleases digestion, sucrose ultracentrifugation | Roche 454, Sanger | 3 637 921 | 0.3–1.4% | Mimiviridae | [11]
Extreme environments | Desert freshwater pond | Mauritania | Filtration (0.45 µm), PEG precipitation, CsCl ultracentrifugation, DNAse digestion | Roche 454 | 324 603 | 0.84–3.18% | Mimiviridae | [16]
Extreme environments | Hyperarid desert soil | Antarctica | Low-speed clarification, filtration (0.22 µm), ultracentrifugation, DNAse/RNAse digestion | Illumina MiSeq | 5 394 546 | 0.002% | Mimiviridae + Sputnik virophage | [19]
Corals | Porites astreoides | Panama/Caribbean sea | Low-speed clarification, filtration (8 µm), high-speed centrifugation, Percoll gradient | Roche 454 | 316 279 | 0.03% | Mimiviridae | [20]
Corals | Montastraea annularis | US Virgin Islands | Corals: filtration (0.22 µm), CsCl ultracentrifugation / Seawater: TFF (100 kDa), filtration (0.22 µm) | Roche 454 | 1 044 444 | ND | Mimiviridae | [33]*
Corals | Montastraea cavernosa and Symbiodinium sp. | USA (coral nursery) | Filtration (1 µm), CsCl ultracentrifugation, filtration (0.22 µm), DNAse/RNAse digestion | Roche 454 | 175 044 | 0.16% | Mimiviridae | [13]
Human | Coprolite | Belgium | Low-speed clarification, filtration (0.8 µm, 0.45 µm, 0.22 µm), CsCl ultracentrifugation, DNAse digestion | Roche 454 | 30 654 | 0.06% | Mimiviridae | [34]*
Human | Nasopharyngeal aspirates | Sweden | Filtration (0.45 µm, 0.22 µm), ultracentrifugation, DNAse digestion | Roche 454 | 286 696 | 0.0007% | Mimiviridae | [23]
Human | Plasma | Canada | Low-speed clarification, filtration (0.2 µm), TFF, DNAse/RNAse digestion | Illumina Genome Analyzer II | 20 000 000 (average) | 0.0005% (average) | Mimiviridae | [24]
Human | Blood | France | Low-speed clarification, filtration (0.45 µm), CsCl ultracentrifugation, DNAse digestion | Roche 454 | 20 238 | 1.65% | Marseilleviridae | [5]
Other mammals | Dromedaries feces | Dubai | High-speed clarification, filtration (0.45 µm), DNAse/RNAse digestion | Illumina HiSeq | 29 247 514 | ND | Mimiviridae | [35]*
Other mammals | Bats brain, liver, lung | France | ND | Illumina HiSeq | 70 652 939 | 2–3% | Mimiviridae | [36]*
Other mammals | Bats feces | China | High-speed clarification, filtration (0.45 µm, 0.22 µm), ultracentrifugation, DNAse/RNAse digestion | Solexa | 8 746 417 | 0.00008% | Mimiviridae | [37]*
Other mammals | Rodents feces | USA | High-speed clarification, filtration (0.45 µm), DNAse/RNAse digestion | Roche 454 | 1 441 930 | 0.04% | Mimiviridae | [38]*
Arthropods | Rhipicephalus sp. ticks | China | High-speed clarification, filtration (0.45 µm) | Ion Torrent | 10 619 672 | 0.009% | Mimiviridae | [39]*
Arthropods | Pediculus humanus body lice | France | High-speed clarification, filtration (0.45 µm), DNAse/RNAse digestion, sucrose ultracentrifugation | Illumina MiSeq | 9 501 813 | 0.0001% | Mimiviridae | [40]*
Arthropods | Culicoides sp. biting midges | Senegal | High-speed clarification, filtration (0.45 µm), DNAse/RNAse digestion, sucrose ultracentrifugation | Illumina MiSeq | 5 961 182 | 0.48% | Mimiviridae, Pandoraviridae, Faustovirus | [28]*

CroV: Cafeteria roenbergensis virus; CsCl: Cesium chloride; PEG: Polyethylene glycol; polB: B-family DNA polymerase genes; TFF: Tangential flow filtration
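The abundance column of Table 2 can be related back to absolute read numbers with simple arithmetic. The short sketch below is only illustrative: it assumes that the reported percentage is the number of giant virus reads divided by the number of raw reads, which the table does not state explicitly, and the function names are hypothetical. The two input pairs (5 961 182 reads with 0.48%, and 20 238 reads with 1.65%) are taken from Table 2.

```python
def giant_virus_read_count(raw_reads: int, abundance_percent: float) -> int:
    """Approximate number of giant virus reads implied by a relative abundance.

    Assumption: abundance_percent = 100 * giant_virus_reads / raw_reads.
    """
    return round(raw_reads * abundance_percent / 100)


def relative_abundance(giant_virus_reads: int, raw_reads: int) -> float:
    """Relative abundance (in percent) of giant virus reads in a dataset."""
    return 100 * giant_virus_reads / raw_reads


# Values copied from Table 2: 5 961 182 raw reads at 0.48%,
# and 20 238 raw reads at 1.65%.
reads_at_048 = giant_virus_read_count(5_961_182, 0.48)
reads_at_165 = giant_virus_read_count(20_238, 1.65)

print(reads_at_048)  # about 28.6 thousand reads
print(reads_at_165)  # a few hundred reads
```

Under this assumption, a small dataset with a high percentage can contain far fewer giant virus reads than a deep dataset with a low percentage, which is worth keeping in mind when comparing rows sequenced at very different depths.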

Figure 2 [network panels and table graphic not reproduced]
Left panel, example networks: (a) DNA-directed RNA polymerase II subunit RPB2; (b) DNA-directed RNA polymerase II subunit RPB1; (c) ERV-family thiol oxidoreductase; (d) Glycosyltransferase; (e) EGF-like domain-containing protein.
Right panel, columns: Gene family ID; Description; representatives in Mimiviridae (out of 16), Pandoraviridae (out of 3), Marseilleviridae (out of 4), Pithovirus (out of 1), Mollivirus (out of 1) and Faustovirus (out of 3).
Gene family IDs listed: 4, 22, 26, 35, 39, 44, 50, 51, 52, 53, 55, 57, 59, 64, 75, 77, 82, 83, 85, 87, 90, 93, 94, 99, 108, 119, 124, 126.
Descriptions listed: Serine/Threonine protein kinase; VV A32-like virion packaging ATPase; DNA-directed RNA polymerase II subunit RPB1; VLTF3-like transcription factor; ERV-family thiol oxidoreductase; Glycosyltransferase; Bifunctional dihydrofolate reductase-thymidylate synthase; Hypothetical protein glt 00704 [Moumouvirus goulette]; Patatin-like phospholipase; EGF-like domain-containing protein; Hypothetical protein glt 00124 [Moumouvirus goulette]; VV A18 helicase; Thioredoxin; DNA-directed RNA polymerase II subunit RPB2; Ubiquitin; NAD-dependent amine oxidase; Rossmann-fold nucleotide-binding protein/hypothetical protein; Ribonucleotide-diphosphate reductase large chain; Metal dependent phosphohydrolase; Ribonucleoside-diphosphate reductase small chain; Alkylated DNA repair protein; Uracil-DNA glycosylase family 1; Putative endonuclease of the XPG family; DNA topoisomerase II; Deoxyuridine 5'-triphosphate nucleotidohydrolase; Endonuclease IV; Hypothetical protein CE11 00990 [Megavirus courdo11]; Hypothetical protein ps 2217 [Pandoravirus salinus].

AIGV gene clusters conserved in 4 AIGV families. These clusters were obtained using a gene similarity network approach. All protein sequences of 23 AIGV genomes were compared all against all using EGN [46] with the following parameters: BLASTP alignment, E-value ≤ 1e-03, identity ≥ 20% of the smallest sequence. The resulting networks were further analyzed in Cytoscape to extract cliques (i.e. densely connected network regions whose nodes most likely represent orthologs or ‘recent paralogs’) using MCL clustering [47,48]. Left panel: examples of 5 networks, where a node represents a gene and an edge a significant similarity between 2 nodes. These clusters constitute good candidates for AIGV marker genes. Right panel: list of ortholog clusters (fasta files of each cluster are provided in the Supplementary Material 2), functional annotation and numbers of representatives of each AIGV family.
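The idea behind a gene similarity network can be illustrated with a minimal sketch. It is not the EGN/Cytoscape/MCL workflow described in the legend: as a simplification, clusters are extracted here as plain connected components rather than by MCL clustering, and the gene names, E-values and identities are hypothetical. Only the two thresholds (E-value and percent identity) echo the parameters quoted above.

```python
from collections import defaultdict


def build_network(hits, max_evalue=1e-3, min_identity=20.0):
    """Build an undirected graph from pairwise similarity hits.

    hits: iterable of (gene_a, gene_b, evalue, percent_identity).
    An edge is kept only if the hit passes both thresholds.
    """
    graph = defaultdict(set)
    for gene_a, gene_b, evalue, identity in hits:
        if gene_a != gene_b and evalue <= max_evalue and identity >= min_identity:
            graph[gene_a].add(gene_b)
            graph[gene_b].add(gene_a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components of the graph as a list of gene sets.

    Simplification: the legend above reports MCL clustering; connected
    components are used here only to keep the sketch short.
    """
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                component.add(node)
                stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


# Hypothetical hits; the prefix before "_" stands for the viral family.
hits = [
    ("mimi_rpb2", "pandora_rpb2", 1e-50, 35.0),
    ("pandora_rpb2", "pitho_rpb2", 1e-30, 28.0),
    ("mimi_rpb2", "marseille_rpb2", 1e-40, 31.0),
    ("mimi_orfX", "pandora_orfY", 0.5, 45.0),   # rejected: E-value too high
    ("fausto_a", "molli_b", 1e-10, 15.0),       # rejected: identity too low
]

clusters = connected_components(build_network(hits))
families = {gene.split("_")[0] for gene in clusters[0]}
print(len(clusters), sorted(families))
```

In this toy input, the single surviving cluster spans four families, which is the kind of cluster the legend describes as a candidate marker gene family.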

worldwide distribution of AIGVs, since they have been found in freshwater ponds in the Sahara as well as in summer Antarctic lakes (Mimiviridae) [16,17] and in Siberian permafrost (Pithovirus, Mollivirus, Mimiviridae) [18]. In addition, Antarctic hyperarid soils harbor a viral community structure comparable to that of marine ecosystems, with Mimiviridae and Phycodnaviridae representing the major component of the eukaryotic viral fraction [19], again highlighting their probable major role in nutrient and protist cycling in the environment.

In animals, the presence of AIGV sequences (Mimiviridae-like) was reported in metagenomes associated with the holobiont of several coral species (Porites sp., Diploria strigosa) [13,20,21]. Upcoming studies will probably clarify whether these viruses infect the coral animal cells and/or its dinoflagellate symbionts and whether they play a role in bleaching events. AIGV sequences were also retrieved from non-human mammals such as bats and rodents, and from arthropods. In humans, AIGV sequences related to Marseilleviridae and Mimiviridae have been detected by metagenomics in blood, stools and respiratory tract mucus, both in pathological and non-pathological conditions [5,22–25]. Furthermore, members of the Marseilleviridae and Mimiviridae families were found associated with cases of idiopathic adenitis in early childhood and pneumonia, respectively [5,26–28]. Although outside the scope of this review, which focuses only on AIGV, sequences from another giant virus, Chlorovirus ATCV1 belonging to the Phycodnaviridae family, were detected in human oropharyngeal metagenomes and correlated with a reduction of cognitive function [29]. In these studies, metagenomics proved to be, if not a bona fide diagnostic technique, a useful tool for novel viral strain characterization, and a first step before isolation attempts. All these results indicate that AIGV surveys in humans and other animals urgently require further investigation to define the contribution of AIGVs to the human/animal virome and their potential effects on health and disease.

Lastly, ecological information about giant virus distribution may also be indirectly inferred from the study of their viral parasites, the virophages. Sequences of virophages (usually the major capsid gene) have been found in metagenomic data from several environmental samples (e.g. rumen gut [30], Antarctic [31] and Yellowstone lakes [32,33]), leading to further efforts to assemble their genomes from these datasets. These studies expanded the known genomic and phylogenetic diversity of the virophages (12 new genotypes), and the co-occurrence of different genotypes in several habitats with different physico-chemical conditions (mesophilic or thermophilic) suggests that as yet uncultured, and potentially numerous, giant viruses inhabit such environments.

Current limitations of metagenomics for AIGV studies

Metagenomic studies have broadened our knowledge of AIGV diversity and ecology, and proven valuable in complementing amoeba co-culture isolation approaches, to form the image of globally distributed viruses in numerous and highly divergent environments displaying a broader host range than previously suspected. However, for many reasons, doubts remain about the completeness of our current view of the AIGV world.

Because AIGVs represent a fairly recent discovery in virology (only 12 years since the first Mimivirus description) and due to the limited number of research teams in the field, metagenomic studies dedicated to these viruses are very scarce (Figure 1, Table 2). In viral metagenomics, descriptions of sequencing data rarely focus on AIGV sequences, which, if not completely excluded from the analyses, are often roughly classified as ‘other viral sequences’. This contributes to our incomplete knowledge of the real ecological/etiological impact and abundance of GVs in the environment.

A fundamental bias in metagenomics is directly related to the protocols used for sample preparation. Indeed, most viral metagenomic studies rely on viral particle purification procedures, including a filtration step with 0.2 or 0.45 µm pore diameter filters. Thus, viruses with capsids larger than the selected porosity are systematically excluded from the preparation [34]. The low number of GV reads that is sometimes reported (e.g. [22,23]) in viral metagenomes is more likely to reflect a ‘contamination’ of the ‘nano-viral’ fraction by GV nucleic acids contained in the cellular fraction than a genuinely low GV relative abundance. An alternative is to search for GV sequences in the microbial fraction, but this may be hampered by the comparatively large abundance of non-viral reads (e.g. bacterial). The mosaic nature of GV genomes, along with the large number of ORFans that characterize them (reviewed in [35]), may also prevent confident annotation of GVs in the microbial fraction. Future metagenomic work focused on GVs should improve viral enrichment using protocols that take into account large viral particles or a microbial-specific depletion step. Potentially promising alternatives to filtration include the use of bacteriolytic compounds/alcohol treatment followed by nuclease digestion [36], bacterial DNA depletion using an ad hoc bacterial DNA library, or dsDNA virome enrichment by flow cytometry [37].

At the bioinformatic level, several pipelines are used to retrieve AIGV sequences in metagenomes, based on two interconnected steps. The first one consists of finding AIGV candidate sequences in metagenomic databases. This step relies on similarity searches using local alignment algorithms (usually BLAST [38]). This approach has the advantage of searching custom databases that include every known AIGV sequence, that is, AIGV core genes as well as AIGV family-specific genes and genuine singletons (genes specific to one or a few known strains). However, these approaches are time-consuming and often miss distant homologs [39].
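The candidate-finding step can be sketched as follows, assuming that metagenomic reads were already compared to a custom AIGV database with BLAST and that the results were saved in the standard 12-column tabular format (`-outfmt 6`). The read and reference identifiers and the numeric values below are hypothetical, and the E-value cutoff is an arbitrary illustrative choice, not a threshold recommended by the studies cited above.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def candidate_reads(blast6_text, max_evalue=1e-3, min_identity=0.0):
    """Return the IDs of reads with at least one hit passing the thresholds."""
    reader = csv.DictReader(
        io.StringIO(blast6_text), fieldnames=BLAST6_COLUMNS, delimiter="\t"
    )
    return {
        row["qseqid"]
        for row in reader
        if float(row["evalue"]) <= max_evalue
        and float(row["pident"]) >= min_identity
    }


# Hypothetical hits against a custom AIGV reference database.
example_rows = [
    ["read1", "ref_polB", "62.5", "120", "45", "0", "1", "120", "300", "419", "1e-40", "150"],
    ["read2", "ref_mcp", "31.0", "40", "27", "1", "5", "44", "10", "49", "0.8", "22.3"],
    ["read3", "ref_polB", "48.0", "90", "46", "1", "1", "90", "200", "289", "2e-12", "65.1"],
]
example_text = "\n".join("\t".join(fields) for fields in example_rows)

candidates = candidate_reads(example_text)
print(sorted(candidates))
```

Reads retained at this stage are only candidates: as discussed in the text, they still need to be validated before being assigned to AIGVs.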
On the other hand, gene families, including ortholog families, common to all (or almost all) AIGVs can be used to design profile hidden Markov models (HMM) from multiple sequence alignments [7,12,40,41]. This allows for a rapid search of putatively more divergent homologs in databases, with fewer false positive results [42,43]. In Figure 2, we present an updated list of the AIGV gene families, including the most recently sequenced genomes (Pandoraviridae, Pithovirus, Mollivirus and Faustovirus), that could be used as baits for AIGV HMM-based searches. The second step corresponds to validation of the AIGV candidate sequences. In general, the AIGV assignment of a sequence is deduced from its best hit against an exhaustive reference database, together with the results obtained using a Lowest Common Ancestor algorithm [12,44]. Again, this approach may be limited by the presence of potentially mis-annotated AIGV sequences in the databases [45] and by the large number of ORFans and genes with homologs in cellular organisms in the AIGV genomes. An alternative involves mapping the candidate to a precomputed phylogenetic tree [7,12,41].
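The Lowest Common Ancestor idea used in the validation step can be sketched in a few lines. This is a generic illustration, not the implementation used in the cited studies: each significant hit is represented by its root-to-leaf lineage, and the read is assigned to the deepest rank shared by all hits. The lineages below are simplified and illustrative.

```python
def lowest_common_ancestor(lineages):
    """Return the longest prefix shared by all root-to-leaf lineages.

    lineages: list of lists of taxon names, ordered from root to leaf.
    An empty result means the hits share no common rank.
    """
    if not lineages:
        return []
    shared = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Illustrative lineages for the best hits of one candidate read.
viral_hits = [
    ["Viruses", "Megavirales", "Mimiviridae", "Mimivirus"],
    ["Viruses", "Megavirales", "Mimiviridae", "Megavirus"],
    ["Viruses", "Megavirales", "Mimiviridae", "Moumouvirus"],
]
print(lowest_common_ancestor(viral_hits))

# Adding a hit from a cellular organism removes any shared rank.
mixed_hits = viral_hits + [["cellular organisms", "Bacteria", "Proteobacteria"]]
print(lowest_common_ancestor(mixed_hits))
```

The second call shows why genes with homologs in cellular organisms are problematic: a single cellular hit among the best matches is enough to prevent a confident viral assignment.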


These approaches, however, highlight a further limitation of giant virus searches, which is a limitation of metagenomics itself: a reliable sequence annotation depends on similarity to a previously identified reference sequence, which is still lacking for most microorganisms and viruses on Earth. Consequently, metagenomics results remain ultimately dependent on successful isolation and genome sequencing efforts.

Popgeorgiev N, Boyer M, Fancello L, Monteil S, Robert C, Rivet R, Nappez C, Azza S, Chiaroni J, Raoult D et al.: Marseillevirus-like virus recovered from blood donated by asymptomatic humans. J Infect Dis 2013, 208:1042-1050. The authors detected Marseillevirus sequences in human blood samples from healthy donors by metagenomics. Presence of viral particles and viral nucleic acids in blood was confirmed by TEM and by fluorescent in situ hybridization, respectively. Authors also detected antibodies against Marseillevirus in blood and provided the proof that Marseillevirus can infect T-lymphocytes in vitro.

5. 

6.

Petro TM, Agarkova IV, Zhou Y, Yolken RH, Van Etten JL, Dunigan DD: Response of mammalian macrophages to challenge with the chlorovirus acanthocystis turfacea chlorella virus 1. J Virol 2015, 89:12096-12107.

7.

Monier A, Claverie JM, Ogata H: Taxonomic distribution of large DNA viruses in the sea. Genome Biol 2008, 9:R106.

8.

Monier A, Larsen JB, Sandaa RA, Bratbak G, Claverie JM, Ogata H: Marine mimivirus relatives are probably large algal viruses. Virol J 2008, 5:12.

9.

Ghedin E, Claverie JM: Mimivirus relatives in the Sargasso sea. Virol J 2005, 2:62.

Conclusion By overcoming cultivation limitations, metagenomics has proven to be a reliable approach accelerating our knowledge in microbial ecology and clinical studies. As a tool for dedicated amoeba-infecting giant virus studies, metagenomics allow us, firstly, to search for the presence of giant viruses in many diverse environments worldwide, including the human body, secondly, to better characterize their genomic diversity, thirdly, to reveal a potential role in biogeochemical cycles in oceans, and finally to broaden their host-range to human/animals and/or their eukaryotic symbionts. Although the effectiveness of metagenomics is closely related to the knowledge contributed by more traditional isolation techniques, the contrary is also true, as the characterization of unknown giant viruses and/or their enclosed parasites (virophages) highlights environments in which to focus isolation efforts.

Acknowledgements The authors would like to thank Dr. Nicolas Rascovan and Dr. Raja Duraisamy, for their critical comments on earlier versions of the manuscript and Jonathan Verneau for sharing his AIGV sequence datasets. This work was conducted under the ANR-13-JSV6-0004 awarded to Dr. Christelle Desnues.

Appendix A. Supplementary data Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j. mib.2016.01.005.

References and recommended reading Papers of particular interest, published within the period of review, have been highlighted as:  of special interest  of outstanding interest 1.

2.

3.

4.

Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, Azam F, Rohwer F: Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci U S A 2002, 99:14250-14255. Colson P, De Lamballerie X, Yutin N, Asgari S, Bigot Y, Bideshi D, Cheng X, Federici B, Van Etten J, Koonin E et al.: ‘Megavirales’, a proposed new order for eukaryotic nucleocytoplasmic large DNA viruses. Arch Virol 2013, 158:2517-2521. La Scola B, Desnues C, Pagnier I, Robert C, Barrassi L, Fournous G, Merchat M, Suzan-Monti M, Forterre P, Koonin E et al.: The virophage as a unique parasite of the giant mimivirus. Nature 2008, 455:100-104. Ghigo E, Kartenbeck J, Lien P, Pelkmans L, Capo C, Mege JL, Raoult D: Ameobal pathogen mimivirus infects macrophages through phagocytosis. PLoS Pathog 2008, 4:e1000087.

Current Opinion in Microbiology 2016, 31:34–43

10. Kristensen DM, Mushegian AR, Dolja VV, Koonin EV: New dimensions of the virus world discovered through metagenomics. Trends Microbiol 2010, 18:11-19.

11. Williamson SJ, Allen LZ, Lorenzi HA, Fadrosh DW, Brami D, Thiagarajan M, McCrow JP, Tovchigrechko A, Yooseph S, Venter JC: Metagenomic exploration of viruses throughout the Indian Ocean. PLoS ONE 2012, 7:e42047.

12. Hingamp P, Grimsley N, Acinas SG, Clerissi C, Subirana L, Poulain J, Ferrera I, Sarmento H, Villar E, Lima-Mendez G et al.: Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes. ISME J 2013, 7:1678-1695.
This article presents a metagenomic study of 17 ocean samples from the Tara Oceans expedition, with a focus on NCLDV, using 16 NCLDV core genes as baits and phylogenetic mapping. The authors confirmed that members of the Mimiviridae and Phycodnaviridae families were the most abundant NCLDV in sea water. They also proposed marine oomycetes as potential hosts, based on taxonomic association analyses of their samples and on the detection of horizontal gene transfer events between GV and oomycetes in public genomic databases.

13. Correa AM, Welsh RM, Vega Thurber RL: Unique nucleocytoplasmic dsDNA and +ssRNA viruses are associated with the dinoflagellate endosymbionts of corals. ISME J 2013, 7:13-27.
This study characterized the viromes of the scleractinian coral Montastraea cavernosa and its dinoflagellate symbiont Symbiodinium sp. through genomic analyses of cDNA and EST libraries. NCLDVs appeared as a major component of the Symbiodinium virome.

14. Blanc G, Gallot-Lavallee L, Maumus F: Provirophages in the Bigelowiella genome bear testimony to past encounters with giant viruses. Proc Natl Acad Sci U S A 2015, 112:E5318-E5326.

15. Suttle CA: Marine viruses – major players in the global ecosystem. Nat Rev Microbiol 2007, 5:801-812.

16. Fancello L, Trape S, Robert C, Boyer M, Popgeorgiev N, Raoult D, Desnues C: Viruses in the desert: a metagenomic survey of viral communities in four perennial ponds of the Mauritanian Sahara. ISME J 2013, 7:359-369.

17. Lopez-Bueno A, Tamames J, Velazquez D, Moya A, Quesada A, Alcami A: High diversity of the viral community from an Antarctic lake. Science 2009, 326:858-861.

18. Legendre M, Lartigue A, Bertaux L, Jeudy S, Bartoli J, Lescot M, Alempic JM, Ramus C, Bruley C, Labadie K et al.: In-depth study of Mollivirus sibericum, a new 30,000-y-old giant virus infecting Acanthamoeba. Proc Natl Acad Sci U S A 2015, 112:E5327-E5335.

19. Zablocki O, van Zyl L, Adriaenssens EM, Rubagotti E, Tuffin M, Cary SC, Cowan D: High-level diversity of tailed phages, eukaryote-associated viruses, and virophage-like elements in the metaviromes of antarctic soils. Appl Environ Microbiol 2014, 80:6888-6897.


20. Wegley L, Edwards R, Rodriguez-Brito B, Liu H, Rohwer F: Metagenomic analysis of the microbial community associated with the coral Porites astreoides. Environ Microbiol 2007, 9:2707-2719.

21. Vega Thurber RL, Barott KL, Hall D, Liu H, Rodriguez-Mueller B, Desnues C, Edwards RA, Haynes M, Angly FE, Wegley L et al.: Metagenomic analysis indicates that stressors induce production of herpes-like viruses in the coral Porites compressa. Proc Natl Acad Sci U S A 2008, 105:18413-18418.

22. Kim MS, Park EJ, Roh SW, Bae JW: Diversity and abundance of single-stranded DNA viruses in human feces. Appl Environ Microbiol 2011, 77:8062-8070.

23. Lysholm F, Wetterbom A, Lindau C, Darban H, Bjerkner A, Fahlander K, Lindberg AM, Persson B, Allander T, Andersson B: Characterization of the viral microbiome in patients with severe lower respiratory tract infections, using metagenomic sequencing. PLoS ONE 2012, 7:e30875.

24. Law J, Jovel J, Patterson J, Ford G, O’Keefe S, Wang W, Meng B, Song D, Zhang Y, Tian Z et al.: Identification of hepatotropic viruses from plasma using deep sequencing: a next generation diagnostic tool. PLoS ONE 2013, 8:e60595.

25. Colson P, Fancello L, Gimenez G, Armougom F, Desnues C, Fournous G, Yoosuf N, Million M, La Scola B, Raoult D: Evidence of the megavirome in humans. J Clin Virol 2013, 57:191-200.

26. Saadi H, Pagnier I, Colson P, Cherif JK, Beji M, Boughalmi M, Azza S, Armstrong N, Robert C, Fournous G et al.: First isolation of Mimivirus in a patient with pneumonia. Clin Infect Dis 2013, 57:e127-e134.

27. La Scola B, Marrie TJ, Auffray JP, Raoult D: Mimivirus in pneumonia patients. Emerg Infect Dis 2005, 11:449-452.

28. Raoult D, Renesto P, Brouqui P: Laboratory infection of a technician by mimivirus. Ann Intern Med 2006, 144:702-703.

29. Yolken RH, Jones-Brando L, Dunigan DD, Kannan G, Dickerson F, Severance E, Sabunciyan S, Talbot CC Jr, Prandovszky E, Gurnon JR et al.: Chlorovirus ATCV-1 is part of the human oropharyngeal virome and is associated with changes in cognitive functions in humans and mice. Proc Natl Acad Sci U S A 2014, 111:16106-16111.

30. Yutin N, Kapitonov VV, Koonin EV: A new family of hybrid virophages from an animal gut metagenome. Biol Direct 2015, 10:19.

31. Yau S, Lauro FM, DeMaere MZ, Brown MV, Thomas T, Raftery MJ, Andrews-Pfannkoch C, Lewis M, Hoffman JM, Gibson JA et al.: Virophage control of antarctic algal host–virus dynamics. Proc Natl Acad Sci U S A 2011, 108:6163-6168.

32. Zhou J, Sun D, Childers A, McDermott TR, Wang Y, Liles MR: Three novel virophage genomes discovered from Yellowstone Lake metagenomes. J Virol 2015, 89:1278-1285.

33. Zhou J, Zhang W, Yan S, Xiao J, Zhang Y, Li B, Pan Y, Wang Y: Diversity of virophages in metagenomic data sets. J Virol 2013, 87:4225-4236.

34. Conceicao-Neto N, Zeller M, Lefrere H, De Bruyn P, Beller L, Deboutte W, Yinda CK, Lavigne R, Maes P, Van Ranst M et al.: Modular approach to customise sample preparation procedures for viral metagenomics: a reproducible protocol for virome analysis. Sci Rep 2015, 5:16532.

35. Abergel C, Legendre M, Claverie JM: The rapidly expanding universe of giant viruses: Mimivirus, Pandoravirus, Pithovirus and Mollivirus. FEMS Microbiol Rev 2015, 39:779-796.
An exhaustive review of what is currently known about the morphology, replication cycle, genomics and evolution of four of the largest AIGVs: Mimivirus, Pandoravirus, Pithovirus and Mollivirus.

36. Slimani M, Pagnier I, Boughalmi M, Raoult D, La Scola B: Alcohol disinfection procedure for isolating giant viruses from contaminated samples. Intervirology 2013, 56:434-440.

37. Martinez Martinez J, Swan BK, Wilson WH: Marine viruses, a genetic reservoir revealed by targeted viromics. ISME J 2014, 8:1079-1088.
A virus-targeted metagenomic study that circumvents the filtration limitation on a natural sample by using fluorescent flow cytometry. Read libraries were conspicuously enriched in viral sequences compared with those obtained by traditional approaches. This protocol seems particularly suitable for NCLDV studies.

38. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215:403-410.

39. Grundy WN: Homology detection via family pairwise search. J Comput Biol 1998, 5:479-491.

40. Yutin N, Wolf YI, Raoult D, Koonin EV: Eukaryotic large nucleocytoplasmic DNA viruses: clusters of orthologous genes and reconstruction of viral genome evolution. Virol J 2009, 6:223.

41. Mozar M, Claverie JM: Expanding the Mimiviridae family using asparagine synthase as a sequence bait. Virology 2014, 466–467:112-122.

42. Eddy SR: Profile hidden Markov models. Bioinformatics 1998, 14:755-763.

43. Remmert M, Biegert A, Hauser A, Soding J: HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 2012, 9:173-175.

44. Huson DH, Auch AF, Qi J, Schuster SC: MEGAN analysis of metagenomic data. Genome Res 2007, 17:377-386.

45. Sharma V, Colson P, Giorgi R, Pontarotti P, Raoult D: DNA-dependent RNA polymerase detects hidden giant viruses in published databanks. Genome Biol Evol 2014, 6:1603-1610.

46. Halary S, McInerney JO, Lopez P, Bapteste E: EGN: a wizard for construction of gene and genome similarity networks. BMC Evol Biol 2013, 13:146.

47. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003, 13:2498-2504.

48. Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE: clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 2011, 12:436.
