JVI Accepted Manuscript Posted Online 17 February 2016 J. Virol. doi:10.1128/JVI.02647-15 Copyright © 2016 Rodriguez-Roche et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.
Title: Increasing clinical severity during a dengue virus type 3 Cuban epidemic: deep sequencing of evolving viral populations
Running title: Evolving DENV-3 populations, Cuba, 2001-2002
Rosmari Rodriguez-Roche1*, Hervé Blanc2, Antonio V. Bordería2,3, Gisell Díaz1, Rasmus Henningsson2,3,4, Daniel Gonzalez5, Emidalys Santana1, Mayling Alvarez1, Osvaldo Castro5, Magnus Fontes3, Marco Vignuzzi2 and Maria G. Guzman1
1 Virology Department, Pedro Kouri Institute of Tropical Medicine, PAHO/WHO Collaborating Center for the Study of Dengue and its Vector, Autopista Novia del Mediodia Km 6 1/2, La Lisa, PO Box 601 Marianao 13, La Habana, Cuba
2 Institut Pasteur, Viral Populations and Pathogenesis Unit, CNRS UMR 3569, 28 rue du Dr Roux, 75724 Paris cedex 15, France
3 Institut Pasteur, International Group for Data Analysis, 28 rue du Dr. Roux, 75724 Paris cedex 15, France
4 Lund University, Centre for Mathematical Sciences, Box 118, 221 00 Lund, Sweden
5 Department of Medicine, Pedro Kouri Institute of Tropical Medicine, Autopista Novia del Mediodía Km 6 1/2, La Lisa, PO Box 601 Marianao 13, La Habana, Cuba
*Corresponding Author:
Prof. Rosmari Rodriguez-Roche, PhD, Department of Virology, PAHO/WHO Collaborating Center for the Study of Dengue and its Vector, Pedro Kouri Institute of Tropical Medicine (IPK). PO Box 601 Marianao 13, Havana, Cuba Tel: 53-7-2553561 Fax: 53-7-2046051
Email: [email protected]
Word count
Abstract: 250 words; Importance: 129 words; Full text: 4818 words
ABSTRACT
During the DENV-3 epidemic that occurred in Havana in 2001-2002, severe disease was associated with the infection sequence DENV-1/DENV-3, whilst the sequence DENV-2/DENV-3 was associated with mild/asymptomatic infections. To determine the role of the virus in the increasing severity demonstrated during the epidemic, serum samples collected at different time points were studied. A total of 22 full-length sequences were obtained using a deep sequencing approach. Bayesian phylogenetic analysis of consensus sequences revealed that two DENV-3 lineages were circulating in Havana at that time, both grouped within genotype III. The predominant lineage is closely related to Peruvian and Ecuadorian strains, whilst the minor lineage is related to Venezuelan strains. According to consensus sequences, relatively few non-synonymous mutations were observed; only one was fixed during the epidemic, at position 4380 in the NS2B gene. Intra-host genetic analysis indicated that a significant minor population was selected and became predominant towards the end of the epidemic. In conclusion, greater variability was detected during the epidemic's progression in terms of significant minority variants, particularly in the non-structural genes. An increasing trend of genetic diversity towards the end of the epidemic was only observed for synonymous variant allele rates, with higher variability in secondary cases. Remarkably, significant intra-host genetic variation was demonstrated within the same patient during the course of secondary infection DENV-1/DENV-3, including changes in the structural proteins PrM and E. Therefore, the dynamics of evolving viral populations in the context of heterotypic antibodies could be related to the increasing clinical severity observed during the epidemic.
IMPORTANCE
Based on the evidence that DENV fitness is context dependent, our research has focused on the study of viral factors associated with intra-epidemic increasing severity in a unique epidemiological setting. Here, we investigated the intra-host genetic diversity in acute human samples collected at different time points during the DENV-3 epidemic that occurred in Cuba in 2001-2002, using a deep sequencing approach. We concluded that greater variability in significant minor populations occurred as the epidemic progressed, particularly in the non-structural genes, with higher variability observed in secondary infected cases. Remarkably, for the first time, significant intra-host genetic variation was demonstrated within the same patient during the course of secondary infection DENV-1/DENV-3, including changes in structural proteins. These findings indicate that high-resolution approaches are needed to unravel molecular mechanisms involved in dengue pathogenesis.
Key words: dengue 3; viral evolution; intra-host genetic variation; increasing severity; Cuba
INTRODUCTION
Dengue viruses cause the most important arthropod-borne viral disease in humans, with latest estimates of 390 million dengue infections per year, of which 96 million manifest some level of disease severity (1). This figure is more than three times the dengue burden estimate of the World Health Organization (2). Latin America has progressively evolved from a low dengue-endemic to a hyperendemic region, with local transmission of the four dengue virus serotypes (DENV-1 to 4) in practically all countries (3). DENVs are assigned to the genus Flavivirus in the family Flaviviridae. The genomes of flaviviruses comprise a single-stranded RNA molecule encoding three structural proteins, the capsid (C), pre-membrane/membrane (PrM/M), and envelope (E), and seven non-structural (NS) proteins, NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5 (4). Infection with any DENV serotype may present as an asymptomatic infection or as a symptomatic infection ranging from mild to severe illness: dengue fever (DF) and dengue hemorrhagic fever/dengue shock syndrome (DHF/DSS) (5), or dengue and severe dengue according to the new dengue case classification (2).
The etiology of DHF/DSS has been the subject of research for over 40 years (6, 7). In Cuba, secondary infection has been implicated as the most relevant risk factor for developing severe disease, combined with the introduction of DENV strains of Asian origin. Likewise, other risk factors, including age, genetic background, and chronic diseases such as bronchial asthma, sickle cell anemia and diabetes, have also been implicated in dengue severity (8, 9). Notably, during two severe epidemics (1981 and 1997) characterized by the circulation of only one serotype (DENV-2), four and twenty years respectively after the massive DENV-1 epidemic, a marked month-to-month increase in clinical severity was observed in secondary infected cases (8, 10, 11).
In June 2001, the nationwide dengue case surveillance system identified a DENV-3 outbreak in Havana city, which eventually involved 12 889 confirmed cases, including 78 DHF/DSS cases and three fatalities (12). Herein, we report that significant monthly increases in the proportion of DHF/DSS also occurred during this epidemic, 24 years after the DENV-1 epidemic and 20 years after one caused by DENV-2. Since the epidemic was circumscribed to Havana city, which has an admixed population with homogeneous ethnic composition, and because all severe cases were adults, the hypothesis that the outbreak moved into different regions with different population demographics was discounted as an explanation for the increasing severity. Likewise, host factors did not appear to explain this increase, because it is not logical to assume that the most susceptible individuals would all be infected towards the end of the epidemic.
However, the theory that cross-immunity could play a significant role in shaping viral population diversity, selecting for fitter viruses that produce severe dengue towards the end of epidemics, is plausible. Previous studies in this particular context have suggested that the phenomenon of increasing clinical severity with the epidemic's progression could be related to temporary changes that occur in the virus causing the epidemic (10, 13, 14). In this regard, little is known about the extent of intra-host genetic diversity of DENV, its relationship to immune status, and its implications for dengue pathogenesis and disease severity. Initial studies using cloning techniques demonstrated higher levels of DENV-3 intra-host genetic diversity in patients compared to mosquitoes (15). In addition, the extent and pattern of DENV-1 diversity during acute infection was related to disease outcome (16), while no relationship was found for the same serotype in a different scenario (17). Later, studies using deep sequencing approaches, although in different epidemiological contexts, have shown low genetic diversity in humans (18, 19). Most recent papers on dengue intra-host diversity have focused on viral population variations during human/mosquito host switching (20-22).
In the present study, we explored the intra-host DENV-3 genetic diversity in serum samples collected at different time points during the 2001-2002 Cuban epidemic, using whole-genome amplification and next-generation sequencing to determine the role of the virus in the increasing severity. Our results demonstrate that changes in the viral population occurring with the epidemic's progression could have an impact on viral fitness.
MATERIALS AND METHODS
Epidemic characterization in terms of dengue severity. The proportion of DHF/DSS per dengue case was compared every four epidemiological weeks as an expression of increasing clinical severity, based on the figures for confirmed dengue cases notified during the epidemic period through the laboratory surveillance system (12). Analysis for linear trend in proportions was done by Chi-square test using Epi Info version 3.2. Odds ratios (OR) were calculated taking weeks 33-36 as reference because no DHF/DSS cases were observed in preceding weeks.
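The trend analysis described above can be sketched in a few lines. This is a minimal illustration, not the Epi Info implementation: it assumes equally spaced scores (0, 1, 2, ...) for the four-week periods and uses the confirmed-case and DHF/DSS counts reported in Table 2. The function names are hypothetical. Both the cross-product odds ratio and the simple ratio of proportions are computed, because for an outcome this rare the two nearly coincide.

```python
import math

# Counts taken from Table 2, starting at the reference period (weeks 33-36):
# (epidemiological weeks, confirmed dengue cases, DHF/DSS cases)
PERIODS = [
    ("33-36", 534, 1),
    ("37-40", 1857, 6),
    ("41-44", 3185, 7),
    ("45-48", 2647, 23),
    ("49-52", 2356, 32),
]


def trend_chi_square(periods):
    """Chi-square for linear trend in proportions (1 degree of freedom).

    Assumption: periods are scored 0, 1, 2, ... in chronological order.
    """
    n_total = sum(n for _, n, _ in periods)
    a_total = sum(a for _, _, a in periods)
    s_nx = sum(n * x for x, (_, n, _) in enumerate(periods))
    s_nx2 = sum(n * x * x for x, (_, n, _) in enumerate(periods))
    s_ax = sum(a * x for x, (_, _, a) in enumerate(periods))
    # Observed minus expected score total among DHF/DSS cases
    deviation = s_ax - a_total * s_nx / n_total
    variance = (a_total * (n_total - a_total) * (n_total * s_nx2 - s_nx ** 2)
                / (n_total ** 2 * (n_total - 1)))
    chi2 = deviation ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def ratios_vs_reference(periods, reference=0):
    """Return {weeks: (odds ratio, ratio of proportions)} against one period."""
    _, n_ref, a_ref = periods[reference]
    result = {}
    for label, n, a in periods:
        odds_ratio = (a * (n_ref - a_ref)) / ((n - a) * a_ref)
        proportion_ratio = (a / n) / (a_ref / n_ref)
        result[label] = (odds_ratio, proportion_ratio)
    return result


chi2, p_value = trend_chi_square(PERIODS)
ratios = ratios_vs_reference(PERIODS)
```

With these counts the trend statistic is large and its p-value falls well below 0.001, and the ratio of proportions comes out within 0.01 of the values tabulated in Table 2.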
Samples. Twenty-two DENV-3-positive acute serum samples, corresponding to 21 patients and collected at different time points during the 2001-2002 Cuban epidemic, were used for the molecular characterization (Table 1). Two samples (Cuba_553_2001 and Cuba_558_2001) correspond to the same patient. All samples had been processed during the epidemic period for viral isolation on C6/36 HT cells and identification by indirect immunofluorescence with monoclonal antibodies (23), then stored at -80°C in the Strain Bank of the National Reference Laboratory of Virology at Pedro Kouri Institute of Tropical Medicine. Informed consent was obtained from all patients at the moment of sample collection. All cases were classified clinically at that time as DF or DHF/DSS according to the Guidelines for Control and Prevention of Dengue and Dengue Hemorrhagic Fever in the Americas (24). In addition, the most recent clinical classification (2) was obtained after analysis of data available in the clinical records of the patients. Ten patients included in the study, classified as DF because they did not fulfil the strict WHO criteria for DHF according to the 1997 guidelines, presented warning signs and required hospitalization. Indeed, they were at risk of severe dengue. However, early hospitalization policies together with proper clinical management prevented complications. Patients without warning signs treated at home were visited daily by the family doctor to accurately define the final disease outcome. The Institutional Ethical Review Committee of the Pedro Kouri Institute of Tropical Medicine approved the present study (IRB number: CEI-IPK-13-11).
IgG detection. All samples were processed by ELISA inhibition test to determine anti-dengue IgG titres. Samples with titres lower than 1/20 were considered primary infections, and samples with titres higher than 1/1280 were considered secondary infections (25).
Sequence of infection. Sera from patients in the convalescent phase of the infection were analyzed for neutralizing antibodies to all DENV serotypes using the 50% endpoint plaque reduction neutralization test as described (26), with some modifications (27). According to previously established criteria (28), patients with neutralizing antibody titers ≥1:30 to only one DENV serotype were considered to have experienced a primary dengue infection. Patients with neutralizing antibody titers ≥1:30 against two or more serotypes were considered to have experienced a second or third infection.
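The classification rule above is simple enough to express directly. The sketch below is only an illustration of the ≥1:30 criterion; the function name, the dictionary layout, and the label used when no serotype reaches the cutoff are assumptions not taken from the text, and the titres in the usage lines are hypothetical.

```python
SEROTYPES = ("DENV-1", "DENV-2", "DENV-3", "DENV-4")
CUTOFF = 30  # reciprocal titre; the stated criterion is a titre >= 1:30


def classify_infection(reciprocal_titres):
    """Classify a convalescent serum from its neutralizing titres.

    reciprocal_titres maps serotype name -> reciprocal PRNT50 titre
    (e.g. 200 for a titre of 1:200). Missing serotypes count as negative.
    """
    positive = [s for s in SEROTYPES if reciprocal_titres.get(s, 0) >= CUTOFF]
    if len(positive) == 1:
        status = "primary"
    elif len(positive) >= 2:
        status = "second or third infection"
    else:
        # Assumption: the text does not say how such sera were handled.
        status = "no serotype at or above cutoff"
    return status, positive


# Hypothetical titres, for illustration only
primary_example = classify_infection({"DENV-3": 120})
secondary_example = classify_infection({"DENV-1": 200, "DENV-3": 90, "DENV-2": 10})
```

The list of positive serotypes returned alongside the status is what would allow an infection sequence such as DENV-1/DENV-3 to be inferred once the infecting serotype is known.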
Full-length viral genome amplification. Briefly, viral RNA was extracted from 140 µL of serum sample using the QIAamp viral RNA mini kit (Qiagen, Germany). cDNA was synthesized using the Transcriptor High Fidelity cDNA Synthesis kit (Roche Applied Science, Germany) with random hexamer primers (600 µM) according to the manufacturer's instructions. An aliquot of 3 µL of cDNA was subjected to PCR using the Expand High FidelityPLUS PCR System (Roche Applied Science, Germany) according to the manufacturer's instructions. Two independent DNA libraries (a and b) were constructed for each sample using two different sets of primers. Five pairs of primers were used to obtain five overlapping fragments of 2-3 kb (F1a-F5a) covering the complete DENV-3 genome, as previously published (29). In addition, a second set of primers was designed (available from the authors on request); the resulting fragments were named F1b-F5b.
Population diversity by deep sequencing. To estimate the population diversity of variants by deep sequencing, PCR fragments were purified with the QIAquick PCR purification kit (Qiagen, Germany), and total DNA was quantified by PicoGreen fluorescence (Invitrogen, USA). Amplicons were then fragmented using Fragmentase, ligated to Illumina multiplex adapters, clustered on an Illumina cBot, sequenced on an Illumina GAIIx, and analyzed with established deep sequencing data analysis tools. Illumina technology was selected because it produces enough sequencing data to identify rare variants present at ratios as low as 1:1,000 in samples of modest reference size. Accuracy was improved with the ViVAN (Viral Variant ANalysis) pipeline (30), a robust algorithm that uses each variant allele's initial rate and read qualities to differentiate between sequencing errors and actual population variants, facilitating accurate variant assessment for DENV populations. To minimize inaccuracy due to sequencing errors, alleles at frequencies >0.1% were analyzed, while ultra-rare variants were discarded.
During the sequence assembly process, a closely related reference sequence (DENV-3/EC/BID-V2975/2000) was used. This sequence was selected according to the phylogenetic tree constructed using DENV-3 consensus sequences obtained by the Sanger method. Once we were able to assemble the sequence of the first isolate of the epidemic (Cuba_15_2001), this consensus sequence was used as the reference to analyze minor variants that appeared in DENV-3 samples collected at different time points during the Cuban epidemic. A new statistically significant variant was considered unique to a particular sample if it appeared in both independent DNA libraries. Briefly, for each position throughout the viral genome, base identities and their quality scores were gathered. Each variant was determined to be true using a generalized likelihood-ratio test (used to determine the total number of minority variants), and its allele rate was modified according to its covering read qualities based on maximum likelihood estimation. Additionally, a confidence interval was calculated for each allele rate. To correct for multiple testing, a Benjamini-Hochberg false-discovery rate of 5% was set (31). Several ViVAN output files were used for analysis, including synonymous and non-synonymous changes for each significantly variable position, organized by gene and position for the whole viral genome, and a battery of metrics including the nucleotide substitution matrix, transition/transversion frequencies and variant allele rates. The variant allele rate is the frequency of significant non-reference alleles (in this case SNPs) present in the sample, either across the whole viral genome or in a specific gene. This measurement is a proxy for the heterogeneity of a specific sample. In addition, three files with pairwise comparisons, listing the mutations found to be different or common among samples, as well as the consensus sequence files were used (30).
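Two of the filtering steps described above, the Benjamini-Hochberg correction at a 5% false-discovery rate and the requirement that a variant appear in both independent libraries, can be illustrated as follows. This is a hedged sketch and not ViVAN's code: the function names, the dictionary layout, and the example values are hypothetical, and the likelihood-ratio test that produces the p-values is not reproduced here.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is significant
    under the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


def common_variants(library_a, library_b, min_frequency=0.001):
    """Keep variants that are significant and above the frequency threshold
    in BOTH libraries.

    Each library maps (position, allele) -> (frequency, is_significant),
    with frequency as a fraction (0.001 corresponds to 0.1%).
    """
    shared = {}
    for key, (freq_a, sig_a) in library_a.items():
        if key not in library_b:
            continue
        freq_b, sig_b = library_b[key]
        if sig_a and sig_b and freq_a > min_frequency and freq_b > min_frequency:
            shared[key] = (freq_a, freq_b)
    return shared


# Hypothetical p-values and variant calls, for illustration only
flags = benjamini_hochberg([0.2, 0.001, 0.03, 0.5])
lib_a = {(5371, "T"): (0.011, True), (442, "T"): (0.0005, True), (9142, "T"): (0.02, False)}
lib_b = {(5371, "T"): (0.010, True), (442, "T"): (0.002, True), (9142, "T"): (0.03, True)}
kept = common_variants(lib_a, lib_b)
```

Raising `min_frequency` to 0.01 would correspond to the >1% threshold used for the higher-frequency minority variants.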
Inter-sample cluster analysis was performed by first computing the distance matrix among all samples using root mean square deviation (RMSD) values according to Li et al. (32). Multidimensional scaling (MDS), with Kruskal's stress criterion (33), was performed on the distance matrix to produce a low-dimensional plot of the samples. Dendrograms, based on complete linkage hierarchical clustering of the distance matrix, were also plotted.
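As a rough illustration of the distance-matrix and clustering steps, the sketch below computes pairwise RMSD between samples and then performs naive complete-linkage agglomeration. It rests on assumptions: each sample is represented here by a vector of variant frequencies at the same genome positions, which may differ from the exact RMSD definition of Li et al. (32); the sample names and frequencies are hypothetical; and the MDS step is not covered.

```python
import math
from itertools import combinations


def rmsd(u, v):
    """Root mean square deviation between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


def distance_matrix(profiles):
    """Pairwise RMSD, keyed by the unordered pair of sample names."""
    return {frozenset((p, q)): rmsd(profiles[p], profiles[q])
            for p, q in combinations(sorted(profiles), 2)}


def complete_linkage(profiles):
    """Agglomerative clustering; returns merges as (members, members, height).

    Complete linkage: the distance between two clusters is the LARGEST
    pairwise distance between their members.
    """
    dist = distance_matrix(profiles)
    clusters = [frozenset([name]) for name in sorted(profiles)]
    merges = []

    def linkage(pair):
        c1, c2 = pair
        return max(dist[frozenset((p, q))] for p in c1 for q in c2)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(c1), sorted(c2), linkage((c1, c2))))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Hypothetical variant-frequency profiles at three genome positions
profiles = {
    "early_1": [0.00, 0.00, 0.00],
    "early_2": [0.01, 0.00, 0.00],
    "late_1": [0.05, 0.03, 0.02],
}
merges = complete_linkage(profiles)
```

In this toy input the two early samples merge first and the late sample joins at a greater height, which is the kind of pattern a dendrogram built from such a matrix would display.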
Consensus sequence analysis. Consensus full polyprotein nucleotide sequences of each DENV-3 isolate obtained in the present study were aligned using ClustalX (34) together with relevant sequences retrieved from GenBank (available from the authors on request), such that representative sequences from all known DENV-3 genotypes were present. From the initial data set, identical sequences and known recombinant sequences (as published by the authors) were removed from the alignments. This produced a total data set of 104 sequences, each 10 170 nucleotides in length. Phylogenetic analyses were performed using Bayesian analysis in MrBayes v3.1.2 (35), with a minimum of 20 million generations and a burn-in of 10%. Stationarity was assessed at effective sample size (ESS) >400 using Tracer v1.4.1 (part of the BEAST package) (36).
Nucleotide sequence accession numbers. The consensus nucleotide sequences reported in this study are available in GenBank under accession numbers KT726340 to KT726361.
RESULTS
Increasing clinical severity during the epidemic. The proportions of DHF/DSS per dengue case were compared every four epidemiological weeks as an expression of increasing clinical severity. According to epidemiological data, the first dengue case was reported in epidemiological week 22 (beginning of June). However, the first DHF/DSS case occurred in epidemiological week 36 (beginning of September). From weeks 45-48, the number of confirmed dengue cases tended to decrease whilst the proportion of DHF/DSS cases increased notably (p [...]
[...] 0.1%) ranged from 12 to as high as 48 in both primary and secondary infections, without significant differences between these groups (Fig. 3).
However, different results were obtained when higher-frequency minority variants (>1%) were analyzed (Fig. 4). Greater variability was still observed in the non-structural genes compared with the structural genes, but it was noteworthy that patients with secondary infection showed greater variability than patients with primary infection. In addition, secondary patients presented minority variants in the structural genes (PrM and E), some of which were non-synonymous. By contrast, patients with primary infections had minority variants (>1%) only in non-structural genes, most of them synonymous (Tables 4 and 5).
Inter-sample cluster analyses using RMSD values based on MDS showed greater variability with the epidemic's progression. Samples with similar characteristics appear close to each other in the plot. Isolates collected at the end of the epidemic were located at the periphery of the plot, indicating higher variability (Fig. 5). Dendrograms using unique significant minority variants (>0.1%, >0.5% and >1%) showed similar results; isolates collected at the very beginning were closely related and had less genetic variability than late isolates based on RMSD values (Fig. 6).
Finally, significant minority variants present in samples Cuba_553_2001 and Cuba_558_2001, both collected from the same patient at days 2 and 4 after fever onset, were compared (Table 6). The analysis revealed changes in the viral population structure during the course of a secondary infection that were well supported by our data, since high-quality sequences were obtained (see coverage) and similar results were obtained in two independent DNA libraries (a and b). Two silent nucleotide substitutions, C/T at position 5371 in NS3 and C/T at position 9142 in NS5, were found in both samples as the predominant population. Notably, these variants were present as significant minority variants in the first isolate, Cuba_15_2001. In addition, seven unique significant minority variants (>1%) were found in sample Cuba_553_2001 that were absent in sample Cuba_558_2001 (six of them synonymous). Likewise, six unique significant minority variants (>1%) were found in sample Cuba_558_2001 that were absent in sample Cuba_553_2001 (four of them non-synonymous). Interestingly, significant minority variants in the sample collected at day 2 corresponded exclusively to non-structural genes, whilst significant minority variants present in the sample collected at day 4 included non-synonymous changes in the PrM, E and NS5 genes. The change E9D in the PrM protein was located in the N-terminal region of Pr near motif 6, one of the prominent complementary electrostatic patches in the PrM-E heterodimer. The changes M199I and A441V in the E protein were located in domain II and the membrane-proximal stem (EH2), respectively. The stem has two predicted amphipathic helices that lie half-buried in the outer leaflet of the viral membrane. For fusion to take place, the stem region must span the length of domain II. In addition, EH2 can affect the expression and stability of its chaperone prM (38-40). Finally, the change I517V in the NS5 protein was located in the RNA-dependent RNA polymerase catalytic domain (41). Simultaneously with the intra-host genetic diversity observed during the course of infection in this particular patient, increasing neutralizing titres were observed for DENV-1 (from 1/28 to 1/200) as well as for DENV-3 (from [...]
[...] 0.1%) taking as reference the first isolate of the DENV-3 Cuban epidemic, 2001-2002. Samples are grouped according to type of infection (primary or secondary) and by date of sample collection (see Table 1).
Figure 4. Number of positions with unique significant minority variants (>1%) common to sets a and b, taking as reference the first isolate of the DENV-3 Cuban epidemic, 2001-2002. Samples are grouped according to type of infection (primary or secondary) and by date of sample collection (see Table 1).
Figure 5. Multidimensional scaling using root mean square deviation (RMSD) values calculated from significant minority variants (>1%) for data set a (red dots) and b (blue dots). Numbers represent the 20 studied samples ordered by collection time as indicated in Table 1. Samples 4 and 5, which correspond to a different lineage, were excluded.
Figure 6. Dendrogram clustering of Cuban isolates collected at different time points during the 2001-2002 epidemic using an RMSD-based distance matrix including data sets a and b. Samples that correspond to a different lineage were excluded.
Table 1. Data of DENV-3 samples collected during the 2001-2002 epidemic
Sample number | Sample code | Date of sample collection | Days of illness | Municipality | Clinical classification a | Clinical classification b | Type of infection
1 | Cuba_15_01 | 5-Jul-01 | 1 | Playa | DF | dengue - WS | secondary
2 | Cuba_26_01 | 15-Jul-01 | 2 | La Lisa | DF | dengue - WS | secondary
3 | Cuba_73_01 | 22-Jul-01 | 2 | Playa | DF | dengue - WS | secondary
4 | Cuba_118_01 | 3-Aug-01 | 2 | Arroyo Naranjo | DF | dengue - WS | primary
5 | Cuba_167_01 | 15-Aug-01 | 2 | Arroyo Naranjo | DF | dengue - WS | primary
6 | Cuba_463_01 | 23-Sep-01 | 2 | Playa | DHF/DSS c | severe dengue | secondary
7 | Cuba_492_01 | 28-Sep-01 | 2 | Playa | DF c | dengue + WS | secondary
8 | Cuba_504_01 | 1-Oct-01 | 3 | Playa | DF c | dengue + WS | secondary
9 | Cuba_513_01 | 2-Oct-01 | 3 | Playa | DF c | dengue + WS | primary
10 | Cuba_523_01 | 5-Oct-01 | 2 | Playa | DF c | dengue + WS | primary
11 | Cuba_546_01 | 15-Oct-01 | 2 | Marianao | DF c | dengue + WS | secondary
12 | Cuba_547_01 | 15-Oct-01 | 2 | Arroyo Naranjo | DF c | dengue + WS | primary
13 | Cuba_553_01 | 21-Oct-01 | 2 | Marianao | DF c | dengue + WS | secondary
14 | Cuba_557_01 | 22-Oct-01 | 2 | Marianao | DF c | dengue + WS | primary
15 | Cuba_558_01 | 21-Oct-01 | 4 | Marianao | DF c | dengue + WS | secondary
16 | Cuba_568_01 | 8-Nov-01 | 2 | La Lisa | DF | dengue - WS | secondary
17 | Cuba_580_01 | 22-Nov-01 | 4 | La Lisa | DHF/DSS c,d | severe dengue d | secondary
18 | Cuba_16_02 | 6-Jan-02 | 3 | Habana Vieja | DF | dengue - WS | primary
19 | Cuba_11_02 | 15-Jan-02 | 2 | Plaza | DF c | dengue + WS | secondary
20 | Cuba_17_02 | 20-Jan-02 | 2 | Centro Habana | DF | dengue - WS | secondary
21 | Cuba_20_02 | 20-Jan-02 | 2 | Centro Habana | DF | dengue - WS | secondary
22 | Cuba_21_02 | 21-Jan-02 | 1 | Centro Habana | DF | dengue - WS | primary
a WHO, 1997; b WHO/TDR, 2009; c required hospitalization; d fatal case
DF: Dengue fever; DHF/DSS: Dengue Hemorrhagic Fever/Dengue Shock Syndrome; - WS: Without Warning Signs; + WS: With Warning Signs. Samples 13 and 15 correspond to the same patient.
Table 2. Increasing Clinical Severity During the DENV-3 Cuban Epidemic, 2001-2002
Epidemiological Weeks | Months | Confirmed dengue cases | DHF/DSS cases | Deaths | Proportion of dengue cases with DHF/DSS (%)a | Odds Ratio
22-32 | Jun-Jul-Aug | 187 | 0 | 0 | 0 | -
33-36 | Aug-Sep | 534 | 1 | 0 | 0.2 | 1
37-40 | Sep-Oct | 1857 | 6 | 0 | 0.3 | 1.72
41-44 | Oct-Nov | 3185 | 7 | 2 | 0.2 | 1.17
45-48 | Nov-Dec | 2647 | 23 | 0 | 0.9 | 4.64
49-52 | Dec-Jan | 2356 | 32 | 1 | 1.4 | 7.25
Chi-square for linear trend from epidemiological weeks 33-36 to 49-52 p [...]
[...] 1%) common for set a and b for primary infected cases taking as reference the first isolate obtained during the Cuban epidemic, 2001-2002
sample | feature | position | reference | set a | aa | change | frequency (%) | set b | aa | change | frequency (%)
9 | - | - | - | - | - | - | - | - | - | - | -
10 | NS1 | 3424 | C | T | 344 | synonymous>N | 1.145 | T | 344 | synonymous>N | 1.037
12/14 | NS1 | 2887 | C | T | 165 | synonymous>T | 1.079 | T | 165 | synonymous>T | 0.966
12/14 | NS5 | 9803 | C | T | 754 | synonymous>L | 4.539 | T | 754 | synonymous>L | 4.371
12/14 | NS5 | 7866 | C | T | 108 | P>L | 1.538 | T | 108 | P>L | 1.536
12/14 | NS1 | 2924 | C | T | 178 | synonymous>L | 5.312 | T | 178 | synonymous>L | 6.398
12/14 | NS3 | 5371 | C | T | 293 | synonymous>A | 1.086 | T | 293 | synonymous>A | 0.134
18 | 3'UTR | 10369 | G | T | * | * | 1.641 | T | * | * | 2.097
22 | - | - | - | - | - | - | - | - | - | - | -
aa: amino acid; - indicates that minority variants are represented [...]
[...] 1%) common for set a and b present in two samples corresponding to the same patient at day 2 and 4 after fever onset taking as reference the first isolate obtained during the Cuban epidemic, 2001-2002
Sample code | feature | position | coverage | reference | set a | aa | change | frequency (%) | set b | aa | change | frequency (%)
Cuba_553_2001 | NS5 | 9142 | 30246 | C | T | 533 | synonymous>D | 99.288 | T | 533 | synonymous>D | 98.715
Cuba_553_2001 | NS3 | 5371 | 8024 | C | T | 293 | synonymous>A | 96.832 | T | 293 | synonymous>A | 99.878
Cuba_553_2001 | NS5 | 9985 | 30249 | C | T | 814 | synonymous>N | 3.203 | T | 814 | synonymous>N | 3.15
Cuba_553_2001 | NS4B | 7012 | 16819 | G | A | 71 | synonymous>Q | 1.723 | A | 71 | synonymous>Q | 1.678
Cuba_553_2001 | NS5 | 7975 | 19551 | G | A | 144 | synonymous>L | 1.511 | A | 144 | synonymous>L | 1.686
Cuba_553_2001 | NS5 | 7744 | 18047 | G | A | 67 | synonymous>E | 1.472 | A | 67 | synonymous>E | 1.484
Cuba_553_2001 | NS3 | 5326 | 8654 | C | T | 278 | synonymous>N | 1.386 | T | 278 | synonymous>N | 1.003
Cuba_553_2001 | NS4B | 7366 | 15691 | C | T | 189 | synonymous>A | 1.259 | T | 189 | synonymous>A | 0.921
Cuba_553_2001 | NS2A | 3500 | 12713 | C | T | 18 | L>F | 0.983 | T | 18 | L>F | 1.474
Cuba_558_2001 | NS5 | 9142 | 13986 | C | T | 533 | synonymous>D | 98.963 | T | 533 | synonymous>D | 97.096
Cuba_558_2001 | NS3 | 5371 | 18626 | C | T | 293 | synonymous>A | 97.515 | T | 293 | synonymous>A | 99.666
Cuba_558_2001 | NS2B | 4315 | 27529 | C | T | 71 | synonymous>S | 3.495 | T | 71 | synonymous>S | 5.037
Cuba_558_2001 | E | 1510 | 24416 | G | A | 199 | M>I | 2.366 | A | 199 | M>I | 2.212
Cuba_558_2001 | NS5 | 9092 | 12493 | A | G | 517 | I>V | 1.579 | G | 517 | I>V | 1.941
Cuba_558_2001 | PrM | 442* | 19513 | G | T | 9 | E>D | 1.114 | T | 9 | E>D | 1.283
Cuba_558_2001 | NS1 | 3373 | 18492 | C | T | 327 | synonymous>D | 1.096 | T | 327 | synonymous>D | 1.032
Cuba_558_2001 | E | 2235 | 30874 | C | T | 441 | A>V | 0.914 | T | 441 | A>V | 0.902
Positions highlighted in bold indicate that this variant became predominant. aa: amino acid. *Variant A (synonymous>E) at position PrM 442 was detected in sample Cuba_553_2001 at frequency = 0.1% (day 2).