Gene, 122 (1992) 377-380
© 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00
GENE 06830
A pseudogene for human glutathione peroxidase
(Gene; evolution; UGA; sequencing; cloning; retrotransposition)
Alan M. Diamond a, Rebecca Cruz a, Craig Bencsics a and Dolph Hatfield b
a Department of Radiation and Cellular Oncology, University of Chicago, Chicago, IL 60637, USA; and b Laboratory of Experimental Carcinogenesis, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA. Tel. (301) 496-9263
Received by Y. Sakaki: 1 May 1992; Revised/Accepted: 20 July/21 July 1992; Received at publishers: 3 September 1992
SUMMARY
Glutathione peroxidases (GPx) serve a bioprotective function in the reduction of peroxides to less toxic substances. Both cellular and secreted forms of the protein have been reported, as well as a number of distinct cDNA sequences. Previous work has described three distinct loci on human chromosomes 3, 21 and X which hybridize to a GPX cDNA, and the authors speculated that only the chromosome 3 locus encodes a functional GPX gene. This conclusion was based on mapping studies showing a precise deletion of intron sequences in the GPX loci on chromosomes 21 and X despite strong conservation among these sequences in both the coding and 3’-untranslated regions. To pursue this issue, we have isolated the chromosome 21 GPX locus by molecular cloning and determined its nucleotide sequence. Consistent with the expectations of McBride et al. [Biofactors 4 (1988) 285-292], the sequence does reveal a highly conserved processed pseudogene. It is suggested that a retrotransposed copy of the GPX gene integrated into chromosome 21 and may have maintained activity prior to the accumulation of inactivating mutations.
INTRODUCTION
Glutathione peroxidase (GPx; EC 1.11.1.9) is a selenoprotein which functions in protecting cells against oxidative damage. Selenium exists as a selenocysteine moiety within protein (Cone et al., 1976) and it occurs at the active site of GPx (Forstrom et al., 1978). Selenocysteine is donated to the growing polypeptide chain in response to certain UGA codons by a specific selenocysteine-inserting tRNA (Hatfield et al., 1990; Bock et al., 1991). UGA codes for selenocysteine in mammalian GPx (Chambers et al., 1986; Mullenbach et al., 1988), selenoprotein P (Hill et al., 1991)
Correspondence to: Dr. A.M. Diamond, University of Chicago, MC0085, Chicago, IL 60637, USA. Tel. (312) 702-9193; Fax (312) 702-1968.
Abbreviations: bp, base pair(s); GPx, glutathione peroxidase; GPX, gene encoding GPx; kb, kilobase or 1000 bp; nt, nucleotide(s); ORF, open reading frame.
and type-1 iodothyronine 5’-deiodinase (Berry et al., 1991), as well as a number of prokaryotic proteins (Stadtman, 1991). At least two distinct but related forms of GPx have been described, a cellular form and a secreted plasma form (Takahashi et al., 1987). The peptide sequences of these proteins also differ from those predicted for two GPx-related cDNAs isolated from human liver recombinant libraries (Dunn et al., 1989; Akasaka et al., 1990). Furthermore, McBride et al. (1988) detected three distinct loci in the human genome which hybridize to a GPX cDNA probe. These loci map to chromosomes 3, 21 and X, and only the locus on chromosome 3 hybridizes to a GPX intron probe. It was therefore suggested that the locus on chromosome 3 encodes a functional protein while the other loci are processed pseudogenes. Moscow et al. (1992) have recently sequenced the gene on chromosome 3 and found it to be functional. Here, we have characterized the GPX-hybridizing locus on chromosome 21 and show that it is a GPX pseudogene.
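The dual reading of UGA described above can be illustrated with a minimal sketch. The codon table is the standard genetic code; the `translate` helper, its `sec_positions` parameter and the toy sequences are hypothetical and are not taken from the GPX or GP21 sequences. The sketch shows an in-frame TGA read either as a stop or as selenocysteine (U), and a TGA-to-GGA substitution read as glycine.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, sec_positions=()):
    """Translate a DNA coding strand from its first base.

    sec_positions (an assumption of this sketch) lists the zero-based codon
    indices at which an in-frame TGA is read as selenocysteine ('U')
    instead of as a stop codon.
    """
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = dna[i:i + 3].upper()
        if codon == "TGA" and i // 3 in sec_positions:
            protein.append("U")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # ordinary stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


toy = "ATGGCTTGAGGTTAA"                       # hypothetical ORF: ATG GCT TGA GGT TAA
print(translate(toy))                         # TGA read as stop
print(translate(toy, sec_positions={2}))      # TGA read as selenocysteine
print(translate(toy.replace("TGA", "GGA")))   # TGA changed to GGA (glycine)
```

Whether a given UGA is recoded depends on sequence context in the mRNA, which this sketch does not model; the positions are simply supplied by the caller.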
EXPERIMENTAL AND DISCUSSION
(a) Molecular cloning of the GPX-hybridizing locus on chromosome 21
To isolate genomic clones representing previously identified loci which hybridize to a GPX cDNA probe, a recombinant human λCharon28 phage library was screened by hybridization to a 32P-labelled human GPX cDNA probe (McBride et al., 1988). One million phage were screened and two clones with identical restriction enzyme digest maps were isolated. DNA of one of these clones was prepared for analysis. To map this clone in the region of the GPX homology, recombinant phage DNA was digested with EcoRI, BglII, PstI and XbaI alone and as double digests, electrophoresed in a 1% agarose gel, transferred to GeneScreen Plus hybridization paper by capillary action and probed with radioactively labelled human GPX cDNA. The resulting autoradiograph is shown in Fig. 1. Examination of these data reveals a digestion pattern which is consistent only with the restriction enzyme cleavage for the GPX-hybridizing locus on chromosome 21 (McBride et al.,
Fig. 1. Southern analysis of GP21. Phage DNA was digested as follows. Lanes: 1, XbaI; 2, XbaI+BglII; 3, BglII; 4, XbaI+PstI; 5, BglII+PstI; 6, PstI; 7, EcoRI+XbaI; 8, EcoRI+BglII; 9, EcoRI+PstI; 10, EcoRI. The digest in lane 7 is partial. Markings on the left margin indicate the positions of the size standards obtained from a HindIII digest of phage λ DNA (9.4, 6.6, 4.4, 2.3, 2.0 and 0.56 kb). DNA was electrophoresed in a 1% agarose gel and transferred to GeneScreen Plus hybridization membrane (Du Pont) by capillary action. Human GPX cDNA (Mullenbach et al., 1987) was labelled by the random primer method (Feinberg and Vogelstein, 1983) and then used to probe the GeneScreen Plus membrane employing procedures suggested by the vendor. The resulting autoradiograph was used to determine the sizes of the hybridizing fragments (see Table I).
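Fragment sizes such as those read from the autoradiograph are commonly estimated by interpolating against the size standards. The sketch below assumes the usual semi-logarithmic relation between migration distance and fragment size. The function name `estimate_size_kb` and all migration distances are hypothetical and are not measurements from Fig. 1; only the marker sizes come from the legend.

```python
import math

# HindIII-digested phage lambda standards listed in the Fig. 1 legend (kb).
MARKER_SIZES_KB = [9.4, 6.6, 4.4, 2.3, 2.0, 0.56]


def estimate_size_kb(distance_mm, marker_distances_mm,
                     marker_sizes_kb=MARKER_SIZES_KB):
    """Estimate a fragment size from its migration distance.

    log10(size) is interpolated linearly between the two standards that
    flank the band. Marker distances must be distinct.
    """
    points = sorted(zip(marker_distances_mm, marker_sizes_kb))
    if not points[0][0] <= distance_mm <= points[-1][0]:
        raise ValueError("band lies outside the range of the standards")
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size


# Hypothetical migration distances (mm) for the six standards, largest first.
marker_mm = [10.0, 16.0, 23.0, 35.0, 38.0, 62.0]
print(round(estimate_size_kb(29.0, marker_mm), 2))
```

Interpolating between flanking standards, rather than fitting one line to all six, keeps the estimate local; bands outside the range of the standards are rejected rather than extrapolated.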
TABLE I
Identification of GPX-hybridizing sequences as the locus from human chromosome 21 a
21
X
3
7.0, 0.9
5.0, 1.2
4.0, 1.0
3.9, 0.5
5.0, 0.6
5.0, 0.9
2.5, 1.0, 0.1
0.6, 0.5, 0.1
EcoRI + BglII
2.3, 0.6
2.1, 0.9
5.0, 2.5
2.9, 0.5
EcoRI + XbaI PstI PstI + BglII
2.5, 0.8
2.4, 1.3
5.6
6.0
1.5, 0.6 2.5, 2.3
3.3 5.4
3.1 3.4
3.3 3.3
3.1 3.1
Digest(s)
Size(s)
EcoRI EcoRI + PstI
PstI + XbaI BglII BglII + XbaI
2.9, 2.3 2.9, 2.3 1.5
5.2, 0.6 3.7
4.0
2.9
1.7, 0.5 1.1, 0.6 0.9, 0.6
a Digests and resulting sizes (in kb) of hybridizing fragments were determined from the data shown in Fig. 1. Expected sizes, expressed as kb, of GPX-hybridizing loci on chromosomes 21, 3 and X were determined from the restriction maps for these loci presented in McBride et al. (1988).
1988). As can be seen in Table I, only the loci from chromosomes 3 and 21 would be expected to produce a BglII fragment of approx. 3.0 kb, but only the locus on chromosome 21 lacks an internal PstI site. Similarly, the lack of an XbaI site between the BglII sites which flank the hybridizing region of the clone is consistent with a chromosome 21 origin and eliminates both 3 and X. Finally, the presence of only a single EcoRI site between these same BglII sites is also consistent with this clone being derived from chromosome 21. Based on this analysis, we conclude that this clone was derived from chromosome 21 and refer to it as GP21.
(b) Nucleotide sequence of GP21
From the mapping data presented above, it was apparent that all the GPX-hybridizing sequences in GP21 were present on a 3.3-kb BglII fragment, which was subsequently subcloned into the BamHI site of pUC18 in both orientations. It was also apparent that the hybridizing sequences contained an internal EcoRI site which was used to generate deletion clones. Sequencing was therefore performed with these deletion clones using universal plasmid sequencing primers and sequencing across the EcoRI site. As the sequence was generated, oligo primers were prepared to extend the analysis such that data were obtained from alternate strands. The nt sequence is shown in Fig. 2.
(c) GP21 is a processed pseudogene
The nt sequence of GP21 is compared to that of a human GPX cDNA (Fig. 2). The two sequences are very similar: 94% in the coding sequences and 88% in the portion of the cDNA 3’ to the translational stop codon. High similarity also extends for 47 nt 5’ of the start codon of the cDNA sequence, beyond which there are no longer any significantly similar sequences in the 5’ direction. The sequence
Fig. 2. Comparison of the nt sequence determined for GP21 to that of the cDNA for human GPX. The nt sequence of GP21 (GenBank accession No. M93083) was aligned to the reported sequence of human GPX cDNA (Sukenaga et al., 1987). The start and stop codons as well as the selenocysteine-encoding TGA are shown in bold type. A putative ‘TATA’ box and polyadenylation signal are underlined. Differences in the coding portion of these sequences are indicated by an asterisk above the mismatch.
Cloning and sequencing were as described in sections a and b, using either the Sequenase version 2.0 sequencing kit (U.S. Biochemicals, Cleveland, OH) or a Model 373A automated sequencer (Applied Biosystems), such that both strands were determined.
of GP21 displays a number of properties which strongly suggest that it is a processed pseudogene (Vanin, 1985). It lacks an intervening sequence present in the GPX locus on chromosome 3. In addition, it has lost the start codon, has an in-frame TAA stop codon at position 155 and has two frame-shift deletions at positions 307 and 503. A run of 20 A’s is also seen 3’ to the stop codon at a position corresponding to the polyadenylation site of a precursor GPx mRNA. Collectively, these data are consistent with the reverse transcription of a GPX mRNA and its subsequent introduction into chromosome 21. The 94% sequence homology between the coding regions of GP21 and GPX is higher than that observed for other pseudogenes. For example, the pseudogenes for human α-globin and rabbit β-globin have retained 76% and 82% sequence identity in their coding regions, respectively (Proudfoot and Maniatis, 1980; Lacy and Maniatis, 1980). Furthermore, there is a 3-nt in-frame addition in GP21 as compared to the GPX cDNA which would be expected to maintain an ORF. This retention of the coding potential of the cDNA might be indicative of evolutionary pressure to preserve the coding capacity of GP21. It is interesting to note that the UGA codon which specifies selenocysteine in the ORF of the GPX cDNA has been changed to a glycine codon, GGA. It is possible that there was evolutionary pressure to limit the amount of mRNA with an in-frame UGA codon. If the GPX cDNA was transcribed following its reintroduction into the genome, then it would be predicted that integration needed to occur in the vicinity of nt sequences which could serve as a promoter. Such a sequence, displaying similarity to a ‘TATA box’ (Maniatis et al., 1987), is present at the -93 position of GP21. This is in a region described above which shows no similarity to the comparable region of the functional GPX gene on chromosome 3 (Moscow et al., 1992) and therefore was present on chromosome 21 prior to retrotransposition.
(d) Conclusions
We have isolated by molecular cloning the GPX-hybridizing locus on chromosome 21. Analysis of the nt sequence reveals a processed pseudogene that has lost its start codon and ORF but maintains strong similarity to the coding GPX sequence present on chromosome 3. Thus, the presence of a poly(A) tract in the exact location of the poly(A) sequence of GP21 mRNA and the abrupt loss of similarity 5’ to the start codon argue strongly that GP21 represents a retrotransposed copy of a GPX mRNA.
ACKNOWLEDGEMENTS
The authors wish to thank Dr. Howard Lieberman for his assistance with DNA sequencing and Dr. G. Mullenbach for the human GPX cDNA probe. This work was supported by NIH grant R01 CA54364 to A.M.D.
REFERENCES
Akasaka, M., Mizoguchi, J. and Takahashi, K.: A human cDNA for a novel glutathione peroxidase-related protein. Nucleic Acids Res. 18 (1990) 4619.
Berry, M.J., Banu, L. and Larsen, P.R.: Type I iodothyronine deiodinase is a selenocysteine-containing enzyme. Nature 349 (1991) 438-440.
Chambers, I., Frampton, J., Goldfarb, P., Affara, N., McBain, W. and Harrison, P.R.: The structure of the mouse glutathione peroxidase gene: the selenocysteine in the active site is encoded by a termination codon TGA. EMBO J. 5 (1986) 1221-1227.
Cone, J.E., Del Rio, R.M., Davis, J.N. and Stadtman, T.C.: Chemical characterization of the selenoprotein component of clostridial glycine reductase: identification of selenocysteine as the organoselenium moiety. Proc. Natl. Acad. Sci. USA 73 (1976) 2659-2663.
Dunn, D.K., Howells, D.D., Richardson, J.P. and Goldfarb, P.S.: A human cDNA sequence for a novel glutathione peroxidase-related selenopeptide, GPRP. Nucleic Acids Res. 17 (1989) 6390.
Feinberg, A.P. and Vogelstein, B.: A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132 (1983) 6-13.
Forstrom, J., Zakowski, J.J. and Tappel, A.L.: Identification of the catalytic site of rat liver glutathione peroxidase as selenocysteine. Biochemistry 17 (1978) 2639-2644.
Hatfield, D.L., Smith, D.W.E., Lee, B.J., Worland, P.J. and Oroszlan, S.: Structure and function of suppressor tRNAs in higher eukaryotes. Crit. Rev. Biochem. Mol. Biol. 25 (1990) 71-96.
Hill, K.E., Lloyd, R.S., Yang, J.-G., Reed, R. and Burk, R.F.: The cDNA for rat selenoprotein P contains 10 TGA codons in the open reading frame. J. Biol. Chem. 266 (1991) 10050-10053.
Lacy, E. and Maniatis, T.: The nucleotide sequence of a rabbit β-globin pseudogene. Cell 21 (1980) 545-553.
Maniatis, T., Goodbourn, S. and Fischer, J.A.: Regulation of inducible and tissue specific gene expression. Science 236 (1987) 1237-1245.
McBride, W.O., Mitchell, A., Lee, B.J., Mullenbach, G. and Hatfield, D.: The gene for selenium-dependent glutathione peroxidase maps to human chromosomes 3, 21, and X. Biofactors 4 (1988) 285-292.
Moscow, J.A., Morrow, C.S., He, R., Mullenbach, G.T. and Cowan, K.H.: Structure and function of the 5’-flanking sequence of the human cytosolic selenium-dependent glutathione peroxidase gene (hgpx1). J. Biol. Chem. 267 (1992) 5949-5958.
Mullenbach, G.T., Tabrizi, A., Irvine, B.D., Bell, G.I. and Hallewell, R.A.: Selenocysteine’s mechanism of incorporation and evolution revealed in cDNAs of three glutathione peroxidases. Prot. Eng. 2 (1988) 239-246.
Proudfoot, N.J. and Maniatis, T.: The structure of a human alpha-globin pseudogene and its relationship to alpha-globin gene duplication. Cell 21 (1980) 537-544.
Stadtman, T.C.: Biosynthesis and function of selenocysteine-containing enzymes. J. Biol. Chem. 266 (1991) 16257-16260.
Takahashi, K., Avissar, N., Whitin, J. and Cohen, H.J.: Purification and characterization of human plasma glutathione peroxidase: a selenoglycoprotein distinct from known cellular enzyme. Arch. Biochem. Biophys. 256 (1987) 677-686.
Vanin, E.F.: Processed pseudogenes: characteristics and evolution. Annu. Rev. Genet. 19 (1985) 253-272.