Gene, 103 (1991) 269-274
© 1991 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/91/$03.50
GENE 05003
Cloning and characterization of the human ADH4 gene

(Alcohol dehydrogenase; gene family; intron/exon arrangement; promoter region; regulatory elements; transcription start point)
Hedvig von Bahr-Lindström *, Hans Jörnvall and Jan-Olov Höög
Department of Chemistry I, Karolinska Institutet, S-104 01 Stockholm (Sweden)
Received by J.K.C. Knowles: 5 November 1990
Revised: 11 February 1991
Accepted: 12 March 1991
SUMMARY
Human alcohol dehydrogenase (ADH) constitutes a set of isozymes and enzymes with different tissue and substrate specificities. The subunits are coded for by at least five gene loci, ADH1-ADH5. We now report the cloning and analysis of the human ADH4 gene coding for the class-II ADH with π-subunits. The gene spans a region of 21 kb and is divided into nine exons and eight introns. The arrangement is the same as for all analyzed mammalian class-I genes, but the region covered is 50% larger than that in the human class-I genes. The nucleotide (nt) sequences of the exons, exon/intron boundaries and 5'- and 3'-untranslated regions were determined. The transcription start point (tsp) of the ADH4 gene was defined by primer extension and localized to a position 61 nt upstream from the ATG start codon. A TATA box and a CAAT element were identified by homology to consensus sequences for tsp. No DNA structures homologous to the glucocorticoid-responsive elements (GRE) present in the ADH2 gene were found in the upstream region of the ADH4 gene, but two structures with a 70% identity to the GRE consensus sequence were found at nonhomologous locations. The difference and the overall low degree of identity, 41%, of the upstream regions suggest different regulatory mechanisms for the class-I and class-II genes.
INTRODUCTION
Human ADH exists as a large number of dimeric isozymes and enzymes in various tissues. Three classes of the enzyme can be distinguished by substrate specificities and other properties (Vallee and Bazzone, 1983; Jörnvall et al., 1989). In class I, the classical liver enzyme in ethanol metabolism, the dimers consist of three different subunit types, α, β and γ. These subunits are encoded by separate genes, ADH1-3 (Smith et al., 1973). The class-II ADH with π-subunits is encoded by ADH4, and the class-III ADH with χ-subunits, recently found to be identical to glutathione-dependent formaldehyde dehydrogenase (Koivusalo et al., 1989), is encoded by ADH5 (Smith, 1986). ADH2 and ADH3 are polymorphic, giving rise to subunits β1, β2, β3 and γ1, γ2, respectively. The aa and cDNA sequences have been determined for all human ADH subunits, but genomic structures are only known for the different class-I ADHs (Duester et al., 1986a; Matsuo and Yokoyama et al., 1989). For the class-III ADH, a pseudogene has recently been identified which lacks a promoter region and all introns (Matsuo and Yokoyama, 1990). The only other mammalian ADH genomic structures known are those of class I from mouse and rat (Zhang et al., 1987; Crabb et al., 1989). In all mammalian ADH genomic structures analysed, the coding parts are interrupted by eight introns and span a region of approx. 10-14 kb, suggesting that this overall arrangement is typical of the class-I genes at large, independent of species. However, the relationships for the class-II and class-III genes, representing different enzymes with other specificities and properties (Vallee and Bazzone, 1983; Jörnvall et al., 1989; Kaiser et al., 1989), are unknown. As with other proteins, some correlations between exon/intron borders and domain borders have been detected in ADH (Duester et al., 1986b). The human ADH1-3 and ADH5 genes have all been localized to chromosome 4, region q21-25 (Smith, 1986). Recently, the class-II gene, ADH4, was mapped to 4q22 (McPhearson et al., 1989), which places all human ADH genes in a cluster on chromosome 4.

Structurally, the class-II ADH constitutes a separate enzyme (Jörnvall et al., 1987), with kinetic properties and a substrate pocket that differ largely from those of the other classes (Eklund et al., 1990). It is also the least extensively studied of the mammalian ADHs. The aim of the present study was to isolate the ADH4 gene coding for the class-II ADH and to determine its intron/exon arrangement, tsp, and upstream sequence. The results will be compared to the class-I ADH genomic structures, intron/exon arrangement, and regulatory elements.

Correspondence to: Dr. J.-O. Höög, Department of Chemistry I, Karolinska Institutet, S-104 01 Stockholm (Sweden) Tel. (46-8)7287740; Fax (46-8)338453.
* Present address: Kabi Pharmacia, Peptide Hormones, S-112 87 Stockholm (Sweden) Tel. (46-8)6958119.
Abbreviations: aa, amino acid(s); ADH, alcohol dehydrogenase; ADH1, ADH2, ADH3, ADH4 and ADH5, human genes encoding α-, β-, γ-, π- and χ-subunits of ADH, respectively; AMV, avian myeloblastosis virus; bp, base pair(s); GRE, glucocorticoid-responsive element(s); kb, kilobase or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; tsp, transcription start point(s).

RESULTS AND DISCUSSION

(a) Cloning of the human ADH4 gene
Using three probes from a previously isolated class-II cDNA clone (Höög et al., 1987), a human genomic library from fetal liver (Lawn et al., 1978) was screened. Two independent clones were isolated: λADH4:1, covering the 5'-untranslated region and part of the coding region, and λADH4:2, covering the 3' end of the coding region and the 3'-untranslated region. Clone λADH4:2 was isolated in duplicate from two separate screenings with two different probes covering the 3' part. Together, the isolated clones covered the entire ADH4 gene (Fig. 1). The insert of λADH4:1 (Fig. 1) is comprised of four EcoRI fragments, two (8.6 and 0.9 kb) within the coding region, one (5 kb) covering the promoter region, and one (2 kb) upstream from the gene. The insert of λADH4:2 (Fig. 1) is comprised of three fragments, 8.6, 3.6 and 3.5 kb, covering the coding and 3'-untranslated regions. The sum of the fragments identified from each clone is within the limits of the insert size used in the construction of the library and no clone covers the entire ADH4 gene. All EcoRI fragments from the two clones were subcloned into pEMBL8 vectors for further characterization.

(b) Introns and exons
The intron/exon structure of the ADH4 gene was determined by restriction mapping and sequence analysis of the
Fig. 1. Structure and restriction map of the human ADH4 gene. Exons are indicated by blackened boxes. The sizes of the introns are given in kb. The relative positions of λADH4:1 and λADH4:2 are shown. λADH4:1 stretches about 6 kb upstream from what is shown, while the entire λADH4:2 is presented. Restriction sites used for sequence analysis and size determination of introns are indicated. A human fetal genomic library in bacteriophage λCharon4A (Lawn et al., 1978) was screened using three restriction enzyme fragments from a class-II cDNA clone (Höög et al., 1987). A 511-bp RsaI fragment containing part of the 5'-untranslated region and the 5' end of the coding region, a 398-bp RsaI fragment from the middle part of the coding region, and a 281-bp MboII fragment from the 3'-untranslated region were purified from polyacrylamide gels by diffusion elution followed by repeated ethanol precipitation. The purified fragments were labeled with [α-32P]dATP (Amersham) by nick-translation (Maniatis et al., 1982). The library was screened, and phage DNA from purified plaques was isolated as described (Maniatis et al., 1982). Inserts were liberated by EcoRI digestion and subcloned into pEMBL8 vectors. Oligos unique to each exon were synthesized with an Applied Biosystems 381A instrument and used as probes in blot hybridization (Southern, 1975) of EcoRI digests of the recombinant phage λ DNA. 5'-end labelling of oligos and hybridizations were performed as described (Höög et al., 1987). All subclones were digested with restriction enzymes (Int. Biotechnologies Inc.) and analyzed by Southern-blot hybridizations using the exon-specific oligos after separation of fragments by 1% agarose gel electrophoresis, to estimate the intron sizes.
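The EcoRI fragment sizes reported for the two clones in section (a) lend themselves to a quick arithmetic check. The Python sketch below sums the fragments of each clone and compares the totals with the 21-kb span of the gene. The insert-size window is a placeholder assumption, since the limits of the library are not stated in the text, and all function and variable names are illustrative only.

```python
# EcoRI fragment sizes (kb) as reported in section (a) for each phage clone.
ECORI_FRAGMENTS_KB = {
    "lambda-ADH4:1": [8.6, 0.9, 5.0, 2.0],
    "lambda-ADH4:2": [8.6, 3.6, 3.5],
}

# Span of the ADH4 gene as stated in the SUMMARY.
GENE_SPAN_KB = 21.0

# ASSUMPTION: placeholder window for the library insert size (kb);
# the text only says the sums fall "within the limits" without giving them.
ASSUMED_INSERT_RANGE_KB = (15.0, 20.0)


def insert_size_kb(fragments):
    """Total insert size as the sum of its EcoRI fragments, rounded to 0.1 kb."""
    return round(sum(fragments), 1)


def within_limits(size_kb, limits=ASSUMED_INSERT_RANGE_KB):
    """True if the insert size lies inside the assumed library window."""
    low, high = limits
    return low <= size_kb <= high


def could_cover_gene(size_kb, gene_span_kb=GENE_SPAN_KB):
    """An insert shorter than the gene span cannot contain the whole gene."""
    return size_kb >= gene_span_kb


for clone, fragments in ECORI_FRAGMENTS_KB.items():
    total = insert_size_kb(fragments)
    print(clone, total, within_limits(total), could_cover_gene(total))
```

Both totals (16.5 and 15.7 kb) are below the 21 kb spanned by the gene, which agrees with the statement that no single clone covers the entire ADH4 gene.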
[Fig. 2, sequence panel: nt sequence of the human ADH4 gene with the deduced aa sequence below it. Intron sizes marked in the panel, 5' to 3': 1.8, 0.9, 2.6, 2.7, 5.5 and 3.5 kb for introns 1-6.]

Fig. 2. Nucleotide sequence of the human ADH4 gene. All nine exons, the proximal parts of the introns, and the 5'- and 3'-untranslated regions are shown. The sizes of the introns (see Fig. 1) are noted, as are position numbers. The aa sequence corresponding to the coding region is shown below the nt sequence. The putative TATA box and CAAT element are underlined. The tsp is marked as an outlined G. An arrow (nt 1920) indicates the position of the polyadenylation point of the cDNA. For positions of sequences with homology to GRE, see Fig. 4. Sequences were determined by the dideoxy method (Sanger et al., 1977) using alkali-denatured plasmids (Chen and Seeburg, 1985) and exon-specific primers from the cDNA sequence. Some restriction fragments were also subcloned into M13mp18/19 vectors for dideoxy sequence determination on single-stranded templates. [35S]dATP (Amersham) and T7 DNA polymerase (Pharmacia) were used throughout the sequence determination. This gave the complete nt sequence for both strands of the coding part, the 5'- and 3'-untranslated regions, and the beginning and end of introns, resulting in overlapping sequences. The sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under accession Nos X56411-X56419.
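The comparison of upstream sequences with the GRE consensus rests on a percent-identity score. The Python sketch below shows one way such a score could be computed and used to scan a sequence. It assumes the commonly cited 15-nt GRE consensus GGTACAnnnTGTTCT, scores only the twelve constrained positions, and scans the forward strand only. The consensus, the scoring convention and the example sites are assumptions made for illustration and are not taken from this paper, so the 70% figure in the text need not correspond to this exact convention.

```python
# ASSUMPTION: commonly cited 15-nt GRE consensus; 'N' marks unconstrained positions.
GRE_CONSENSUS = "GGTACANNNTGTTCT"


def percent_identity(site, consensus=GRE_CONSENSUS):
    """Percent of constrained consensus positions matched by an equal-length site."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    # Compare only positions where the consensus specifies a base.
    scored = [(s, c) for s, c in zip(site.upper(), consensus) if c != "N"]
    matches = sum(1 for s, c in scored if s == c)
    return 100.0 * matches / len(scored)


def scan(sequence, min_identity=70.0, consensus=GRE_CONSENSUS):
    """Slide the consensus along the forward strand and report windows at or above the cutoff."""
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        identity = percent_identity(window, consensus)
        if identity >= min_identity:
            hits.append((start, window, identity))
    return hits


# Hypothetical test sequence with one perfect site embedded at offset 4.
example = "AAAA" + "GGTACAAAATGTTCT" + "CCCC"
for start, window, identity in scan(example):
    print(start, window, round(identity, 1))
```

A complete search would also scan the reverse complement, and candidate sites would then be compared by position with the GRE locations in ADH2 to decide whether they are homologous.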