J. Biochem. 107, 810-816 (1990)
Molecular Cloning of a cDNA Encoding Rat NADH-Cytochrome b5 Reductase and the Corresponding Gene¹
Shuhei Zenno,*,² Masahira Hattori,* Yoshio Misumi,** Toshitsugu Yubisui,*** and Yoshiyuki Sakaki*
*Research Laboratory for Genetic Information, Kyushu University, Higashi-ku, Fukuoka, Fukuoka 812; **Department of Biochemistry, Faculty of Medicine, Fukuoka University, Jonan-ku, Fukuoka, Fukuoka 814; and ***Department of Biochemistry, Medical College of Oita, Hazama-cho, Oita 879-56
Received for publication, February 23, 1989
Rat cDNA encoding NADH-cytochrome b5 reductase (b5R) was isolated from a rat liver cDNA library using a human b5R cDNA as a probe. The cDNA was 1,905 nucleotides long, consisting of a 5'-terminal untranslated region of 38 nucleotides, an open reading frame of 903 nucleotides encoding 301 amino acid residues, a 3'-terminal untranslated region of 952 nucleotides, and a poly(A) tail. The amino acid sequence deduced from the cDNA sequence indicated that the rat b5R precursor contains only one extra amino acid (Met) residue at the N terminus, in comparison with the mature form of the enzyme, suggesting that no extra leader peptide is required for translocation of the enzyme to the microsomal membrane. Genomic DNA encoding the b5R gene was isolated from rat genomic DNA libraries. The gene was about 17 kb long and consisted of nine exons and eight introns. The junction between the membrane-binding and catalytic domains of the enzyme was found in the middle of exon 2, suggesting the possibility that the two forms of the enzyme, namely the membrane-bound and soluble forms, are generated through post-translational processing. The possible promoter region of the gene contained no TATA box but had four GC box sequences (GGGCGG and CCGCCC), representing potential binding sites for the transcription factor, SP1. The b5R gene seems to have the structural characteristics of a house-keeping gene.
Flavoenzymes catalyze a variety of reactions by transferring one or two electrons between chemically diverse donor and acceptor molecules. NADH-cytochrome b5 reductase [EC 1.6.2.2], a component of the microsomal electron transport system, is a representative flavoprotein that has flavin adenine dinucleotide (FAD) as its prosthetic group and catalyzes the reduction of cytochrome b5, which has protoheme as its prosthetic group. The enzyme contains a large hydrophilic catalytic domain and a small hydrophobic membrane-binding domain (1). The enzyme exists in two forms, a membrane-bound form composed of both the catalytic and membrane-binding domains, and a soluble form comprising only the catalytic domain. The soluble form exists in circulating erythrocytes and participates in methemoglobin reduction (2). The membrane-bound form is bound to the membranes (endoplasmic reticulum, mitochondria, nuclear, and plasma membranes) of somatic cells, and participates in the desaturation and elongation of fatty acids (3, 4), cholesterol biosynthesis (5), and drug metabolism (6). A deficiency of b5R results in methemoglobinemia. In humans, two types of hereditary methemoglobinemia have been reported (7, 8): the erythrocyte type (type I), in which b5R is deficient only in erythrocytes and which produces mild cyanosis, and the generalized type (type II), in which the enzyme is deficient not only in erythrocytes but also in somatic cells and which is associated with mental and neurological disorders.
We previously isolated the cDNA of a membrane-bound form of human b5R (9). However, the cDNA was not "full-length," so the structures of the b5R precursor protein and the corresponding gene have remained unknown. To understand the mechanism for the translocation and maturation of the enzyme, and the regulation of gene expression in a variety of tissues, we have attempted to clarify the structures of the precursor protein and the corresponding gene. In this study, we present cloning and sequence analysis data on a nearly full-length cDNA encoding rat b5R and the corresponding gene. The results show that the b5R precursor has only one extra amino acid (Met) residue at the N terminus, in comparison with the mature form of b5R, and requires no extra leader peptide for translocation to the microsomal membrane. They also suggest that the two forms of the enzyme are generated through post-translational processing of the protein. This work also demonstrated that the b5R gene has the structural characteristics of a house-keeping gene.
¹ This study was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Science and Culture of Japan.
² Present address: Basic Research Laboratories, Corporate Technical Division, Chisso Corporation, 2 Kamariya-cho, Kanazawa-ku, Yokohama, Kanagawa 236.
EXPERIMENTAL PROCEDURES
Materials—Restriction endonucleases and other enzymes were obtained from Takara Shuzo (Kyoto) and
Nippon Gene (Toyama). Radioactive reagents were obtained from Amersham (U.K.) and New England Nuclear (U.S.A.). Nitrocellulose filters (BA85) and nylon membrane filters were products of Schleicher & Schuell and Bio-Rad, respectively. Sequencing primers were obtained from Pharmacia LKB. Specific oligonucleotide primers and probes were synthesized using an automated DNA synthesizer (Applied Biosystems, model 380A).
Labeling of DNA and Oligonucleotides—DNA fragments were labeled with [α-³²P]dCTP (~3,000 Ci/mmol) by the random priming method (10). Oligonucleotides were labeled at the 5'-OH end with [γ-³²P]ATP (~5,000 Ci/mmol) and T4 polynucleotide kinase (11).
Screening of Libraries—A rat liver cDNA library previously constructed as described by Misumi et al. (12) was screened by the plaque hybridization method (11), using a fragment of human b5R cDNA, pb5R141 (9), as a probe. Hybridization was carried out at 65°C for 36 h in a mixture of 10× NET (1× NET = 0.15 M NaCl, 0.015 M Tris-HCl, pH 7.5, and 1 mM EDTA), 10× Denhardt's solution (1× Denhardt's solution = 0.02% Ficoll, 0.02% polyvinylpyrrolidone, and 0.02% bovine serum albumin), 0.1% SDS (sodium lauryl sulfate), and 100 µg/ml denatured and sonicated herring sperm DNA. The filters were washed twice in 6× SSC (1× SSC = 0.015 M sodium citrate and 0.015 M NaCl) at 55°C for 30 min. A genomic DNA library was constructed from Sau3AI partially digested rat genomic DNA using EMBL3 as a vector. A HaeIII/AluI partially digested rat genomic DNA library, constructed in the Charon4A vector, was kindly provided by Dr. N. Fujiyoshi, Kyushu University. The libraries were screened as described previously (13) using the rat b5R cDNA as a probe.
Subcloning and DNA Sequencing Analysis—The cloned DNAs in phage vectors were digested with various restriction enzymes and then subcloned into pUC13 at appropriate restriction sites.
The nucleotide sequence was determined by the dideoxy method (14) using denatured plasmid as a template (15). Data Analysis—The DNASIS package developed by Hitachi Software Engineering was used for analyses of DNA sequences, the hydropathy profile (16), and the secondary structure (17). Blot Hybridization Experiments—Southern blot analysis
Fig. 1. Schematic diagram of the structures of the rat b5R cDNA, pRb5R-L, and the human b5R cDNA, pb5R141 (9). Solid, open, and hatched boxes represent the coding, untranslated, and poly(A) regions, respectively. The direction and extent to which sequence determination of the rat cDNA was carried out are indicated by horizontal arrows at the bottom. Fragment A represents the probe used for screening the rat cDNA library, and fragments B, C, and D the probes used for screening the genomic DNA library, and for Southern and Northern blot hybridization experiments. Bl, BalI; E, EcoRI; Hc, HincII; Nc, NcoI; P, PstI; Sm, SmaI; Su, StuI; Sy, StyI; and Xh, XhoI.
was performed essentially as described by Maniatis et al. (11). Northern and dot blot analyses, using poly(A)+RNA and total RNA, were performed as described by Hayashida et al. (18). RESULTS
Cloning and Sequence Analysis of Rat Liver b5R cDNA—A rat liver cDNA library constructed according to Misumi et al. (12) was screened using a 0.4-kb EcoRI-PstI fragment of human b5R cDNA, pb5R141 (9), as a probe (Fig. 1). One positive clone was isolated from about 5 × 10⁵ plaques. The cDNA, designated as pRb5R-L, was about 1.9 kb long and had the restriction map shown in Fig. 1. The strategy used to determine the whole nucleotide sequence of the cDNA is presented in Fig. 1, and the results are summarized in Fig. 2. The cDNA was 1,905 nucleotides long and was flanked by a poly(A) tract at the 3' end. The sequence was found to encode a large open reading frame whose sequence was very similar to that of the mature form of human b5R (9). We therefore concluded that it was a rat b5R cDNA. The first ATG codon was found at nucleotide positions 1-3, and this was assumed to be the initiation codon of b5R because an in-frame terminator codon, TAG, was found at nucleotide positions from −31 to −33, and also because the sequence including this ATG codon (CCACCATGG) fulfilled the Kozak criterion (29) of CCA(or G)CCATGG for the sequence surrounding an initiation codon. The in-frame termination codon was found at nucleotide positions 904-906, and so it was concluded that the b5R precursor protein consists of 301 amino acid residues. The size of the rat b5R mRNA was estimated by Northern blot analysis. As shown in Fig. 3, rat liver b5R mRNA was about 2 kb long, i.e., a little longer than rat 18S rRNA, suggesting that the cDNA (pRb5R-L) obtained in this work had a size close to that of the full-length mRNA. The cDNA had a very long 3' untranslated region of 952 nucleotides and a poly(A) tail of 12 nucleotides. A curious repetition of "A" every two nucleotides was found in the 3' untranslated region (underlined in Fig. 2), although the significance of the sequence remains unclear.
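The length bookkeeping reported above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the constant and function names are illustrative assumptions, and the numbers are those quoted in the text (a 38-nucleotide 5' untranslated region, a 903-nucleotide open reading frame, a 952-nucleotide 3' untranslated region, and a 12-nucleotide poly(A) tail). It also tests the initiation context CCACCATGG against the CCA(or G)CCATGG pattern with a regular expression.

```python
import re

# Segment lengths (nucleotides) as quoted in the text for pRb5R-L.
# The names below are illustrative, not taken from the paper.
FIVE_PRIME_UTR = 38
ORF = 903              # coding nucleotides for the 301 residues
THREE_PRIME_UTR = 952
POLY_A = 12

# Kozak-type context quoted in the text: CCA(or G)CCATGG.
KOZAK_PATTERN = re.compile(r"CC[AG]CCATGG")


def total_length() -> int:
    """Sum of the four reported cDNA segments."""
    return FIVE_PRIME_UTR + ORF + THREE_PRIME_UTR + POLY_A


def residues_encoded(orf_nt: int) -> int:
    """Number of codons in an open reading frame of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("open reading frame length must be a multiple of 3")
    return orf_nt // 3


def matches_kozak(context: str) -> bool:
    """True if the 9-nt context fits the CC[A/G]CCATGG pattern exactly."""
    return KOZAK_PATTERN.fullmatch(context.upper()) is not None


if __name__ == "__main__":
    print(total_length())              # 1905
    print(residues_encoded(ORF))       # 301
    print(matches_kozak("CCACCATGG"))  # True
```

The sum reproduces the reported cDNA length of 1,905 nucleotides, and 903 nucleotides correspond to the 301 codons of the precursor protein.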
A polyadenylation signal (20), AATAAA (nucleotide positions 1835-1840), was present 16 nucleotides upstream of the poly(A) tail. The 5' untranslated region of the
[Fig. 2: sequence listing]
Fig. 2. Nucleotide sequence and deduced amino acid sequence of the rat b5R cDNA. The first letter of the initiation codon, ATG, is numbered 1. The deduced amino acid sequence is given under the nucleotide sequence. The stop codon of the large open reading frame and an upstream in-frame stop codon at nucleotide positions from −31 to −33 are indicated by filled circles. A potential polyadenylation signal, AATAAA, and the Kozak sequence, CCACCATGG, are boxed. The AN (N = C, G, or T) repeated sequence in the 3'-untranslated region is underlined. The positions of the eight introns are indicated by filled triangles. The sequences corresponding to the two synthetic oligonucleotide probes, E and F, used for screening the genomic libraries and the sequence complementary to the synthetic oligonucleotide primer, G, used for primer extension analysis are also underlined.
[Fig. 3: blot image]
Fig. 3. Northern blot hybridization of the rat b5R mRNA. Samples were denatured with glyoxal and dimethyl sulfoxide, electrophoresed in a 1.1% agarose gel, and then transferred to a Zeta-probe membrane (Bio-Rad). The filter was hybridized with the ³²P-labeled cDNA. HindIII-digested λ DNA was run as a size marker. Lane 1, rat liver total RNA (5 µg); lane 2, rat liver poly(A)+RNA (20 µg); and lane 3, HindIII-digested λ DNA (size marker).
cDNA was 38 bases long and contained direct repeats of the 7-base sequence, GCCACCA (nucleotide positions from −6 to +1 and from −24 to −18). On comparing the nucleotide sequences of rat and human b5R cDNA (9), the protein-coding region showed 86% homology (758/883 residues), whereas the 3'-noncoding region showed low homology, and the AN (N = C, G, or T) repetition was absent from the human cDNA (data not shown).
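The homology figure quoted above is a simple ratio of identical positions to compared positions. The sketch below is an illustration only, not the authors' procedure (they used the DNASIS package): the function names are assumptions, the two short sequences are hypothetical, and alignment gaps are not handled.

```python
def percent_from_counts(identical: int, compared: int) -> float:
    """Percent identity from a count of identical and compared positions."""
    if compared <= 0:
        raise ValueError("number of compared positions must be positive")
    return 100.0 * identical / compared


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Gaps are not treated specially in this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    identical = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return percent_from_counts(identical, len(seq_a))


if __name__ == "__main__":
    # Counts quoted in the text for the rat/human coding region.
    print(round(percent_from_counts(758, 883)))  # 86
    print(883 - 758)                             # 125 differing positions

    # Hypothetical 9-nt example with a single mismatch.
    print(percent_identity("ATGGGCGCC", "ATGGGAGCC"))
```

With the reported counts, 758 identical positions out of 883 round to 86%, leaving 125 differing positions.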
Among the 125 base changes found in the protein-coding region between rat and human b5R cDNA, 33 were found in the first letter of the codon, 12 in the second letter, and 80 in the third letter. Table I shows the codon usage for rat b5R. In comparison with the average pattern of codon usage found in rat by Ikemura (21), an apparent preference for G and C at the third letter of the codon was observed: 98 codons (33%) terminated in G, 121 (40%) in C, 49 (16%) in U, and 34 (11%) in A. Comparative Analysis of the Amino Acid Sequence of Rat b5R—The amino acid composition of the b5R precursor protein is shown in Table I, and the molecular weight of the protein was calculated to be 34,173. Alignment of the amino acid sequences of rat, steer (22), and human b5R (9, 23) is shown in Fig. 4. The homology was 89% (267/300 residues) between rat and steer, 87% (261/300 residues) between rat and man, and 91% (274/300 residues) between steer and man. Twenty-six out of 33 substitutions between rat and steer, and 29 out of 39 substitutions between rat and man, were substitutions between chemically similar amino acids (shown by thin vertical lines in Fig. 4). A striking feature of the (rat) precursor protein is that it has only one extra amino acid residue at the N-terminus, in comparison with the
TABLE I. Amino acid composition and codon usage in rat b5R mRNA predicted from the nucleotide sequence.
Amino acid   No. of residues   Codon (No. of usage)
Leu          31                UUA (0), UUG (5), CUU (2), CUC (8), CUA (1), CUG (15)
Arg          18                CGU (3), CGC (3), CGA (4), CGG (4), AGA (0), AGG (4)
Pro          25                CCU (8), CCC (8), CCA (5), CCG (4)
Gln          10                CAA (1), CAG (9)
Lys          18                AAA (4), AAG (14)
Ala          12                GCU (1), GCC (8), GCA (3), GCG (0)
Val          21                GUU (2), GUC (7), GUA (2), GUG (10)
Gly          20                GGU (0), GGC (10), GGA (4), GGG (6)
mature protein, suggesting that b5R requires no extra leader peptide for translocation to membranes (endoplasmic reticulum, mitochondria, nuclear, and plasma membranes). A potential glycine-linked myristylation site of Gly-X-X-X-Ser/Thr (24), where X represents any amino acid residue, was found at amino acid positions 2-6, suggesting that rat b5R is myristylated at the N terminus. Several amino acid residues common to all known NAD(P)H-dependent reductases were found to be conserved in the rat enzyme. These included Gly-72, Gly-137, Gly-153, and Gly-176 in the FAD- or NADH-binding domains (25), and Cys-284 at the NADH-binding site postulated by Hackett et al. (26). Since the evolutionary divergence between rat and man has been estimated to have occurred about 95 million years ago (27), about 8 million years appear to have been required for a 1% change in the amino acid sequences of the human and rat lineages. Thus, the unit evolutionary period (28) of b5R is 8, which is close to the value of 11 for cytochrome b5 (28). Isolation and Characterization of the Rat b5R Gene—A Sau3AI partially digested rat genomic DNA library was screened using the rat b5R cDNA (pRb5R-L) as a probe. Nine positive clones were isolated from approximately 1 × 10* plaques. Further analysis of the nine clones with probe B (Fig. 1) (from the 5' end of the cDNA) and probe C (Fig. 1) (from the 3' end) showed that five of the nine clones hybridized only with probe B, and three of the nine hybridized only with probe C. Only one clone (pE4 in Fig. 5) hybridized with both probes. Further hybridization analysis using oligonucleotide probe E (Fig. 2) for the 5' end of the cDNA and probe F (Fig. 2) for the 3' end showed that four
Amino acid   No. of residues   Codon (No. of usage)
Ser          18                UCU (5), UCC (5), UCA (1), UCG (1), AGU (1), AGC (5)
Thr          14                ACU (2), ACC (6), ACA (3), ACG (3)
Ile          20                AUU (5), AUC (14), AUA (1)
Asn          11                AAU (3), AAC (8)
Phe          16                UUU (5), UUC (11)
Tyr          10                UAU (2), UAC (8)
Glu          15                GAA (4), GAG (11)
Cys          4                 UGU (2), UGC (2)
His          8                 CAU (4), CAC (4)
Asp          18                GAU (4), GAC (14)
Met          9                 AUG (9)
Trp          3                 UGG (3)
of the nine hybridized with probe F but none with probe E. Among these clones, two, pE2 and pE4 (Fig. 5), were chosen for further study. To isolate phage clones possessing the 5' end of the b5R gene, the Sau3AI partially digested genomic DNA library and a HaeIII/AluI partially digested rat genomic DNA library were further screened using probe B, and ten positive clones were obtained. Two clones, pE10 and pC1 (Fig. 5), which hybridized with probe E, were chosen for further study. Restriction and blotting analyses indicated that all four clones overlapped each other. The restriction maps of the DNAs are presented in Fig. 5. Rat spleen DNA was analyzed by Southern blot hybridization using probe B. Samples digested with BamHI, HindIII, and EcoRI each gave a single band of ~15, ~7, and ~6.5 kb, respectively (data not shown). The results were consistent with the restriction maps shown in Fig. 5, suggesting that rat b5R is encoded by a single-copy gene. Nucleotide Sequence of the Rat b5R Gene—The nucleotide sequences of all the exons and their flanking regions of the rat b5R gene were determined by the dideoxy chain-termination method (13) using denatured plasmid templates (14). The sequences of the exons were completely identical with that of the cDNA (pRb5R-L). The overall structure of the gene is shown schematically in Fig. 5. The gene consists of nine exons interrupted by eight introns. The exons range in size from 73 to 1,122 bp, while the introns range from 168 bp to about 6 kb in length (Table II). The structure of the exon-intron junctions (Table II) was consistent with the consensus junction sequence (29). The highly conserved GT (the donor site)-AG (the acceptor site)
structure (30) was completely conserved in all of the introns. Since the size of the cDNA (pRb5R-L) obtained in the present study was close to that of the full-length b5R mRNA (Fig. 3), we tentatively put the promoter of the gene in the region flanking the cDNA-coding region. To determine the exact transcription initiation site(s), primer extension and S1 protection experiments were carried out, but conclusive results were not obtained. Several trials under different conditions resulted in failure, probably because the very high G + C content of the 5' end region prevented efficient primer extension and b5R mRNA-template DNA hybridization. The precise initiation site(s) remains to be determined. The nucleotide sequence of the 5' flanking region is shown in Fig. 6. It contains no obvious TATA (32) or CAAT box (32), which are common elements of a eukaryotic promoter. The region, however, contained four GC box sequences (GGGCGG and CCGCCC) and several GC box-like sequences (Fig. 6) that are potential binding sites for the transcription factor, SP1 (33). House-keeping genes generally contain GC boxes in the promoter region instead of TATA boxes (34-38). It therefore seemed reasonable to think that the b5R gene is a house-keeping gene and that the GC boxes are used as a transcriptional signal. It is interesting to note that the GCCACACCC sequence from −961 to −953 is completely identical with the sequence found in
[Fig. 4: alignment of the amino acid sequences of steer (1), rat (2), and human (3) b5R]