Gene, 110 (1992) 229-234 © 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00


GENE 06220

Structure of the murine lactotransferrin gene is similar to the structure of other transferrin-encoding genes and shares a putative regulatory region with the murine myeloperoxidase gene

(Granulocytic differentiation; recombinant DNA; polymerase chain reaction; intron; exon; gene expression)

Neelam V. Shirsat b, Susan Bittenbender a, Brent L. Kreider a and Giovanni Rovera a

a The Wistar Institute of Anatomy and Biology, 36th St. at Spruce, Philadelphia, PA 19104 (U.S.A.); b Tata Institute of Fundamental Research, Colaba, Bombay 400 005 (India)

Received by J.A. Engler: 7 May 1991
Revised/Accepted: 6 August 1991
Received at publishers: 1 October 1991

SUMMARY

The structure and nucleotide sequence of the murine lactotransferrin-encoding gene (LTF) were deduced partly by direct sequencing of genomic clones in the λ phage vector and partly by enzymatic amplification of genomic DNA segments primed with oligodeoxyribonucleotide primers homologous to the cDNA sequence. The λ phage clones contained the 5' half of the gene, corresponding to the first eight exons and an incomplete ninth exon interrupted by eight introns. Genomic clones corresponding to the 3' half of the LTF gene could not be obtained on repeated attempts from two different mouse genomic libraries, suggesting the possible presence of unclonable sequences in this part of the gene. Hence, PCR was used to clone the rest of the gene. Four of the presumed eight remaining introns were cloned along with their flanking exons using PCR. Comparison of the structure of the LTF gene with those of the two other known transferrin-encoding genes, the human serum transferrin-encoding gene and the chicken ovotransferrin-encoding gene, reveals that all three genes have a very similar intron-exon distribution pattern. The hypothesis that the present-day transferrin-encoding genes originated from duplication of a common ancestral gene is confirmed here at the gene level. An interesting finding is the identification of a region of shared nucleotides between the 5' flanking regions of the murine LTF and myeloperoxidase-encoding genes, two genes expressed specifically in neutrophilic granulocytes.

INTRODUCTION

Lactotransferrin (LTF), an iron-binding protein with a Mr of approx. 80 000, is present in a variety of exocrine secretions (Aisen and Listowsky, 1980). Among the cells of hemopoietic lineages, however, it is found only in the specific granules of neutrophilic granulocytes. It exerts a bacteriostatic effect on a number of bacteria, and the effect appears to be related to its ability to deprive bacteria of the iron required for their growth (Arnold et al., 1977). Patients genetically deficient in granulocyte LTF tend to have abnormal secondary granules, functional neutrophil abnormalities and increased susceptibility to infections (Boxer et al., 1982). LTF is a member of a family of iron-binding proteins that includes HTF, OTF and melanotransferrin. Amino acid alignments have shown that, in addition to the extensive homology between different transferrins, each transferrin also has a strong twofold internal homology, indicative of gene duplication from a common ancestral gene of half the size (Mazurier et al., 1983). The structures of the HTF and OTF genes are already known (Schaeffer et al., 1987; Jeltsch et al., 1987). In this paper we report the isolation and molecular characterization of the murine LTF gene.

Correspondence to: Dr. N.V. Shirsat, Molecular Biology Department, Tata Institute of Fundamental Research, Homi Bhabha Rd., Bombay 400 005 (India) Tel. (91-22) 215-2971; Fax (91-22) 215-2110.

Abbreviations: aa, amino acid(s); AMV, avian myeloblastosis virus; bp, base pair(s); G-CSF, granulocyte colony-stimulating factor; HTF, human serum transferrin; HTF gene, gene encoding HTF; IL-3, interleukin-3; kb, kilobase(s) or 1000 bp; LTF, lactotransferrin; LTF gene, gene encoding LTF; MPO, myeloperoxidase; MPO gene, gene encoding MPO; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; OTF, chicken ovotransferrin; OTF gene, gene encoding OTF; PCR, polymerase chain reaction; tsp, transcription start point(s).

EXPERIMENTAL AND DISCUSSION

(a) A cDNA library
The murine IL-3-dependent 32DCl3(G) cell line, established from mouse bone marrow (Valtieri et al., 1987), undergoes differentiation along the granulocytic lineage when grown in the presence of G-CSF. A cDNA library was constructed (Gubler and Hoffman, 1983) from RNA from 32DCl3(G) cells grown in the presence of either IL-3 or G-CSF for time intervals ranging from one to twelve days. The library was screened (Woods, 1984) with a synthetic oligo probe, 5'-TAGACAGAGTCCAAGGGCCTC, corresponding to the 5' sequence of the uterine LTF cDNA (Pentecost and Teng, 1987). A positive clone with a 566-bp insert was found to contain a 15-bp 5' noncoding region and the nt sequence corresponding to the first 183 aa residues of the murine LTF. A murine genomic DNA library constructed in the phage λ EMBL3 vector (Clontech, Palo Alto, CA) was screened with the 566-bp LTF cDNA insert. Two positive genomic clones (LTF-A and LTF-B), out of 4 × 10^5 tested, were isolated and characterized by restriction endonuclease mapping (Fig. 1). The genomic fragments were subcloned into pBluescript (Stratagene, La Jolla, CA) and sequenced by the standard dideoxy chain-termination method (Sanger et al., 1977). The clones contained only part of the LTF gene, corresponding to the first eight exons and an incomplete ninth exon, representing the first 1171 bp of the 2.4-kb LTF message (Pentecost and Teng, 1987).

Fig. 1. Restriction map of the two overlapping genomic clones LTF-A and LTF-B and the positions of the identified exons and introns of the murine LTF gene. The widths of the boxes denoting the exons in the lower line bear no relation to their sizes. The sizes of the introns are to scale. H and X represent HindIII and XhoI restriction sites, respectively. Scale bar: 1 kb.

(b) PCR cloning of the 3' half of the LTF gene
Repeated attempts to obtain genomic clones for the rest of the LTF gene from two different genomic libraries were unsuccessful. This raised the possibility that the nt sequences corresponding to the 3' half of the LTF gene were either underrepresented during the construction of the libraries or had been selectively lost due to the presence of unstable or 'poisonous' sequences (Wyman and Wertman, 1987). Hence, we decided to use PCR to amplify genomic DNA segments corresponding to the 3' half of the LTF gene. Comparison of the structure of the LTF gene so far determined with that of the HTF and OTF genes (Schaeffer et al., 1987; Jeltsch et al., 1987) indicated that the LTF gene presents an intron-exon distribution pattern similar to that of these two transferrin-encoding genes. Therefore, we cloned the 3' region of the LTF gene using PCR, assuming that the remaining exons of the LTF gene are similar in size to the corresponding exons of the HTF and OTF genes. The oligo primers were synthesized from the known sequence of the uterine LTF cDNA (Pentecost and Teng, 1987) in such a way that each pair of PCR primers would allow cloning of an adjacent pair of exons along with the intervening intron. PCR (Saiki et al., 1988) was carried out on genomic DNA from the 32DCl3(G) cell line, and the success of the amplification was determined by Southern-blot analysis of the PCR-amplified product. The blot was hybridized (Zeff and Geliebter, 1987) to a 32P-labeled diagnostic oligo (a primer corresponding to the nt sequence of either of the two exons, nested inside the PCR primers). We could successfully amplify four of the remaining eight introns along with their flanking exons.
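The primer-pairing strategy used for the 3' half of the gene can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the exon coordinates, the primer length, the toy cDNA and the function names are all hypothetical. The forward primer is taken from the sense strand at the start of one exon, and the reverse primer is the reverse complement of the end of the next exon, so that each pair brackets one intron in genomic DNA.

```python
# Sketch: derive PCR primer pairs that bracket each intron, given a cDNA
# sequence and ASSUMED exon boundaries (0-based, half-open cDNA coordinates).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pairs(cdna: str, exon_bounds, primer_len: int = 20):
    """Return one (forward, reverse) primer pair per adjacent exon pair."""
    pairs = []
    for (start_a, end_a), (start_b, end_b) in zip(exon_bounds, exon_bounds[1:]):
        if end_a - start_a < primer_len or end_b - start_b < primer_len:
            raise ValueError("exon shorter than primer length")
        # Forward primer: sense strand, start of the upstream exon.
        forward = cdna[start_a:start_a + primer_len]
        # Reverse primer: antisense strand, end of the downstream exon.
        reverse = reverse_complement(cdna[end_b - primer_len:end_b])
        pairs.append((forward, reverse))
    return pairs


# Toy data (hypothetical): three 10-nt "exons" and 5-nt primers.
toy_cdna = "AAAAACCCCCGGGGGTTTTTACGTACGTAC"
toy_exons = [(0, 10), (10, 20), (20, 30)]
toy_pairs = primer_pairs(toy_cdna, toy_exons, primer_len=5)
```

In practice the exon boundaries would have to be predicted, for example from the corresponding exons of the HTF and OTF genes as the text describes, and a product is obtained only if the intron is short enough to amplify.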

Fig. 2. The nt sequence of the murine LTF gene and the 5' flanking DNA region. The lower-case characters correspond to intron sequences, and the position of each exon is indicated above the sequence line. The nt sequence obtained from the genomic clones corresponding to the first nine exons is numbered taking the tsp determined by primer-extension analysis as +1. For the PCR-generated nt sequences, in each case the first nt of the 5' exon is numbered as 1. In cases where sequence data are missing due to extremely repetitive regions in introns, the numbering restarts with number 1 after the gap (denoted by dots). The nt sequences which share identity with known cis-acting promoter and enhancer elements are underlined and their designations are indicated above the sequence line. Abbreviations used are: APS, acute phase signal; IAP, inverted acute phase signal; Inv. Cat, inverted CCAAT element; ERE, estrogen responsive element; SBS, Sp1-binding site; SV40 ENH., SV40 enhancer element; TATA BOX, TATAA element. The locations of the two large DNA regions capable of forming Z-DNA are underlined and indicated as Z-DNA above the sequence. The nt sequence of the LTF gene which shows 82% homology with the MPO gene corresponds to nt positions -1025 to -827. The GenBank accession No. for the nt sequence data corresponding to the first nine exons and the intervening introns is M64423. The PCR-generated nt sequence data corresponding to each exon pair with its intervening intron are deposited under separate accession Nos.: M64424, M64425, M64426 and M64427.

[Fig. 2: annotated nt sequence listing (exon labels; TATA BOX, Inv. Cat, SBS, APS, IAP, ERE, SV40 ENH. and Z-DNA markers); sequence text not recoverable.]

The failure to amplify the rest of the introns could be due either to their large size or to the presence of repetitive elements which hinder successful amplification by Taq polymerase. The PCR amplification products were subcloned into pBluescript and sequenced by the standard dideoxy chain-termination method (Sanger et al., 1977). A total of 6776 bp of sequence corresponding to the first eight exons, eight introns and an incomplete ninth exon, together with 2648 bp of the 5' flanking region of the murine LTF gene, was obtained from the two genomic clones LTF-A and LTF-B (Fig. 2). About 4.2 kb of sequence corresponding to four introns and their flanking exons in the 3' half of the LTF gene was obtained from the PCR-amplified products (Fig. 2).

(c) Organization of the LTF gene
The successful amplification of most of the 3' half of the LTF gene by PCR suggests that, like the HTF and OTF genes, the murine LTF gene is organized into 17 exons interrupted by 16 introns. The sizes of the exons of the LTF gene closely match those of the two other transferrin genes. Corresponding exons from the 5' and 3' regions of the LTF gene are homologous in both size and sequence, as in the case of the OTF and HTF genes. Furthermore, the intron-exon boundaries fall at identical or almost identical aa positions in the three transferrin-encoding genes, strongly supporting the hypothesis that the present-day transferrin-encoding genes originated from duplication of a common ancestral gene of half the size. The sizes of the introns, on the other hand, differ among the three transferrin genes, resulting in wide differences in their total sizes. The total size of the HTF gene is about 33.5 kb, while that of the OTF gene is 10.5 kb (Schaeffer et al., 1987; Jeltsch et al., 1987).
The murine LTF gene region cloned from the genomic library and that cloned using PCR together have a total size of about 12 kb.

(d) The transcription start point (tsp) and the 5' flanking region
In order to determine the tsp, primer-extension analysis was performed on total RNA from 32DCl3(G) cells grown in the presence of G-CSF for nine days. A 56-bp DNA fragment was obtained on analysis of the extension product on a polyacrylamide gel (Fig. 3). The 5' noncoding region of the LTF mRNA was hence defined to be 39 bp long (56 bp minus the 17-bp primer). The 5' flanking region of the LTF gene contains two common DNA sequence elements presumably involved in the proper transcription of many eukaryotic genes, viz. the TATAA and the CCAAT elements (Efstratiadis et al., 1980; Maniatis et al., 1987). An imperfect TATAA box (GATAA) surrounded by G+C-rich sequences is located at positions -25 to -21, while an inverted CCAAT element is found at positions -66 to -70 (Fig. 2). These features correspond closely to the features of the promoter regions of both the HTF and OTF genes (Adrian et al., 1986).

Fig. 3. Localization of the tsp by primer-extension analysis. A synthetic oligo primer (5'-GGGATGAGCAGCCTCAT) complementary to the nt sequence in the 5' region of the LTF cDNA was 32P-labeled using polynucleotide kinase (Maniatis et al., 1982) and was utilized for sequencing (Sanger et al., 1977) of the 5' flanking region of the LTF gene (5-kb HindIII fragment subcloned in pBluescript). The same primer was also used for primer extension of total RNA from 32DCl3(G) cells allowed to differentiate in the presence of G-CSF for nine days (Test lane). To 25 µg of this total RNA in 10 µl annealing buffer (250 mM KCl/10 mM Tris·HCl, pH 8.3), 5 ng of labeled oligo primer was added. The mixture was heated at 80°C for 3 min, 40 units of RNasin (Promega, Madison, WI) was added and the primer and RNA were allowed to anneal at 50°C for 1 h. The extension reaction was performed at 50°C for 45 min (Geliebter et al., 1986) by adding 3.65 µl of the primer-RNA template to 1.65 µl of 2× reverse transcription buffer (48 mM Tris·HCl, pH 8.3/32 mM MgCl2/16 mM dithiothreitol/0.8 mM dATP/1.6 mM dGTP/0.8 mM dTTP/0.8 mM dCTP/200 µg per ml actinomycin D) and 10 units of AMV reverse transcriptase (Seikagaku America, St. Petersburg, FL). The reaction was terminated by addition of 2 µl of 100% formamide. The reaction products were analyzed on an 8 M urea/6% polyacrylamide DNA sequencing gel. As a control, total RNA from NIH-3T3, a murine fibroblast cell line, was extended using the same primer (Control lane). The tsp is denoted by an arrow (Test lane).
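The arithmetic behind the 39-bp 5' noncoding region can be checked with a short Python sketch. It rests on one assumption that the text implies but does not state outright: that the ATG at the start of the primer's target is the initiation codon, so the extension product consists of the 5' noncoding region plus the primer itself. The primer sequence comes from the Fig. 3 legend; the function names are invented for the example.

```python
# Sketch: infer the 5' noncoding length from a primer-extension product.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def utr_length(product_len: int, primer: str, coding_nt_before_target: int = 0) -> int:
    """5' noncoding length implied by an extension product.

    product = noncoding region + coding nt upstream of the target + primer.
    """
    return product_len - coding_nt_before_target - len(primer)


PRIMER = "GGGATGAGCAGCCTCAT"          # from the Fig. 3 legend
target = reverse_complement(PRIMER)   # mRNA-sense sequence bound by the primer

# ASSUMPTION: the ATG opening the target is the initiation codon,
# so no coding nt lie between it and the start of the target.
assert target.startswith("ATG")
noncoding = utr_length(56, PRIMER)    # 56 - 0 - 17
```

If the primer annealed further into the coding sequence, `coding_nt_before_target` would be set to the number of coding nt between the ATG and the target, and the same subtraction would apply.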
Computer-aided searches of the 5' flanking region and the first intron of the LTF gene revealed the presence of nt sequences which share identity with the following enhancer sequences: the SV40 enhancer, the Sp1 protein-binding site (Dynan and Tjian, 1985), a sequence element common to acute phase reactant genes (Adrian et al., 1986) and the ERE (estrogen responsive element) (Walker et al., 1984) (Fig. 2). The presence of these sequence elements is interesting, since LTF levels are known to increase during inflammatory conditions (Hansen et al., 1976) and since the expression of the LTF gene is also known to be stimulated by estrogen in mouse uterine tissue (Pentecost and Teng, 1987). In addition, peculiar nt sequences made up of tandem repeats of a few bp, which could possibly form Z-DNA structures, have been identified within and upstream from the LTF gene (Fig. 2). The authenticity of any of these potential regulatory elements, however, requires in vitro or in vivo expression analysis.

An interesting nt sequence homology was found on comparison of the 5' flanking regions of the murine LTF gene and the murine MPO gene (Venturelli et al., 1989). An approx. 200-bp region about 1 kb upstream from the LTF gene tsp is about 82% homologous to a sequence about 2 kb upstream from the first exon of the murine MPO gene. Multiple tsp occur upstream from exon 1 of the MPO gene (Venturelli et al., 1989). The sequence of this 5' flanking region will be reported elsewhere. MPO is an enzyme found in the primary granules of polymorphonuclear granulocytes. The expression of the MPO gene is restricted to cells of the granulocytic lineage during their early stages of differentiation. The sequence homology in the 5' flanking regions of these two myeloid differentiation stage-specific genes suggests a possible involvement of this sequence element in the regulation of granulocytic differentiation.
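A homology figure such as the 82% quoted for the LTF and MPO flanking regions can be illustrated with a simple ungapped comparison in Python. The paper does not say how the value was computed, so the ungapped sliding-window approach, the function names and the toy sequences below are all assumptions made for illustration. Under this simplification, 82% over roughly 200 bp would correspond to about 164 matching positions.

```python
# Sketch: percent identity between equal-length sequences, and the best
# ungapped placement of a query within a longer target.
# ASSUMPTION: no gaps are allowed; a real alignment program may score differently.


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_window(query: str, target: str):
    """Slide query along target; return (offset, identity) of the best window."""
    if len(query) > len(target):
        raise ValueError("query longer than target")
    best_offset, best_identity = 0, -1.0
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        identity = percent_identity(query, window)
        if identity > best_identity:  # keep the first of equally good windows
            best_offset, best_identity = offset, identity
    return best_offset, best_identity


# Toy sequences (hypothetical), not taken from the LTF or MPO genes.
toy_hit = best_ungapped_window("ACGT", "TTACGTTT")
```

Applied to the real data, the query would be the LTF region at positions -1025 to -827 and the target the MPO 5' flanking sequence, both retrieved from the database entries.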

(e) Conclusions
(1) The structure and the nt sequence of the murine LTF gene are presented here. The 5' half of the LTF gene was obtained from genomic clones. The 3' half of the gene was obtained using PCR, as no genomic clones could be obtained for this part of the gene.
(2) Comparison of the structure of the LTF gene with those of the other transferrin-encoding genes, the HTF and OTF genes, indicates that all three genes have a similar intron-exon distribution pattern.
(3) An approx. 200-bp region of high homology is found in the 5' flanking regions of the murine LTF and MPO genes, two genes expressed specifically in neutrophilic granulocytes.

REFERENCES

Adrian, G.S., Korinek, B.W., Bowman, B.H. and Yang, F.: The human transferrin gene: 5' region contains conserved sequences which match the control elements regulated by heavy metals, glucocorticoids and acute phase reaction. Gene 49 (1986) 167-175.
Aisen, P. and Listowsky, I.: Iron transport and storage proteins. Annu. Rev. Biochem. 49 (1980) 357-393.
Arnold, R.R., Cole, M.F. and McGhee, J.R.: A bactericidal effect for human lactoferrin. Science 197 (1977) 263-265.
Boxer, L.A., Coates, T.D., Haak, R.A., Wolach, J.B., Hoffstein, S. and Baehner, R.L.: Lactoferrin deficiency associated with altered granulocytic function. N. Engl. J. Med. 307 (1982) 404-410.
Dynan, W.S. and Tjian, R.: Control of eukaryotic messenger RNA synthesis by sequence-specific DNA-binding proteins. Nature 316 (1985) 774-778.
Efstratiadis, A., Posakony, J.W., Maniatis, T., Lawn, R.M., O'Connell, C., Spritz, R.A., DeRiel, J.K., Forget, B.G., Weissman, S.M., Slightom, J.L., Blechl, A.E., Smithies, O., Baralle, F.E., Shoulders, C.C. and Proudfoot, N.J.: The structure and evolution of the human β-globin gene family. Cell 21 (1980) 653-668.
Geliebter, J., Zeff, R.A., Melvold, R.W. and Nathenson, S.G.: Mitotic recombination in germ cells generated two major histocompatibility complex mutant genes shown to be identical by RNA sequence analysis. Proc. Natl. Acad. Sci. USA 83 (1986) 3371-3375.
Gubler, U. and Hoffman, B.J.: A simple and very efficient method for generating cDNA libraries. Gene 25 (1983) 263-269.
Hansen, N.E., Karle, H., Andersen, V., Malmquist, J. and Hoff, G.E.: Neutrophilic granulocytes in acute bacterial infection: sequential studies on lysozyme, myeloperoxidase and lactoferrin. Clin. Exp. Immunol. 26 (1976) 463-468.
Jeltsch, J.-M., Hen, R., Maroteaux, L., Garnier, J.-M. and Chambon, P.: Sequence of the chicken ovotransferrin gene. Nucleic Acids Res. 15 (1987) 7643-7645.
Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Maniatis, T., Goodbourn, S. and Fischer, J.A.: Regulation of inducible and tissue-specific gene expression. Science 236 (1987) 1237-1245.
Mazurier, J., Metz-Boutigue, M.-H., Jolles, J., Spik, G., Montreuil, J. and Jolles, P.: Human lactotransferrin: molecular, functional and evolutionary comparisons with human serum transferrin and hen ovotransferrin. Experientia 39 (1983) 135-141.
Pentecost, B.T. and Teng, C.T.: Lactotransferrin is the major estrogen inducible protein of mouse uterine secretions. J. Biol. Chem. 262 (1987) 10134-10139.
Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B. and Erlich, H.A.: Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239 (1988) 487-491.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Schaeffer, E., Lucero, M.A., Jeltsch, J.-M., Py, M.-C., Levin, M.J., Chambon, P., Cohen, G.N. and Zakin, M.M.: Complete structure of the human transferrin gene. Comparison with analogous chicken gene and human pseudogene. Gene 56 (1987) 109-116.
Valtieri, M., Tweardy, D.J., Caracciolo, D., Johnson, K., Mavilio, F., Altmann, S.D., Santoli, D. and Rovera, G.: Cytokine-dependent granulocytic differentiation: regulation of proliferative and differentiative responses in a murine progenitor cell line. J. Immunol. 138 (1987) 3829-3835.
Venturelli, D., Bittenbender, S. and Rovera, G.: Sequence of the murine myeloperoxidase gene. Nucleic Acids Res. 17 (1989) 7987-7988.
Walker, P., Germond, J.-E., Brown-Luedi, M., Givel, F. and Wahli, W.: Sequence homologies in the region preceding the transcription initiation site of the liver estrogen-responsive vitellogenin and apo-VLDLII genes. Nucleic Acids Res. 12 (1984) 8611-8626.
Woods, D.: Oligonucleotide screening of cDNA libraries. Focus 6 (1984) 1-2.
Wyman, A.R. and Wertman, K.F.: Host strains that alleviate underrepresentation of specific sequences. Methods Enzymol. 152 (1987) 173-179.
Zeff, R.A. and Geliebter, J.: Oligonucleotide probes for genomic DNA blots. Focus 9 (1987) 1-2.
