Gene, 93 (1990) 257-263

Elsevier GENE 03630

Human ornithine decarboxylase-encoding loci: nucleotide sequence of the expressed gene and characterization of a pseudogene

(Polyamines; chromosome; transient expression; promoters; secondary structure; recombinant DNA)

Noreen J. Hickok a, Jarmo Wahlfors b, Anne Crozat c, Maria Halmekytö b, Leena Alhonen b, Juhani Jänne b and Olli A. Jänne c

a Departments of Dermatology and Biochemistry, Thomas Jefferson University, Philadelphia, PA 19107 (U.S.A.) Tel. (215)955-2018; b Department of Biochemistry and Kuopio Cancer Research Centre, University of Kuopio, SF-70211 Kuopio (Finland) Tel. 358-71-163049; and c The Population Council and Rockefeller University, New York, NY 10021 (U.S.A.)

Received by G.N. Godson: 24 November 1989
Accepted: 27 March and 30 April, 1990

SUMMARY

Previous studies have shown that human ornithine decarboxylase (ODC)-encoding sequences map to two chromosome regions: 2pter-p23 and 7cen-qter. In the present work we have cloned the expressed human ODC gene from a genomic library of myeloma cells that overproduce ODC protein due to selective gene amplification and determined its entire nucleotide sequence. The gene comprises 12 exons and 11 introns and spans about 8 kb of chromosome 2 DNA. The organization of the human gene is very similar to that of the mouse and rat, with the major difference being the presence of longer intronic sequences in the human gene. Some of these differences can be accounted for by the insertion of four Alu sequences in the human gene. Several potential regulatory elements are present in the promoter region and in 5'-proximal introns, including a TATA box; GC boxes; AP-1-, AP-2- and NF-1-binding sites; and a cAMP-responsive element. The 5'-untranslated sequence of ODC mRNA is extremely GC-rich, and computer predictions suggest a very stable secondary structure for this region, with an overall free energy of formation of -225.4 kcal/mol. In addition to the active ODC gene on chromosome 2, ODC gene-related sequences were isolated from human chromosome 7-specific libraries and shown to represent a processed ODC pseudogene.

INTRODUCTION

Ornithine decarboxylase (ODC, EC 4.1.1.17) is the first and key regulatory enzyme in the biosynthesis of polyamines, which appear to be indispensable for animal cell growth (Pohjanpelto et al., 1985; Pegg, 1986). In general, ODC protein concentration and the catalytic enzyme activity correlate with the growth state of the cell and change rapidly upon exposure to trophic stimuli, such as growth factors, steroids, cAMP-elevating agents, hormones, phorbol esters and insulin (Tabor and Tabor, 1984; Pegg, 1986; 1988). The enzyme activity seems to be tightly regulated at multiple levels, including the rate of transcription; stability of mRNA; translational efficiency of mRNA; and stability and posttranslational modifications of the enzyme protein (Tabor and Tabor, 1984; Pegg, 1986). A critical role of ODC activity in determining cellular growth rate is implied by the fact that more than one of these mechanisms may operate in response to any given stimulus.

To elucidate the mechanisms governing the regulation of ODC gene expression, we have previously reported the cloning and nt sequence of a human ODC cDNA (Hickok et al., 1987). This cDNA was used to identify two ODC gene loci in the human genome, which mapped to chromosome regions 2pter-p23 and 7cen-qter (Winqvist et al., 1986). The purpose of the present study was the determination of the entire sequence of the expressed human ODC gene isolated from a myeloma cell line containing an amplified ODC gene on chromosome 2 and characterization of the second ODC locus on chromosome 7.

Correspondence to: Dr. O.A. Jänne, The Population Council, 1230 York Ave., New York, NY 10021 (U.S.A.) Tel. (212)570-8722; Fax (212)570-7678. Noreen J. Hickok and Jarmo Wahlfors should be considered authors of equal contribution.

0378-1119/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

Abbreviations: bp, base pair(s); cAMP, cyclic AMP; CHO, Chinese hamster ovary; CREB, cAMP-responsive element-binding protein; DMEM, Dulbecco's minimal essential medium; ds, double-stranded; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); ODC, ornithine decarboxylase; ODC, gene (DNA) encoding ODC; oligo, oligodeoxyribonucleotide; tsp, transcription start point(s).

RESULTS AND DISCUSSION

(a) Isolation of the human ODC genes

The genomic ODC clones that we characterized originated from a human myeloma cell DNA library with an amplified ODC gene (Leinonen et al., 1987; Hölttä et al., 1989), or from two human chromosome 7-specific libraries. Initial restriction mapping and hybridization with the human ODC cDNA (pODC10/2H) indicated that the myeloma cell-derived clones contained intronic sequences, whereas the hybridizable fragments in chromosome 7 libraries were similar in size to those of ODC cDNA. The fact that the pgODC/23 clone originating from the myeloma cell library was indeed an active ODC gene and included its promoter sequences was confirmed by transient expression studies. ODC-deficient CHO cells transfected with pgODC/23 DNA had enzyme activities of 89 ± 11.8 (S.D.) pmol/10^6 cells (n = 3), while the comparative activities in cells transfected with herring sperm DNA were 1.9 ± 0.15 pmol/10^6 cells. Partial restriction maps of the two genomic ODC gene sequences and that of pODC10/2H are depicted in Fig. 1.

The human ODC cDNA clone that we had previously sequenced (Hickok et al., 1987) contained 87 nt of the 5'-untranslated region, while primer extension studies indicated that the ODC mRNA cap site resided approximately 250 nt upstream of the end of the cDNA (Hickok et al., 1987). To obtain additional sequence information for the 5'-untranslated region, direct sequencing of myeloma cell RNA was carried out with a primer adjacent to the 5' end of pODC10/2H cDNA. The sequencing of 120 additional nucleotides of the 5'-untranslated region permitted identification of the first two exons of the ODC gene (Fig. 2).


Fig. 1. Partial restriction maps for the active ODC gene (a), ODC cDNA [pODC10/2H, (b)] and the ODC pseudogene (c). Exons and introns in the ODC gene are identified by open boxes and solid lines, respectively. The dotted line at the 5' end of ODC cDNA corresponds to the sequence not present in pODC10/2H. Clones corresponding to the expressed gene were isolated from a myeloma cell genomic library containing amplified ODC sequences (Halmekytö et al., 1989; Hölttä et al., 1989), while the pseudogene clones originated from EcoRI- or HindIII-digested chromosome 7-specific DNA libraries obtained from American Type Culture Collection (Rockville, MD). Positive clones were plaque purified and subcloned into the BamHI site of pBR322 (the active ODC gene, pgODC/23) or the EcoRI and HindIII sites of pGEM-3Z (pseudogene clones).
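A restriction map such as those in Fig. 1 amounts to a list of recognition-site positions along a sequence. The following is a minimal Python sketch of that idea, assuming a linear molecule and a complete digest. The recognition sites are the standard ones for EcoRI, HindIII and BamHI; the function names and the toy sequence are hypothetical and are not taken from the paper.

```python
# Recognition site and cut offset (bases from the start of the site).
# The offsets reflect the standard cut positions G^AATTC, A^AGCTT and G^GATCC.
SITES = {
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "BamHI": ("GGATCC", 1),
}


def cut_positions(seq, enzyme):
    """Return the 0-based positions at which the top strand is cut."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, enzyme):
    """Fragment lengths from a complete digest of a linear molecule."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence with two EcoRI sites and no HindIII site.
toy = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
print(fragment_lengths(toy, "EcoRI"))    # [5, 10, 7]
print(fragment_lengths(toy, "HindIII"))  # [22]
```

An enzyme that does not cut returns the whole molecule as a single fragment, which is why the HindIII digest of the toy sequence gives one length.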

Fig. 2. The nt sequence of the expressed ODC gene in the human genome. Exonic and intronic sequences are specified by capital and lower case letters, respectively. The cap site of the mRNA (also tsp) is identified by nt +1. The location of four Alu repeats is underlined. The shotgun sequencing method (Bankier and Barrell, 1983) was used to determine the nt sequence of the ThaI-SalI fragment of λgODC/H2. The missing sequence (about 700 nt from tsp) was obtained by sequencing the fragment extending upstream of the 5' EcoRI site to the KpnI site (Hölttä et al., 1989). Both strands of the gene were sequenced throughout.

[Fig. 2 sequence listing: illegible in this copy. The sequence was submitted to the GenBank/EMBL Data Bank under accession number M33764 (see Acknowledgements).]
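The case convention of Fig. 2 (capital letters for exons, lower case for introns) makes the exon-intron organization machine-readable, and the gt-ag rule can be checked directly on such a listing. The sketch below is illustrative only. The function names and the toy sequence are hypothetical, and lower-case runs at either end of the sequence are assumed to be flanking DNA rather than introns.

```python
import re


def split_segments(seq):
    """Split a mixed-case sequence into (kind, start, text) runs.

    Upper-case runs are labelled 'exon', lower-case runs 'intron';
    start is the 0-based offset of the run in the input.
    """
    segments = []
    for match in re.finditer(r"[A-Z]+|[a-z]+", seq):
        kind = "exon" if match.group().isupper() else "intron"
        segments.append((kind, match.start(), match.group()))
    return segments


def check_gt_ag(seq):
    """Return (start, obeys_rule) for every intron flanked by two exons."""
    segments = split_segments(seq)
    results = []
    for index, (kind, start, text) in enumerate(segments):
        # Lower-case runs at the very ends are treated as flanking DNA.
        if kind == "intron" and 0 < index < len(segments) - 1:
            obeys = text.startswith("gt") and text.endswith("ag")
            results.append((start, obeys))
    return results


# Hypothetical toy gene: flank, exon, intron (gt...ag), exon,
# intron (gc...ag, violating the rule), exon, flank.
toy = "ccgc" + "ACGT" + "gtaaccag" + "GGAT" + "gcttcag" + "TTAA" + "aata"
print(check_gt_ag(toy))  # [(8, True), (20, False)]
```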

(b) Structure and sequence of the active ODC gene

The entire nt sequence of the expressed human ODC gene is shown in Fig. 2. The gene is 7951 nt long from tsp to the poly(A) addition site. It comprises 12 exons and 11 introns, with the translation initiation codon residing in exon 3. The intron-exon junctions all conform to the gt-ag rule (Aebi et al., 1986), and the positions of the exons with respect to the protein-coding sequence are identical with those in the mouse and rat ODC genes (Katz and Kahana, 1988; Wen et al., 1989). However, the introns of the human gene are consistently longer than those present in the rodent genes, with the differences ranging from a few nt up to 1 kb in the case of intron 1. The differences in the intron sizes are largely accounted for by the insertion of four Alu repeats into the human gene, which show the highest similarity to the 'old' subclass (Deininger and Slagel, 1988). Two of these Alu sequences are located in intron 1, one in intron 8 and one in intron 11 (Fig. 2). In contrast to the high degree of homology between the human and mouse ODC cDNA sequences (Hickok et al., 1987), the intronic sequences of the two ODC genes are less well conserved. There are, however, selected intronic regions that exhibit nt sequence identity of over 60%. In the human gene these include: intron 1, nt 890-1216, 1366-1642 and 2696-2860; intron 4, nt 3973-4023; intron 6, nt 4630-4679; intron 7, nt 4902-4973; and intron 8, nt 6345-6403. It is of interest that, apart from the two Alu repeats in the human gene, intron 1, which contains some potential cis-regulatory elements (see below), is the most conserved among these intronic regions.

The tsp for the human ODC gene was determined by S1 nuclease analysis using two overlapping oligos (Fig. 3). The results indicated that ODC mRNA transcription is initiated at a C or a T residing 341 or 340 nt, respectively, upstream of the translation start site (AUG). An identical location for this site was determined by primer extension studies. The assignment of the tsp by these experiments is in close agreement with our previous estimate for the length (335 nt) of the 5'-untranslated region of the ODC mRNA (Hickok et al., 1987). A consensus TATA sequence is present 28 nt 5' of tsp. No CCAAT sequence is present, although the CCGAT sequence commencing at nt -79 may serve this purpose. Multiple (and sometimes overlapping) potential Sp1-recognition sequences are found in the proximal 5'-flanking sequence, the first exon and the first intron (Table I). The contribution of these sequences to the promoter activity of the ODC gene is currently being evaluated.

Fig. 3. Identification of the 5' terminus of the ODC mRNA by S1 nuclease analysis. The end-labeled oligos S1-1 (nt -21 to +26) and S1-3 (nt -7 to +35) were hybridized with poly(A)+ RNA from human myeloma cells (Williams and Mason, 1985), treated with S1 nuclease (Hall et al., 1989), and the protected fragments analyzed by electrophoresis on 5% polyacrylamide-8 M urea gels. Ladders of the two oligos (reading the opposite strands) are shown adjacent to lanes 1-3 corresponding to the S1 nuclease-treated samples. Lanes: 1, human myeloma cell RNA; 2, undigested oligomer; 3, tRNA. The double bands identified by arrowheads in lanes 1 specify the nt of tsp.

TABLE I
Location of recognition sequences for transcription factors in the human ODC gene

Transcription factor a [consensus sequence] | Sequence in the ODC gene | Location b
Sp1 [GGGCGG] | GGGCGG | -297; -271; -262; -241; -232; -213; -111 (5'-fl); +37; +94; +126 (Ex 1); +254; +265 (I 1)
CREB [TGACGTCA] | TGACGTCG | -172 (5'-fl)
AP-1 [TGACTCA] | CGACTCA | +151 (Ex 1)
AP-1 [TGACTCA] | TGACTTA | +761 (I 1)
AP-1 [TGACTCA] | TGACTCG | +793 (I 1)
AP-1 [TGACTCA] | TGAGTCA | +3656 (I 3); +4362 (I 5); +7425 (I 11)
AP-1 [TGACTCA] | TGAATCA | +5285 (I 8)
AP-2 [CCCCAGGC] | CCCCCGGC | -103 (5'-fl)
AP-2 [CCCCAGGC] | GCCTGCGG c | +314 (I 1)
AP-2 [CCCCAGGC] | CCCCAGGC | +1533 (I 1)
NF-1 [GCCAAT] | ATTGGC c | +1197 (I 1)
OCT-1/OCT-2 [ATTTGCAT] | ATTTGCAT | +3260 (I 2)
Estrogen receptor [GGTCANNNTGACC] | GGTCACACTTATC | +7753 (Ex 12)

a Adapted from Mitchell and Tjian (1989). N, any nucleoside.
b The number refers to the first nt of the sequence. 5'-fl, 5'-flanking region; Ex, exon; I, intron.
c The recognition sequence is on the opposite strand.

A putative cAMP response element binding to the CREB protein, identical with that present in the c-fos gene (see Mitchell and Tjian, 1989), is found starting at nt -172. The expression of the human ODC gene has been reported to be regulated by phorbol esters at the transcriptional level (Hsieh and Verma, 1989); a total of three perfect AP-1-binding sequences are present in introns 3, 5 and 11 of the human gene. Moreover, four AP-1-binding site-like sequences with one nucleotide mismatch are located in exon 1 (1 site), intron 1 (2 sites) and intron 8 (1 site). Again, whether these and other putative transcription factor-binding sites are used to regulate ODC gene expression in vivo remains to be elucidated by direct experiments.

When the 3'-flanking region of the human gene was examined, an additional potential polyadenylation signal AATAAA was found at approximately the same position as in the mouse ODC gene (Hickok et al., 1986; Katz and Kahana, 1988). Although we have never detected a longer ODC mRNA in human cells or tissues, the presence of this distal AATAAA sequence raises the possibility that, as in rodents, the human ODC gene may code for two mRNA species.
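Locating candidate recognition sequences of the kind listed in Table I is a pattern search on both strands: a site on the opposite strand appears in the listed strand as the reverse complement of the consensus (TGAGTCA for the AP-1 consensus TGACTCA, ATTGGC for the NF-1 consensus GCCAAT). A minimal Python sketch follows; it is not the procedure the authors used. The function names and the toy sequence are hypothetical, only the wildcard N is supported, and positions are 0-based offsets rather than the tsp-relative numbering of the paper.

```python
import re

# Consensus letters mapped to regular-expression fragments (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T and N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, consensus):
    """Return sorted (0-based start, strand) pairs for exact consensus matches.

    '+' means the consensus itself occurs in seq; '-' means its reverse
    complement does, i.e. the site lies on the opposite strand.
    """
    seq = seq.upper()
    strands = {consensus.upper(): "+"}
    # A palindromic consensus is reported once, on the '+' strand.
    strands.setdefault(reverse_complement(consensus), "-")
    hits = []
    for motif, strand in strands.items():
        # The lookahead lets overlapping sites be reported as well.
        pattern = "(?=" + "".join(IUPAC[base] for base in motif) + ")"
        hits.extend((m.start(), strand) for m in re.finditer(pattern, seq))
    return sorted(hits)


# Hypothetical toy sequence carrying TGAGTCA, ATTGGC and TGACTCA.
toy = "AATGAGTCATTGGCTTGACTCA"
print(find_motif(toy, "TGACTCA"))  # [(2, '-'), (15, '+')]
print(find_motif(toy, "GCCAAT"))   # [(8, '-')]
```

One-mismatch sites such as those reported for AP-1 in exon 1, intron 1 and intron 8 would need an approximate comparison (for example a Hamming-distance check over each window) rather than the exact match used here.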

(c) The 5'-leader sequence of ODC mRNA

Human ODC mRNA has a very GC-rich 5'-untranslated region (Fig. 2). The FOLD program predicted that this region forms a complex stem-loop structure with an overall free energy of formation of -225.4 kcal/mol (Fig. 4). The extensive secondary structure suggests that translation of the mRNA is very inefficient, as only the cap site and its adjacent nt are available for binding to the 40S ribosomal subunit. In other systems, ds regions within a few nt of the cap site have completely inhibited translation (see Kozak, 1989). The AUG start codon of the human ODC mRNA is also contained within the stem-loop structure (Fig. 4), possibly limiting its availability for the initiation of translation.

Fig. 4. Predicted secondary structure for the human ODC mRNA. The entire 5'-leader sequence of the mRNA was subjected to secondary structure analysis using the FOLD program (Zuker and Stiegler, 1981). The location of the translation initiation codon AUG is indicated. Small dots mark every tenth nt and a square indicates every 50th nt. The calculated free energy of the entire structure is -225.4 kcal/mol.

The 5'-untranslated sequence of the human ODC mRNA is very similar to that of the mouse and the rat, with an overall sequence identity of about 70%. The extent of identity is, however, less than that between the protein-coding regions of the respective mRNAs (Hickok et al., 1987; Wen et al., 1989). As with the human mRNA, computer analyses of the 5'-untranslated sequences of rodent ODC mRNAs have predicted the presence of an extensive secondary structure for these regions (Katz and Kahana, 1988; Coffino, 1988; Wen et al., 1989). Interestingly, the secondary structure of the 5'-leader sequence of the human mRNA appears to be even more complex and stable than those of the rodent mRNAs.
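Two quantities behind this discussion, GC content and the capacity of a leader to base-pair with itself, can be illustrated with a short Python sketch. It does not reproduce the FOLD calculation: FOLD (Zuker and Stiegler, 1981) minimizes free energy, whereas the sketch below substitutes the much simpler Nussinov-style dynamic programme, which only maximizes the number of nested base pairs and yields no energy value. The function names, the minimum loop size of 3 nt and the toy sequences are assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C residues in a DNA or RNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style recursion).

    min_loop is the minimum number of unpaired nt enclosed by a pair.
    This is a pair-counting toy model, not an energy minimization.
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # case: j is unpaired
            for k in range(i, j - min_loop):    # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# Hypothetical hairpin: three pairs closing a three-nt loop.
print(max_base_pairs("GGGAAAUCC"))  # 3
print(gc_fraction("GGGAAAUCC"))     # 5 of 9 residues are G or C
```

The cubic running time of this recursion is acceptable for a leader of a few hundred nt, but a realistic stability estimate requires the stacking and loop energy parameters used by energy-minimizing programs.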

(d) ODC locus on chromosome 7

Two chromosome 7-specific libraries, containing inserts from a complete EcoRI or HindIII digestion of chromosome 7 DNA, were screened with pODC10/2H as the hybridization probe. Of several positive clones isolated from each library, only one size insert was identified, i.e., a 2.2-kb EcoRI and a 4.4-kb HindIII fragment, respectively. This suggests, but does not prove, that there is only one copy of these sequences on chromosome 7. By restriction mapping and Southern blotting, it was tentatively concluded that these sequences were related to the ODC gene, but most likely corresponded to a processed pseudogene. Although the clones have not been completely sequenced, their partial nt sequence reveals the lack of introns and the presence of a number of mutations, insertions, and deletions. As illustrated by the partial restriction maps in Fig. 1, some of the restriction enzyme cleavage sites are maintained and some deleted in comparison to the active ODC gene or ODC cDNA. The nt sequences covering exon-intron junctions 1, 2, 3, 6, 7, 8, 10 and 11 have been determined for the pseudogene; in all but one case, the splicing has occurred correctly and intronic sequences have been deleted. The only exception is the intron 2-exon 3 junction in which the pseudogene contains a 15-bp insertion (5'-TTTGTTGTGTTTCAG-3') that has only a 2-nt mismatch with the sequence at the 3'-end of intron 2. This may indicate that the pseudogene locus has been inserted into the human genome through reverse transcription of a partially spliced nuclear precursor rather than a mature ODC mRNA. It is also of note that the pseudogene sequence does not contain nt corresponding to a poly(A) tail at the 3' end. Even if the pseudogene locus were transcribed, this ODC gene-related sequence is not able to encode a catalytically active enzyme, since there is a 1-nt deletion in the second codon after the conserved translation initiation codon AUG and two insertions (1 nt and 3 nt) within the first 50 nt downstream from the AUG. These mutations cause several frameshifts and create stop codons in all reading frames for the putative protein-coding sequence.
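The effect of a 1-nt deletion on a reading frame can be shown with a short Python sketch. The sequence below is a hypothetical toy example, not the pseudogene sequence, and the function names are illustrative only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stops_by_frame(seq):
    """Map each forward reading frame (0, 1, 2) to its stop-codon offsets."""
    seq = seq.upper()
    return {
        frame: [
            i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        ]
        for frame in range(3)
    }


def delete_nt(seq, position):
    """Return seq with the nt at the 0-based position removed."""
    return seq[:position] + seq[position + 1:]


# Hypothetical open reading frame: ATG GCA TTG AAC GCT (no stop in frame 0).
intact = "ATGGCATTGAACGCT"
# Deleting the first nt of the second codon shifts the frame:
# ATG CAT TGA ACG ... now reads a TGA stop in frame 0.
mutant = delete_nt(intact, 3)

print(stops_by_frame(intact)[0])  # []
print(stops_by_frame(mutant)[0])  # [6]
```

A 3-nt insertion, by contrast, preserves the frame downstream of the insertion point, so it is the combination of 1-nt changes that determines which frame is read at any given position.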

(e) Conclusions

Determination of the entire nt sequence of the expressed human ODC gene has allowed identification of several sequence elements that may be involved in both transcriptional and translational regulation. In particular, the 5'-untranslated region of the ODC mRNA has the potential to form an extremely stable stem-loop structure. Furthermore, the 5'-flanking region and the first several introns contain many potential cis-regulatory motifs; these features are similar to those observed in the rodent ODC genes. Finally, we have characterized the ODC gene-related locus on human chromosome 7 and shown that it is a processed pseudogene with a number of genetic lesions.

ADDENDUM

After submission of the present paper, we became aware of three reports describing sequence analysis of the expressed human ODC gene (Fitzgerald and Flanagan, 1989; Van Steeg et al., 1989; Moshier et al., 1990). The data in these articles are very similar to those in the present paper, with minor differences in the reported nt sequences for the active ODC gene. We have assigned tsp 3 nt closer to the TATA box than Moshier et al. (1990). The sequences reported by Fitzgerald and Flanagan (1989) and Moshier et al. (1990) correspond to an allele different from ours, as judged by the presence of the polymorphic PstI site (Hickok et al., 1987) in the first intron of their sequences.

ACKNOWLEDGEMENTS

This work was supported by grants from the National Institutes of Health (HD-13541), the Finnish Academy of Sciences, the Finnish Cancer Society and the Sigrid Juselius Foundation. We would like to thank Dr. Erkki Hölttä for ODC-deficient CHO cells and Cecilia Liu, Peter Kuhn and Riitta Sinervirta for technical assistance. The nucleotide sequence reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number M33764.

REFERENCES

Aebi, M., Hornig, H., Padgett, R.A., Reiser, J. and Weissmann, C.: Sequence requirements for splicing of higher eukaryotic nuclear pre-mRNA. Cell 47 (1986) 555-565.
Bankier, A.T. and Barrell, B.G.: Shotgun DNA sequencing. In Flavell, R.A. (Ed.), Techniques in Life Sciences, B5: Nucleic Acid Biochemistry. Elsevier, Limerick, 1983, pp. 1-34.
Coffino, P.: Probable cloning artefacts previously interpreted as unusual leader sequences of rodent ornithine decarboxylase mRNAs - a cautionary tale. Gene 69 (1988) 365-368.
Deininger, P.L. and Slagel, V.K.: Recently amplified Alu family members share a common parental Alu sequence. Mol. Cell. Biol. 8 (1988) 4566-4569.
Fitzgerald, M.C. and Flanagan, M.A.: Characterization and sequence analysis of the human ornithine decarboxylase gene. DNA 8 (1989) 623-634.
Hall, D., Jones, S.D., Kaplan, D.R., Whitman, M., Rollins, B.J. and Stiles, C.D.: Evidence for a novel signal transduction pathway activated by platelet-derived growth factor and by double-stranded RNA. Mol. Cell. Biol. 9 (1989) 1705-1713.
Halmekytö, M., Hirvonen, A., Wahlfors, J., Alhonen, L. and Jänne, J.: Methylation of human ornithine decarboxylase gene before transfection abolishes its transient expression in Chinese hamster ovary cells. Biochem. Biophys. Res. Commun. 162 (1989) 528-534.
Hickok, N.J., Seppänen, P.J., Gunsalus, G.L. and Jänne, O.A.: Complete amino acid sequence of human ornithine decarboxylase deduced from complementary DNA. DNA 6 (1987) 179-187.
Hickok, N.J., Seppänen, P.J., Kontula, K.K., Jänne, P.A., Bardin, C.W. and Jänne, O.A.: Two ornithine decarboxylase mRNA species in mouse kidney arise from size heterogeneity at their 3' termini. Proc. Natl. Acad. Sci. USA 83 (1986) 594-598.
Hölttä, E., Hirvonen, A., Wahlfors, J., Alhonen, L., Jänne, J. and Kallio, A.: Human ornithine decarboxylase (ODC)-encoding gene: cloning and expression in ODC-deficient CHO cells. Gene 83 (1989) 125-135.
Hsieh, J.T. and Verma, A.K.: Lack of a role of DNA methylation in tumor promoter 12-O-tetradecanoylphorbol-13-acetate-induced synthesis of ornithine decarboxylase messenger RNA in T24 cells. Cancer Res. 49 (1989) 4251-4257.
Katz, A. and Kahana, C.: Isolation and characterization of the mouse ornithine decarboxylase gene. J. Biol. Chem. 263 (1988) 7604-7609.
Kozak, M.: Circumstances and mechanisms of inhibition of translation by secondary structure in eucaryotic mRNAs. Mol. Cell. Biol. 9 (1989) 5134-5142.
Leinonen, P., Alhonen-Hongisto, L., Laine, R., Jänne, O.A. and Jänne, J.: Human myeloma cells acquire resistance to difluoromethylornithine by amplification of ornithine decarboxylase gene. Biochem. J. 242 (1987) 199-203.
Mitchell, P.J. and Tjian, R.: Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245 (1989) 371-378.
Moshier, J.A., Gilbert, J.D., Skunca, M., Dosescu, J., Almodovar, K.M. and Luk, G.D.: Isolation and expression of a human ornithine decarboxylase gene. J. Biol. Chem. 265 (1990) 4884-4892.
Pegg, A.E.: Recent advances in the biochemistry of polyamines in eukaryotes. Biochem. J. 234 (1986) 249-262.
Pegg, A.E.: Polyamine metabolism and its importance in neoplastic growth and as a target for chemotherapy. Cancer Res. 48 (1988) 759-774.
Pohjanpelto, P., Hölttä, E. and Jänne, O.A.: Mutant strain of Chinese hamster ovary cells with no detectable ornithine decarboxylase activity. Mol. Cell. Biol. 5 (1985) 1385-1390.
Tabor, C.W. and Tabor, H.: Polyamines. Annu. Rev. Biochem. 53 (1984) 749-790.
Van Steeg, H., Van Oostrom, C.T.M., Martens, J.W.M., Van Kreyl, C.F., Schepens, J. and Wieringa, B.: Nucleotide sequence of the human ornithine decarboxylase gene. Nucleic Acids Res. 17 (1989) 8855-8856.
Wen, L., Huang, J.-K. and Blackshear, P.J.: Rat ornithine decarboxylase gene. Nucleotide sequence, potential regulatory elements, and comparison to the mouse gene. J. Biol. Chem. 264 (1989) 9016-9021.
Williams, J.G. and Mason, P.J.: Hybridisation in the analysis of RNA. In Hames, B.D. and Higgins, S.J. (Eds.), Nucleic Acid Hybridisation: A Practical Approach. IRL Press, Oxford, 1985, pp. 139-160.
Winqvist, R., Mäkelä, T.P., Seppänen, P., Jänne, J., Alhonen-Hongisto, L., Jänne, O.A., Grzeschik, K.-H. and Alitalo, K.: Human ornithine decarboxylase sequences map to chromosome regions 2pter-p23 and 7cen-qter, but are not coamplified with the NMYC oncogene. Cytogenet. Cell Genet. 42 (1986) 133-140.
Zuker, M. and Stiegler, P.: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9 (1981) 133-148.
