J. Mol. Biol. (1990) 213, 399-405

Structure and Complete Nucleotide Sequence of the Gene Encoding Rat Fibroblast Tropomyosin 4

James P. Lees-Miller, Albert Yan and David M. Helfman*

Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, U.S.A.

(Received 3 October 1989; accepted 8 January 1990)

We have isolated and determined the complete nucleotide sequence of the gene that encodes the 248 amino acid residue fibroblast tropomyosin, TM-4. The TM-4 sequence is encoded by eight exons, which span approximately 16,000 bases. The positions of the intron-exon splice junctions relative to the final transcript are identical to those present in other vertebrate tropomyosin genes and the Drosophila melanogaster TMII gene. We have found no evidence that the rat TM-4 gene is alternatively spliced, unlike all the other tropomyosin genes from multicellular organisms that have been described. Typical vertebrate tropomyosin genes contain some, or all, of alternatively spliced exons 1a and 1b, 2a and 2b, 6a and 6b, and 9a, 9b, 9c and 9d in addition to common exons 3, 4, 5, 7 and 8. The rat fibroblast TM-4 mRNA is encoded by sequences most similar to exons 1b, 3, 4, 5, 6b, 7, 8 and 9d. Two exon-like sequences that are highly similar to alternatively spliced exons 2b and 9a of the rat β-tropomyosin gene and the human TMnm gene have been located in the appropriate region of the gene encoding rat fibroblast TM-4. However, several mutations in these sequences render them non-functional as tropomyosin coding exons. We have termed these exon-like sequences vestigial exons. The evolutionary relationship of the rat TM-4 gene to other vertebrate tropomyosin genes is discussed.

* Author to whom all correspondence should be addressed. 0022-2836/90/110399-07 $03.00/0 © 1990 Academic Press Limited

Vertebrate fibroblasts in culture possess a complex mixture of tropomyosin isoforms. In rat embryo fibroblasts, at least five isoforms were identified. They were named according to descending apparent Mr on sodium dodecyl sulfate/polyacrylamide gels as TM-1 (40,000), TM-2 (36,500), TM-3 (35,000), TM-4 (32,400) and TM-5 (32,000) (Matsumura et al., 1983a,b). Cloning of cDNAs demonstrated that the relatively high Mr tropomyosins TM-1, TM-2 and TM-3 are 284 amino acid residues in length (Yamawaki-Kataoka & Helfman, 1985; Takenaga et al., 1988; Lin & Leavitt, 1988), while the low Mr isoforms TM-4, TM-5a and TM-5b are 248 residues in length (Yamawaki-Kataoka & Helfman, 1987; L. O. Goodwin et al., unpublished results). Amongst all the tropomyosins for which the amino acid sequence is known, TM-4 is most closely related to the β isoform of equine platelet tropomyosin and to human TM30pl (Lewis et al., 1983; MacLeod et al., 1987; Yamawaki-Kataoka & Helfman, 1987). The physical properties of the equine platelet β isoform have been extensively characterized (Cote & Smillie, 1981b). Relative to the 284 amino acid residue tropomyosins found in striated muscle, smooth muscle and fibroblasts, equine platelet tropomyosin has a limited ability to polymerize in a head-to-tail manner, and requires increased levels of magnesium ion in order to bind to filamentous actin (Cote et al., 1978; Cote & Smillie, 1981a). Matsumura & Yamashiro-Matsumura (1985) found that a mixture of TM-4 and TM-5 had actin binding and head-to-tail polymerization properties similar to equine platelet tropomyosin.

In striated muscles, tropomyosin interacts with the calcium ion sensitive protein troponin, and participates in the regulation of actomyosin ATPase activity (for a review, see Adelstein & Eisenberg, 1980). In smooth muscle and non-muscle cells, the function of both the high Mr and the low Mr tropomyosin isoforms remains to be determined. Tropomyosin antibodies have been found to reduce organelle movement in chicken fibroblasts (Hegmann et al., 1989). Lin et al. (1988) found that antibodies directed to a low Mr chicken tropomyosin isoform bound to stress fibers and membrane ruffles, while antibodies directed to a high Mr isoform stained only stress fibers. The

[Fig. 1. Upper panel: restriction map (sites H, R, B, K) and exon layout of the rat TM-4 gene (VE 2b, 1b, 3, 4, 5, 6b, 7, 8, VE 9a, 9d) on a scale of 0 to 18 kb. Lower panel: abridged nucleotide sequence, numbered from 1000 to 17100, with exons (Ex 1b, 3, 4, 5, 6b, 7, 8, 9d) and their encoded amino acids, vestigial exons (VE 2b, VE 9a), simple repeats (SR 1, SR 2, SR 3) and the direct repeat (Dir R) marked.]

concept that specific actin-based cellular structures may require different combinations of tropomyosin isoforms with other actin-binding proteins was supported by the finding that high Mr fibroblast tropomyosin alone will protect actin filaments in vitro from severing by gelsolin, while TM-4 and TM-5 require the presence of caldesmon, a calmodulin and actin-binding protein, in order to significantly block gelsolin severing activity (Ishikawa et al., 1989).

Further evidence of the role of tropomyosins in the organization of actin filaments has come from studies of transformed fibroblasts. When fibroblasts in culture are transformed by DNA or RNA tumor viruses, ultraviolet irradiation or chemical carcinogens, their resulting shape change is correlated with a decrease in expression of one or more of the high Mr tropomyosin isoforms, and an increase, or no change, in the levels of TM-4 and TM-5 (Hendricks & Weintraub, 1984; Leavitt et al., 1986; Lin et al., 1984; Matsumura et al., 1983b).

Tropomyosins are known to be encoded by complex transcription units that express multiple isoforms through the use of alternative promoters and alternative RNA splicing. The most complex of the vertebrate tropomyosin genes is the α-TM gene of rat. This gene encodes at least five high Mr tropomyosins, including smooth muscle α-tropomyosin, striated muscle α-tropomyosin, fibroblast TM-2 and TM-3 and brain TMBr-1. It also encodes four low Mr isoforms, including fibroblast TM-5a and TM-5b and brain TMBr-2 and TMBr-3 (Ruiz-Opazo & Nadal-Ginard, 1987; Wieczorek et al., 1988; Lees-Miller et al., 1990; L. O. Goodwin et al., unpublished results). The β-tropomyosin genes of human and rat encode two high Mr isoforms: skeletal muscle β-tropomyosin and smooth muscle β-tropomyosin, which is identical to fibroblast TM-1 (Helfman et al., 1986; MacLeod et al., 1985). The chicken β-TM gene also encodes these two isoforms in addition to a low Mr tropomyosin, TM-3b (Bradac et al., 1989; Libri et al., 1989). Clayton et al. (1988) have found that the human TMnm gene encodes both a 284 amino acid residue tropomyosin present in slow twitch skeletal muscles and a 248 residue tropomyosin, TM30nm, which is present in fibroblasts. This gene and its products have not been found in rat. A low Mr Drosophila melanogaster tropomyosin is also encoded by a gene with alternative promoters and extensive alternative splicing (Hanke & Storti, 1988; Karlik & Fyrberg, 1986).

Rat fibroblast TM-4 and its human homolog, fibroblast TM30pl, are encoded by a fourth vertebrate gene (MacLeod et al., 1987; Yamawaki-Kataoka & Helfman, 1987). The mRNA expressed from this gene is 2.2 kb† in rat and 3.0 kb in humans. The rat mRNA has a broad tissue distribution including brain, kidney, liver and fibroblasts. In this study, we have sequenced the region of the TM-4 gene required to generate the 2.2 kb message. We have found no evidence of functional alternative splice choices but have located well-preserved remnants of exons that once functioned as alternative splice choices.

Results and discussion

The exons encoding rat fibroblast TM-4 were identified by comparing genomic DNA sequence with mRNA sequence (Fig. 1). The position of splice junctions relative to the coding sequence of the final transcript is identical to that found with the other described vertebrate tropomyosin genes and the Drosophila TMII gene. We have, therefore, numbered the exons of the gene encoding TM-4 so that they may be compared directly with the exons of other known tropomyosin genes (Fig. 2(a)).

Transcription of the TM-4 mRNA precursor appears to begin in a GC-rich region (78% GC) that extends for 275 nucleotides upstream from the translation initiation codon AUG (Fig. 1). Primer extension studies have indicated a transcription initiation site approximately 100 bp upstream from the AUG codon (not shown). This site is flanked by three Sp1 recognition elements, but no TATA or CCAAT recognition sequences appear to be present.

There are three regions of extended simple repeating sequences within the TM-4 gene. These include a (dG-dT)29 repeat between nucleotides 3280 and

† Abbreviations used: kb, 10³ bases or base-pairs; bp, base-pair(s).

Figure 1. The intron-exon organization of the rat TM-4 gene. Four overlapping recombinant λ phage clones containing the known coding elements of the rat TM-4 gene were isolated from a λ-EMBL-3 library containing 15 to 20 kb inserts of rat liver genomic DNA. The library was kindly provided by J. W. Tamkun and R. O. Hynes and has been described in detail (Tamkun et al., 1984). Approximately 10 kb of rat genomic sequence on both sides of the TM-4 gene was also encompassed by the clones. In the upper panel, filled boxes represent exons, lines represent introns and open boxes represent vestigial exons. The gene from left to right runs in the 5' to 3' direction. The 5' and 3' untranslated regions are indicated by hatching. The exons are numbered on top in a manner that facilitates comparison to other tropomyosin genes (see Fig. 2) and on the bottom according to the amino acids they encode. The restriction enzyme sites are indicated above the gene: EcoRI (R), BamHI (B), HindIII (H), KpnI (K). An abridged version of the complete nucleotide sequence of the gene encoding rat fibroblast TM-4 is presented in the lower panel. All sequencing was carried out with Sequenase (USB) using single-stranded M13 as a template. The complete sequence has been submitted to the Genbank and EMBL databases‡. All restriction sites used in subcloning were overlapped. Exons (Ex) are presented in bold type and their encoded amino acids are presented in the 1-letter format. Vestigial exons are indicated by VE. Underlined are simple repeats (SR), the Sp1 elements between 1650 and 1730, the polyadenylation signal (AATAAA) for TM-4 and a 76 bp direct repeat (Dir R). The polyadenylation site is indicated by a filled arrowhead.

‡ The EMBL/Genbank accession number for the TM-4 gene sequence is Y00169.
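The sequence features annotated in Figure 1 (simple dinucleotide repeats, the AATAAA polyadenylation signal and the GC-rich region) can in principle be located by a simple scan of the nucleotide sequence. The following is a minimal sketch only, not the procedure used in this study. The function names, the `min_copies` threshold and the toy sequence are illustrative assumptions, and coordinates are reported 1-based and inclusive.

```python
import re


def find_dinucleotide_repeats(seq, unit="GT", min_copies=10):
    """Return (start, end, copies) for each uninterrupted run of `unit`.

    Coordinates are 1-based and inclusive. The min_copies threshold is an
    arbitrary illustrative choice, not a value taken from the paper.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    hits = []
    for match in pattern.finditer(seq.upper()):
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start() + 1, match.end(), copies))
    return hits


def find_polya_signals(seq, signal="AATAAA"):
    """Return the 1-based start of every occurrence of the signal hexamer."""
    seq = seq.upper()
    last_start = len(seq) - len(signal)
    return [i + 1 for i in range(last_start + 1) if seq.startswith(signal, i)]


def gc_fraction(seq):
    """Fraction of G and C residues in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy sequence (hypothetical, NOT taken from the TM-4 gene): a GC-rich
# prefix, twelve copies of GT, then a short tail carrying AATAAA.
toy = "CCGGCGGGGC" + "GT" * 12 + "ACCTAATAAATTTT"

repeats = find_dinucleotide_repeats(toy)
signals = find_polya_signals(toy)
```

Applied to the deposited sequence (accession Y00169), a scan of this kind would be expected to pick out uninterrupted runs only; repeats broken by single nucleotide insertions would need a more tolerant pattern.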

[Fig. 2. (a) Intron-exon organization and minimum gene size of the ancestral gene and of the rat and chicken β-TM, rat α-TM, human TMnm and rat TM-4 genes. (b) Tree relating the α-TM, TMnm, β-TM and TM-4 genes.]

Figure 2. Organization and evolution of the vertebrate tropomyosin genes. (a) The intron-exon structures of the vertebrate tropomyosin genes were obtained from the following sources: rat β-TM gene (Helfman et al., 1986), chicken β-TM gene (Libri et al., 1989), rat α-TM gene (Ruiz-Opazo & Nadal-Ginard, 1987; Wieczorek et al., 1988; Lees-Miller et al., 1990), and human TMnm gene (Clayton et al., 1988). Vestigial exons are indicated by broken lines. (1) Exon 1b is not present in the rat β-TM gene, but cDNA sequences indicate that it is present in the chicken β-TM gene (Bradac et al., 1989). (2) The splicing sequence for the 3'-end of skeletal muscle tropomyosins from the α-TM and TMnm genes is 8-9a-9b, where 9b provides only a stop codon and 3'-untranslated sequence. Exon 9b can therefore also be referred to as exon 10. (3) We have tentatively identified an exon 9c-like sequence in intron 9a of the chicken β-TM gene. No similar sequence is found in the β-TM gene of rat. (b) Possible evolutionary relationship between 4 vertebrate tropomyosin genes. The sequences of 13 vertebrate tropomyosins, including striated muscle α-tropomyosins from chicken (MacLeod, 1982), quail (Hallauer et al., 1987), rabbit (Stone & Smillie, 1978), rat (Ruiz-Opazo & Nadal-Ginard, 1987) and human (MacLeod & Gooding, 1988), striated muscle β-tropomyosins from chicken (Libri et al., 1989), rabbit (Mak et al., 1980), rat (Helfman et al., 1986) and human (MacLeod et al., 1985), human slow twitch α-tropomyosin from the TMnm gene (MacLeod & Gooding, 1988), human fibroblast TM30pl (MacLeod et al., 1987), equine platelet β-tropomyosin (Lewis et al., 1983), and rat fibroblast TM-4 (Yamawaki-Kataoka & Helfman, 1987), were compared over the 176 amino acid residue region encoded by exons 3, 4, 5, 6b, 7 and 8. An alignment of most of these sequences has been presented by Hallauer et al. (1987). Open diamonds (◇) indicate gene duplications. The numbers represent a best approximation to the encoded amino acid differences between the genes where species-specific differences are ignored. The Figure overestimates by 3 the number of encoded amino acid substitutions between the TMnm gene and the β-TM gene. This difference is marked by a symbol in the Figure.
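The two kinds of comparison behind Figure 2(b), namely counting encoded amino acid differences between ungapped aligned sequences and searching for "opposing sequence pairs" at each position, can be sketched as follows. This is a hedged illustration rather than the authors' actual procedure: the function names are hypothetical and the four 8-residue sequences are invented toy data, not tropomyosin sequences.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of mismatched positions between two ungapped aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def pairwise_differences(aligned):
    """Map each pair of names (sorted alphabetically) to its difference count."""
    return {
        (n1, n2): count_differences(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


def opposing_pairs(aligned, pair_a, pair_b):
    """1-based positions where both members of pair_a share one residue and
    both members of pair_b share a different residue."""
    a1, a2 = (aligned[name] for name in pair_a)
    b1, b2 = (aligned[name] for name in pair_b)
    return [
        i + 1
        for i, (w, x, y, z) in enumerate(zip(a1, a2, b1, b2))
        if w == x and y == z and w != y
    ]


# Invented toy alignment (hypothetical data, for illustration only).
toy = {
    "alpha": "AIKLEDSQ",
    "nm":    "AIKLEDTQ",
    "beta":  "ALKVEDSQ",
    "tm4":   "ALKVEETQ",
}

differences = pairwise_differences(toy)
first_split = opposing_pairs(toy, ("alpha", "nm"), ("beta", "tm4"))
second_split = opposing_pairs(toy, ("alpha", "beta"), ("nm", "tm4"))
```

In this toy data the (alpha, nm) versus (beta, tm4) grouping is supported at two positions and the (alpha, beta) versus (nm, tm4) grouping at one, which mirrors in miniature the kind of conflicting signal that a tree of gene duplications alone cannot explain.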

3338, a (dPu-dPy)58 repeat between nucleotides 6148 and 6266 (this pattern is broken by single nucleotide insertions after repeats 12 and 24) and a [(dC/dG)1or2-(dT)2to5]23 repeat between nucleotides 9973 and 10170. Extended (dPu-dPy)n sequences, of which the former two repeats are examples, are capable of forming Z-DNA and may be involved in transcriptional regulation and genetic recombination (Gilmour et al., 1984; Hamada et al., 1984; Hellman et al., 1988). There is a 76 bp direct repeat in the TM-4 gene that is located 160 bp upstream from vestigial exon 2b (Fig. 1). The two 76 bp repeats differ at only 16 positions and are separated by a 36 bp sequence that is highly similar to the central region of the 76 bp repeats.

All of the animal tropomyosin genes that have been characterized to date have been found to possess one or more sets of alternatively spliced exons including 1a and 1b, 2a and 2b, 6a and 6b and 9a, 9b, 9c and 9d. From sequence analysis and gene structure, it is clear that rat fibroblast TM-4 is encoded by exon types 1b, 3, 4, 5, 6b, 7, 8 and 9d. It was therefore of interest to search for alternatively spliced exons 1a, 2a, 2b, 6a, 9a, 9b and 9c in the gene encoding rat fibroblast TM-4. Using the Homology program from Intelligenetics, we found one region with strong identity to exon 2b, and another with strong identity to exon 9a, of the α-TM, β-TM and hTMnm genes (Figs 2(a) and 3).

Exon 2b encodes residues 39 to 80 of the 284 amino acid residue tropomyosins expressed in striated muscle and fibroblasts. In the gene encoding rat fibroblast TM-4, the exon 2b-like sequence is appropriately located upstream from exon 1b (Fig. 1). The non-functional nature of exon 2b in the TM-4 gene is immediately clear, in that the 3' splice acceptor consensus sequence CAG is mutated to CGA (Fig. 3). Further deleterious mutations include deletion of the codon for amino acid 68 and a 2 bp insertion between the codons for amino acids 77 and 78. Considering the non-functional nature of exon 2b in the TM-4 gene, it is highly conserved relative to its counterpart in the α-TM, β-TM and hTMnm genes (Fig. 3). Its encoded amino acid sequence is identical at 22 of 42 positions to both the α-TM and β-TM genes, and at an additional three positions only to the β-TM gene and three positions only to the α-TM gene. It is also identical at 25 of 42 positions encoded by exon 2b of the hTMnm gene, but at only 12 of 42 positions encoded by exon 2a of the rat α-TM gene. Therefore, it is clear that the vestigial exon at the 5' end of the TM-4 gene is derived from exon 2b.

Exon 9a encodes the COOH-terminal 27 amino acid residues of the skeletal muscle isoforms that are produced from the tropomyosin genes by alternative splicing. The exon 9a-like sequence in the TM-4 gene possesses a 3' splice site consensus sequence (TAG) and an appropriately located stop codon (Fig. 3). However, it is clear that exon 9a of the TM-4 gene is no longer functional. A 4 bp deletion in the region encoding amino acids 260 and 261 destroys the reading frame required to encode a

[Fig. 3. Nucleotide and encoded amino acid sequence alignments of exons 2b and 9a of the rat β-TM gene with vestigial exons 2b and 9a of the TM-4 gene.]

Figure 3. Vestigial exons. We have aligned vestigial exons 2b and 9a of the TM-4 gene with functional exons 2b and 9a of the rat β-TM gene. Identities to the rat β-TM gene are indicated by dashes and deletions or insertions by dots. Flanking sequences are typed in lower case letters. The encoded amino acid sequences are in 1-letter format. Amino acids that are encoded by the TM-4 gene and are identical to those from the rat α-TM gene and the human TMnm gene, but different from the rat β-TM gene, are indicated by an asterisk (*). Those identical only to the β-TM gene are indicated by a plus sign (+). Amino acids are numbered according to their position in the 284 residue skeletal muscle tropomyosin. Their position in the 7-fold repeating amino acid pattern of tropomyosins is indicated above by abcdefg. Translation termination codons are indicated by (Stp).
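Two of the mutation classes that inactivate the vestigial exons, a third-base substitution that creates a stop codon and a short deletion that shifts the reading frame, can be illustrated with a small translation routine using the standard genetic code. This is a sketch under stated assumptions: the 15-nucleotide sequence below is an invented toy example, not the TM-4 sequence, and the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, stop_at_stop=True):
    """Translate dna in frame 1; trailing bases that do not fill a codon are
    ignored. If stop_at_stop is True, translation ends at the first stop."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


# Toy coding sequence (hypothetical): GCT TAC AAG GAA CTG -> A Y K E L
intact = "GCTTACAAGGAACTG"

# C -> A in the third base of the Tyr codon: TAC becomes the stop codon TAA.
nonsense = intact[:5] + "A" + intact[6:]

# A 4 bp deletion (bases 4 to 7) shifts the downstream reading frame.
frameshift = intact[:3] + intact[7:]

intact_protein = translate(intact)
nonsense_protein = translate(nonsense)
frameshift_protein = translate(frameshift)
```

The substitution truncates the toy product after one residue, while the deletion leaves the upstream residue intact and changes every residue downstream of it.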

tropomyosin-like sequence. Other mutations that are likely to have occurred during or following the loss of function of this exon sequence include 1 and 3 bp deletions at the sequences encoding amino acids 277 and 281, respectively, and a C → A substitution in the third base of the codon for amino acid 267, resulting in a premature stop codon. These mutations reduce the number of amino acid positions comparable with other tropomyosins from 27 to 23. Vestigial exon 9a is identical to both α and β skeletal muscle tropomyosins at 11 of 23 positions and is identical to α-tropomyosin at a further two positions and to β-tropomyosin at two other positions (Fig. 3).

We have also looked for sequences similar to exons 1a, 2a, 6a, 9b and 9c in the introns of the TM-4 gene but have not found any homologous regions. The lack of these exons strongly suggests that the TM-4 gene is not alternatively processed. Northern blotting with RNAs from a variety of rat cells and tissues, including kidney, brain, liver and fibroblasts, did not reveal message sizes other than 2.2 kb (Yamawaki-Kataoka & Helfman, 1987). By contrast, bands of different sizes were found in RNA from smooth and skeletal muscles at relatively low stringency. However, these bands correspond in size to the highly abundant muscle mRNAs encoded by the α-TM and β-TM genes and likely result from cross-hybridization between genes. We have also carried out RNase protection assays with a TM-4 antisense transcript and detected a single hybridization product with RNA from each of soleus muscle, thigh muscle, ventricle, atrium, uterus, tongue, kidney, liver and brain (not shown). These findings differ from an unpublished observation made on the equivalent human gene encoding TM30pl. MacLeod et al. (1987) found evidence from nucleotide sequencing that this gene can produce a 284 amino acid residue tropomyosin. This finding is not contradictory to our results, in that the degree of conservation of vestigial exons 9a and 2b in the rat TM-4 gene indicates that they most likely lost their function subsequent to the last common ancestor of man and rat.

Three tropomyosin genes are present in the rat, including the α-TM gene, the β-TM gene and the TM-4 gene (Helfman et al., 1986; Ruiz-Opazo & Nadal-Ginard, 1987; Yamawaki-Kataoka & Helfman, 1987). Humans possess three equivalent genes in addition to the TMnm gene, which encodes a fibroblast isoform and a slow twitch skeletal muscle isoform (Clayton et al., 1988; MacLeod et al., 1985, 1987; MacLeod & Gooding, 1988), while chickens have at least α-TM and β-TM genes and possibly a TMnm gene (Bradac et al., 1989; Libri et al., 1989; MacLeod, 1982). The presence of four tropomyosin genes in rabbit has been indicated by a combination of amino acid sequencing of the α and β striated muscle isoforms, and two-dimensional polyacrylamide gel electrophoresis of slow twitch skeletal muscle isoforms (Mak et al., 1980; Heeley et al., 1983). Two-dimensional electrophoretic gels of tropomyosins from a wide variety of rat striated muscles have not clearly demonstrated isoforms other than α and β (Carraro et al., 1981; Heeley et al., 1982). These findings corroborate our present knowledge of the rat tropomyosin genes. In the case of the TM-4 gene, the alternatively spliced exons required to encode a striated muscle tropomyosin (exons 2b and 9a) are no longer functional and, despite extensive cloning and characterization of rat cDNAs and genes, no evidence for a TMnm gene analog has been found. The apparent absence of two slow twitch skeletal muscle tropomyosin isoforms from rat may point to a significant functional difference between their skeletal muscles and those of other mammals.

The exon boundaries within each of the four vertebrate tropomyosin genes are identically positioned relative to the mRNAs that they encode. The four genes differ in nucleotide sequence, the presence or absence of certain alternatively spliced


exons, and in the length of introns, 5'-untranslated sequences and 3'-untranslated sequences. Tropomyosin genes have also been characterized from D. melanogaster and Saccharomyces cerevisiae (Basi et al., 1984; Karlik & Fyrberg, 1986; Hanke & Storti, 1988; Liu & Bretscher, 1989). The D. melanogaster TMII gene has a similar organization to the vertebrate genes, including alternative promoters and alternative splicing at its 3' end. The relationship of the D. melanogaster TMII gene to the vertebrate tropomyosin genes has been discussed in detail elsewhere (Hanke & Storti, 1988).

By comparing the structure of the four vertebrate tropomyosin genes, a minimal ancestral gene can be constructed (Fig. 2(a)). This gene must have possessed at least 12 exons, including 1a, 2b, 1b, 3, 4, 5, 6a, 6b, 7, 8, 9a and 9d. Three other exons have been identified in a limited number of vertebrate tropomyosin genes, including 2a and 9c in the rat α-TM gene and 9b in the rat α-TM gene and the human TMnm gene (Fig. 2(a)). There is not enough comparative sequence information to determine whether these exons were present in the ancestral vertebrate gene. We have located (unpublished results) what appears to be a functional exon 9c-like sequence within the intron between exons 9a and 9d of the chicken β-tropomyosin sequence as presented by Libri et al. (1989). If this sequence is shown to be functional in the chicken, it will be a strong indication that exon 9c was also present in the ancestral vertebrate tropomyosin gene (Fig. 2(b)).

We have attempted to elucidate the relationship of the four vertebrate tropomyosin genes by comparing 13 sequences over the region encoded by exon types 3, 4, 5, 6b, 7 and 8 (Fig. 2(b)). No gap is present in the alignment of these sequences. The comparison suggests a close relationship between the α-TM and hTMnm genes, as pointed out by MacLeod & Gooding (1988). The series of gene duplications that relate the β-TM and TM-4-like genes to the hTMnm and α-TM genes is not clear from the number of amino acid differences encoded between these genes. We therefore searched for opposing sequence pairs within each amino acid position and expanded the search to include all sequences where the products of the four genes could be compared, including residues encoded by exons 1b, 2b, 9a and 9d (vestigial exons 2b and 9a in the case of the TM-4 gene). There are 12 positions where the α-TM and hTMnm genes are identical but different from identical sequences found in the TM-4 and β-TM genes. There is no position where a similar situation exists for the α-TM and β-TM genes versus the TMnm and TM-4 genes. It therefore appears that the first gene duplication resulted in one line leading to the α-TM and TMnm genes and another leading to the β-TM and TM-4 genes (Fig. 2(b)). However, there are also three positions where the TMnm and β-TM genes are identical but different from identical residues in the α-TM and TM-4 genes. These include the seventh amino acid encoded by exon 1b, the 18th amino acid encoded by exon 4 and the 22nd amino acid encoded by exon 5. Therefore, gene duplications alone cannot account for the pattern of opposing sequence pairs found between the four vertebrate tropomyosin genes. The three β-TM, hTMnm versus α-TM, TM-4 substitutions are conserved, either isoleucine:leucine or alanine:serine, and require only a single base change in the first codon position. The simplest explanation of this pattern is that a degree of convergence has occurred between the β-TM gene and the hTMnm gene and/or the α-TM gene and the TM-4 gene.

We thank J. D. Watson for his continued support and encouragement of this work. We thank J. W. Tamkun and R. O. Hynes for providing the rat genomic DNA library and H. Okayama for the rat brain cDNA library. We thank Mark Zoller for the synthesis of oligonucleotides and James Duffy, Philip Renna and Madeline Szadkowski for help with the preparation of the manuscript. This work was supported by Public Health Service grants GM43049 and CA46370 from the National Institutes of Health to D.M.H. J.P.L. is a Neuromuscular Disease Research Fellow of the Muscular Dystrophy Association.

References

Adelstein, R. S. & Eisenberg, E. (1980). Annu. Rev. Biochem. 49, 921-956.
Basi, G. S., Boardman, M. & Storti, R. V. (1984). Mol. Cell. Biol. 4, 2828-2836.
Bradac, J. A., Gruber, C. E., Forry-Schaudies, S. & Hughes, S. H. (1989). Mol. Cell. Biol. 9, 185-192.
Carraro, U., Catani, C., Dalla Libera, L., Vascon, M. & Zanella, G. (1981). FEBS Letters, 128, 233-236.
Clayton, L., Reinach, F. C., Chumbley, G. M. & MacLeod, A. R. (1988). J. Mol. Biol. 201, 507-515.
Cote, G., Lewis, W. G. & Smillie, L. B. (1978). FEBS Letters, 91, 237-241.
Cote, G. P. & Smillie, L. B. (1981a). J. Biol. Chem. 256, 7257-7261.
Cote, G. & Smillie, L. B. (1981b). J. Biol. Chem. 256, 11004-11010.

Gilmour, R. S., Spandidos, D. A., Vass, J. K., Gow, J. W. & Paul, J. (1984). EMBO J. 3, 1263-1272.
Hallauer, P. L., Hastings, K. E. M., Baldwin, A. S., Pearson-White, S., Merrifield, P. A. & Emerson, C. P. (1987). J. Biol. Chem. 262, 3590-3596.
Hamada, H., Seidman, M., Howard, B. & Gorman, C. M. (1984). Mol. Cell. Biol. 4, 2622-2630.
Hanke, P. D. & Storti, R. V. (1988). Mol. Cell. Biol. 8, 3591-3602.
Heeley, D. H., Moir, A. J. G. & Perry, S. V. (1982). FEBS Letters, 146, 115-118.
Heeley, D. H., Dhoot, G. K., Fearson, N., Perry, S. V. & Vrbova, G. (1983). FEBS Letters, 152, 282-286.
Hegmann, T. E., Lin, J. L.-C. & Lin, J. J.-C. (1989). J. Cell Biol. 109, 1141-1152.
Helfman, D. M., Cheley, S., Kuismanen, E., Finn, L. A. & Yamawaki-Kataoka, Y. (1986). Mol. Cell. Biol. 6, 3582-3595.
Hellman, L., Steen, M.-L., Sundvall, M. & Pettersson, U. (1988). Gene, 68, 93-100.
Hendricks, M. & Weintraub, H. (1984). Mol. Cell. Biol. 4, 1823-1833.
Ishikawa, R., Yamashiro, S. & Matsumura, F. (1989). J. Biol. Chem. 264, 7490-7497.

Karlik, C. C. & Fyrberg, E. A. (1986). Mol. Cell. Biol. 6, 1985-1993.
Leavitt, J., Latter, G., Lutomski, L., Goldstein, D. & Burbeck, S. (1986). Mol. Cell. Biol. 6, 2721-2726.
Lees-Miller, J. P., Goodwin, L. O. & Helfman, D. M. (1990). Mol. Cell. Biol. 10, 1729-1742.
Lewis, W. G., Cote, G. P., Mak, A. S. & Smillie, L. B. (1983). FEBS Letters, 156, 269-273.
Libri, D., Lemonnier, M., Meinnel, T. & Fiszman, M. Y. (1989). J. Biol. Chem. 264, 2935-2944.
Lin, C.-S. & Leavitt, J. (1988). Mol. Cell. Biol. 8, 160-168.
Lin, J. J.-C., Yamashiro-Matsumura, S. & Matsumura, F. (1984). Cancer Cells, 1, 57-65.
Lin, J. J.-C., Hegmann, T. E. & Lin, J. L.-C. (1988). J. Cell Biol. 107, 563-572.
Liu, H. & Bretscher, A. (1989). Cell, 57, 233-242.
MacLeod, A. R. (1982). Eur. J. Biochem. 126, 293-297.
MacLeod, A. R. & Gooding, C. (1988). Mol. Cell. Biol. 8, 433-440.
MacLeod, A. R., Houlker, C., Reinach, F. C., Smillie, L. B., Talbot, K., Modi, G. & Walsh, F. S. (1985). Proc. Nat. Acad. Sci., U.S.A. 82, 7835-7839.
MacLeod, A. R., Talbot, K., Smillie, L. B. & Houlker, C. (1987). J. Mol. Biol. 194, 1-10.

Mak, A., Smillie, L. & Stewart, G. (1980). J. Biol. Chem. 255, 3647-3655.
Matsumura, F. & Yamashiro-Matsumura, S. (1985). J. Biol. Chem. 260, 13851-13859.
Matsumura, F., Yamashiro-Matsumura, S. & Lin, J. J.-C. (1983a). J. Biol. Chem. 258, 6636-6644.
Matsumura, F., Lin, J. J.-C., Yamashiro-Matsumura, S., Thomas, G. P. & Topp, W. C. (1983b). J. Biol. Chem. 258, 13954-13964.
Ruiz-Opazo, N. & Nadal-Ginard, B. (1987). J. Biol. Chem. 262, 4755-4765.
Stone, D. & Smillie, L. B. (1978). J. Biol. Chem. 253, 1137-1148.
Takenaga, K., Nakamura, Y., Tokunaga, K., Kageyama, H. & Sakiyama, S. (1988). Mol. Cell. Biol. 8, 5561-5565.
Tamkun, J. W., Schwarzbauer, J. E. & Hynes, R. O. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 5140-5144.
Wieczorek, D. F., Smith, C. W. J. & Nadal-Ginard, B. (1988). Mol. Cell. Biol. 8, 679-694.
Yamawaki-Kataoka, Y. & Helfman, D. M. (1985). J. Biol. Chem. 260, 14440-14445.
Yamawaki-Kataoka, Y. & Helfman, D. M. (1987). J. Biol. Chem. 262, 10791-10800.

Edited by J. Karn
