GENOMICS 10, 102-113 (1991)

Molecular Characterization of Rat Multigene Family Encoding Proline-Rich Proteins

HER H. LIN AND DAVID K. ANN

Department of Pharmacology, University of Minnesota Medical School, Minneapolis, Minnesota 55455

Received July 12, 1990; revised December 18, 1990

Three members of the rat proline-rich protein multigene family have been characterized. Each of these genes, RP4, RP13, and RP15, contains three exons, and they are approximately 4.8, 5.7, and 5.4 kb, respectively. The DNA sequences of RP4 and RP13 are greater than 93% homologous in the 3.1-kb segment extending from the 5'-upstream region (approximately nucleotide -930) to 238 nucleotides after the second exon/intron junction; however, regions further downstream, intron II and exon III, share less than 43% identity. In contrast, exon III from RP15, RP13, and the previously sequenced mouse PRP gene MP2 are more than 73% conserved. These analyses suggest that the duplication of the ancestral genes to RP13 and RP4 occurred prior to the divergence of the rat PRP genes. The results also indicate that in the past 21.5 million years, multiple recombination events have resulted in a very high degree of divergence among intron II and exon III of RP4 and RP13. This divergence is due in part to the insertion of members of the rat long interspersed repeat DNA family at -930 bp upstream from the transcription initiation site and within intron II of RP13. Comparisons of the nucleotide sequences and organization of exon I with the genomic organization of PRP and glutamic acid/glutamine-rich protein genes in this and previous studies reveal striking resemblance among these genes. These observations are consistent with the notion that this super multigene family arose from duplication of progenitor genes via unequal crossing over events. In addition, the results suggest that concerted evolution has occurred within the tandemly repeated motif of exon II. © 1991 Academic Press, Inc.

¹ Abbreviations used: PRP, proline-rich protein; GRP, glutamic acid/glutamine-rich protein; LINE, long interspersed repeated DNA.

0888-7543/91 $3.00 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

INTRODUCTION

Salivary proline-rich proteins (PRPs)¹ are a subfamily of the salivary contiguous repeat proteins consisting of the proline-rich proteins (Carlson et al., 1986), glutamic acid/glutamine-rich proteins (GRPs) (Mirels et al., 1987; Heinrich and Habener, 1987), and other closely related proteins. The PRP family differs from other simple repetitive proteins such as GRPs in that the PRP family has an unusually high proline (25 to 45 mol%) content, collectively constituting approximately 70% of the total soluble proteins in human saliva (Bennick, 1982). These unusual proteins are constitutively expressed by human and monkey salivary glands (Carlson et al., 1986; Bennick, 1982). However, the expression of families of similar proteins is induced in parotid and submaxillary glands of rats (Ziemer et al., 1984), mice (Clements et al., 1985), and hamsters (Mehansho et al., 1987) by treatment with the β-agonist isoproterenol. Several aspects of the responses of rat salivary glands to isoproterenol have been well documented (Brown-Grant, 1961; Schneyer, 1972). A single daily injection of isoproterenol (5 mg/rat) increases the size of the parotid glands by 6- to 10-fold in 5 to 7 days. In addition to the morphological changes, isoproterenol causes a rapid and extremely large induction (approximately 70- to 100-fold) of the expression of PRPs (Ann et al., 1987a). Recent evidence suggests that the primary role of PRPs in rodent saliva is to bind polyphenolic compounds such as tannins and tannic acids and that the oral administration of tannins to rats (Mehansho et al., 1983) and mice (Mehansho et al., 1985) mimics the effects of isoproterenol in stimulating PRP synthesis in the parotid gland. Amino acid and electrophoretic analyses, as well as cell-free translations of mRNA, confirmed that the same families of PRPs are induced in the parotid gland by either tannin or isoproterenol administration to rats and mice. This and other nutritional studies have led to the proposal that PRPs may "neutralize" the detrimental effects of tannins in rodents (Mehansho et al., 1987).

A common characteristic shared by all PRPs (Carlson et al., 1986) and the closely related GRPs (Mirels et al., 1987; Heinrich and Habener, 1987) is that they are composed of four distinct regions: a signal peptide region, a transition region, a repeat region, and a carboxyl-terminal region. The structural analyses and comparisons of PRP genes from various species (Ann and Carlson, 1985; Carlson et al., 1986; Kim and Maeda, 1986; Ann et al., 1988) and rat GRPs (Mirels et al., 1987; Heinrich and Habener, 1987) have confirmed the close relationship among members of this superfamily and suggested that they may have evolved from a common ancestral gene. Several distinct characteristics of these genes are (i) highly conserved 5'-untranslated regions, (ii) conserved exons that encode signal peptides followed by a less conserved transition region, (iii) exons that contain contiguous peptide repeats, and (iv) contiguous repeats that are polymorphic among members in the family; however, the repeating unit tends to be highly conserved within each member. Another unusual feature of this superfamily is that no significant homology among members of different species was detected in the 5'-flanking region other than the 40-bp region immediately upstream of exon I, where 70 to 80% similarity was present (Kim and Maeda, 1986; Ann et al., 1987; Heinrich and Habener, 1987). Since this region includes the putative TATA box, the detected homology most likely reflects part of salivary gland-specific transcriptional regulatory regions.

Although the genomic structures of PRPs in human (Kim and Maeda, 1986), mouse (Ann and Carlson, 1985; Ann et al., 1988), and hamster (Ann et al., 1987b) have been determined, a detailed map of the rat PRP gene cluster remains elusive. As part of a long-term effort to elucidate the regulation of the rat PRP gene family and to understand the molecular genetics and evolution of the PRP gene family, we have proceeded to determine the structure, factors affecting expression, and degree of similarity of the rat PRP genes.
In this report, we describe the genomic structure of three rat PRP family members, designated RP4, RP13, and RP15. Comparisons of the genomic sequences both within and between species demonstrate concerted evolution of this gene family. In addition, comparisons of putative transcriptional regulatory regions begin to define cis elements responsible for the unique patterns of tissue-specific and inducible expression of these genes.

MATERIALS AND METHODS

Materials

The following substances were purchased from the respective companies and were used according to the specifications of the suppliers: restriction enzymes, DNA polymerase I (Klenow fragment), T4 DNA ligase, Exonuclease III, S1 nuclease, oligo-labeling kit, and calf intestine phosphatase were from Bethesda Research Laboratories and from Boehringer Mannheim; [α-32P]dCTP and [α-35S]dATP were from Amersham Corp.; Nytran membrane and Elutip-d columns were from Schleicher & Schuell; Sequenase Version 2.0 DNA sequencing kits were from United States Biochemical; agarose, low-melting agarose, 5-bromo-4-chloro-3-indolyl-β-D-galactoside, and isopropyl-1-thio-β-D-galactoside were from Bethesda Research Laboratories; λ in vitro packaging kits were from Stratagene.

Screening of Rat Genomic Libraries

Liver DNA was isolated from an individual Sprague-Dawley rat and was used to construct cosmid libraries according to established procedures (DiLella and Woo, 1987; Ann et al., 1988) with slight modifications as described. Generally, genomic DNA was partially digested with Sau3A to a length of 30-50 kb and size-fractionated by centrifugation through a 1.25 to 5 M NaCl gradient. The preparation of cosmid vector (pTCF) arms, ligation, and packaging were performed by the method of Grosveld et al. (1982). The infectious bacteriophage particles were used to transduce Escherichia coli ED8767, with an efficiency of about 350,000 transformants/μg of size-fractionated genomic DNA. Approximately 700,000 recombinant colonies were grown on Nytran filters (70,000 colonies/150-mm filter) and were hybridized to 32P-labeled rat PRP cDNA probe² according to the screening procedure of Hanahan and Meselson (1983). DNA was isolated from positive colonies by alkaline lysis (Birnboim, 1983). Genomic Southern blotting and hybridization were carried out as described previously (Ann et al., 1988).
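The library size quoted above can be put in context with the standard Clarke-Carbon estimate of library coverage. The following is a minimal sketch, not the authors' calculation: the haploid rat genome size (3 x 10^9 bp) and the mean insert length (40 kb, the midpoint of the 30-50 kb size fraction) are assumptions introduced for illustration, and all function names are hypothetical.

```python
import math

# Assumed values (not stated in the text): approximate haploid rat genome
# size and the midpoint of the 30-50 kb size-fractionated insert range.
GENOME_BP = 3_000_000_000
INSERT_BP = 40_000


def genome_equivalents(n_clones, insert_bp=INSERT_BP, genome_bp=GENOME_BP):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_bp


def prob_sequence_present(n_clones, insert_bp=INSERT_BP, genome_bp=GENOME_BP):
    """Clarke-Carbon probability that a given single-copy sequence is
    represented at least once among n_clones random clones."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p, insert_bp=INSERT_BP, genome_bp=GENOME_BP):
    """Number of clones required to reach probability p of representation."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


def filters_needed(n_colonies, colonies_per_filter):
    """Number of filters at the stated plating density."""
    return math.ceil(n_colonies / colonies_per_filter)


if __name__ == "__main__":
    n = 700_000  # recombinant colonies screened, from the text
    print(f"genome equivalents: {genome_equivalents(n):.1f}")
    print(f"P(sequence present): {prob_sequence_present(n):.5f}")
    print(f"clones for P = 0.99: {clones_needed(0.99)}")
    print(f"150-mm filters: {filters_needed(n, 70_000)}")
```

Under these assumptions, 700,000 colonies correspond to roughly nine genome equivalents, so a randomly distributed single-copy sequence would be expected to be represented with high probability. The estimate assumes unbiased cloning, which is the assumption the later discussion of clone recovery calls into question.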

Characterization of Cosmid Clones

The pTCF vector contains two SalI sites flanking the BamHI cloning site, which facilitated mapping of the positive rat PRP clones. Complete restriction maps were derived by the procedures described below. The coding regions of the genes were mapped by Southern blotting (Southern, 1975) of restriction digests and hybridization with a full-length rat PRP cDNA probe. Then the EcoRI, BamHI, XbaI, HindIII, and SalI restriction fragments were ordered by a modified partial digestion followed by indirect end-labeling. This was accomplished by first digesting the recombinant cosmid DNAs with NruI and then incubating the digested cosmid DNAs with the appropriate restriction enzymes. Aliquots of the digest were removed at several intervals between 1 and

² D. K. Ann (1990), unpublished results.


FIG. 1. Southern blot analysis of rat PRP genes and cosmid clones. Sprague-Dawley rat liver genomic DNA (10 μg) and rat PRP cosmid clones RP4, RP5, RP13, and RP15 DNAs (1 μg) were cleaved to completion by BamHI (B), EcoRI (E), and HindIII (H), respectively. After the DNA fragments were separated by agarose gel electrophoresis, Southern blots of these DNA fragments were hybridized with 32P-labeled rat PRP cDNA probe. Size markers (kb) are shown in the left margin.

30 min and the digestions were terminated by adding 5 vol of 0.5 M EDTA. The aliquots were combined and subjected to electrophoresis, and the separated fragments were blotted to a Nytran membrane. Two specific vector probes (NruI/SalI and SalI/NruI) from the left and right sides of the BamHI cloning site, respectively, were labeled and hybridized to the blots. The exact maps were ascertained by subcloning either HindIII or EcoRI fragments into pUC19 for detailed restriction enzyme analyses.

DNA Sequence Analysis

DNA fragments containing the appropriate PRP coding regions and flanking regions were subcloned from cosmid clones. A nest of unidirectionally deleted inserts, from either end separately, was generated by Exonuclease III and S1 nuclease digestions as described (Henikoff, 1984). All DNA sequencing was performed by the method of Sanger et al. (1977) using [α-35S]dATP and Sequenase. Part of the repetitious exon II of each clone was confirmed by the chemical method of Maxam and Gilbert (1980). More than 90% of the nucleotide sequences were determined in both directions. Nucleotide sequences were analyzed using MBIR software (Manual of Molecular Biology Information Resource, 1989) provided by the University of Minnesota Microbiology Computer Group.

RESULTS AND DISCUSSION

Structure of the Rat PRP Gene Family

To detect genomic DNA fragments that encode PRP genes, high-molecular-weight DNA from the liver of an individual Sprague-Dawley rat was analyzed by Southern blotting using a cloned radiolabeled cDNA probe for rat PRP as described under Materials and Methods. Figure 1A shows that digestion of rat genomic DNA with infrequently cutting restriction enzymes generates a minimum of six unique DNA fragments containing more than 60 kb of genomic DNA that hybridize with different intensities to the PRP cDNA probe. The different intensities of the various restriction fragments can represent unequal portions of coding DNA or multiple copies of closely related genes. Because of the size of rat PRP genes, we constructed a genomic DNA library using a cosmid vector (pTCF) that can accommodate large DNA inserts. From two separately constructed and unamplified rat cosmid libraries, 1.4 × 10^6 recombinant cosmids were filter-duplicated and screened with the rat PRP cDNA probe. Ten colonies gave positive hybridization signals. After three cycles of colony purification, four colonies, RP4, RP5, RP13, and RP15, hybridized with the PRP probe. DNA prepared from these clones was analyzed by restriction enzyme digestion and Southern blotting as shown in Fig. 1B. Each of these four clones contains a unique PRP gene or portion of a PRP gene and was characterized in detail. However, our attempts to isolate the complete PRP locus were made more difficult by the very biased distribution against PRP cosmid clones (4 of 1.4 × 10^6 individual colonies). This cannot be explained by variability of hybridization signals during the screening of the libraries, since we were careful to include even weakly hybridizing colonies in our selection. One possible explanation is that there is a greater abundance of Sau3A sites within the gene clusters than in the

[Restriction maps: (A) RP4; (B) RP13 with overlapping clone RP5; (C) RP15. Scale bars, 1 kb.]

FIG. 2. Molecular map of the PRP genes. Exons are shown as filled boxes. The direction of transcription of the genes is shown by an arrow. Overlapping clones RP5 and RP13 are diagrammed below the restriction map (B). Indicated above the maps are the subclones used for nucleotide sequencing. The restriction enzymes used for mapping PRP genes were B, BamHI; E, EcoRI; H, HindIII; S, SalI; and X, XbaI. PstI sites (P) were mapped only within the subclones.

flanking region, although this is not reflected by any bias in the distribution of BamHI sites. A second possibility is that regions in the vicinity of the PRP genes could have a higher density of EcoK sites and therefore might be lost during in vitro packaging due to EcoK activity (Rosenberg, 1985). A third explanation is that certain regions either cannot be propagated in our cosmid vectors or give rise to such small colonies that no hybridization signal can be detected. A compiled restriction map of these four cosmid clones is shown in Fig. 2. Clones RP13 and RP5 are overlapping clones covering 51 kb of contiguous genomic DNA. These maps also demonstrate that RP4, RP13, and RP15 share some restriction sites within and near the PRP coding regions. For instance, there are internal EcoRI sites in exon I of all three clones, and at least two EcoRI sites are located near the 3' end of the PRP genes. A 4.8-kb HindIII fragment of RP4 and a 5.4-kb BamHI/EcoRI fragment of RP15 that hybridized to PRP probe were subcloned into pUC19 for sequence analysis.³ For clone RP13, a total of 5778 bp containing a PRP gene was sequenced.³ The three genes, each possessing three exons and two introns, have an organization similar to that of the mouse PRP genes MP2 and M14 that we described previously (Ann et al., 1988). All nucleotide sequences at the RNA splicing donor and acceptor sites in RP4, RP13, and RP15, except the first exon/intron junction (ATG/GT) of RP4 and RP13, are in agreement with consensus sequences described by Mount (1982). Exon I is approximately 97 bp in length. In addition to a short 5'-untranslated region, it encodes a secretory signal sequence 16 amino acids long and the first 5 residues of the region designated a "transition region" by Ann and Carlson (1985). Exon II, located

³ The sequencing data have been deposited with GenBank Library under Accession Nos. M36412 (RP4), M36413 (RP13), and M36414 (RP15).


FIG. 3. Composite comparisons of nucleotide and amino acid sequences of RP4 and rat PRP cDNA pRP25. The nucleotide sequence and encoded peptide sequence of exonic regions of RP4 are given. Differences in the nucleotide sequences and derived amino acid sequences observed in pRP25 (12) (dashes) are indicated. Asterisks (***) represent the translation stop codon TAA. Arrows indicate the exon/exon junctions. There are 32 nucleotide differences with 17 amino acid changes. The triangle (▼) indicates each tandemly repeated motif. Residues 90-108 are overlined to represent a typical 19-amino-acid repeat of RP4. Polyadenylation signal and ATG are boxed.

approximately 1600 bp downstream of exon I, is the largest exon in all three genes. Exon III is located approximately 600 to 700 bp downstream of exon II. This exon is untranslated and contains a polyadenylation signal, AATAAA (Breathnach and Chambon, 1981). The close homologies between the derived amino acid sequences of RP4 and RP15 and the known rodent PRP sequences (Carlson et al., 1986; Ziemer et al., 1984; Clements et al., 1985; Ann and Carlson, 1985) are aptly illustrated by comparing the exonic sequences of RP4 and rat PRP cDNA pRP25 (Clements et al., 1985) (Fig. 3), and exonic sequences of RP15 and mouse PRP gene MP2 (Ann and Carlson, 1985) (Fig. 4). As illustrated in Fig. 3, the sequences of RP4 and pRP25 are virtually identical through amino acid 60 or nucleotide 195. In the region 3' to this, single nucleotide substitutions are observed. It is not necessary to introduce any gaps or insertions to preserve the homology, and no premature termination codon is introduced by these nucleotide substitutions. There are 32 nucleotide substitutions out of 532 bases compared, resulting in the replacement of 17 of 172 amino acids. We conclude that either the derived peptide of pRP25 cDNA or a pRP25-like peptide is the gene product of rat PRP gene RP4.

The only open reading frame in RP15 encodes a polypeptide consisting of a tandemly repeated unit of 13 amino acids (Fig. 4). The repeat prototype of the derived amino acid sequence is PPPPGGPQQGPQG, which shares striking homology with the mouse PRP gene MP2 product, GP66sm (Ann and Carlson, 1985) (Fig. 4). GP66sm contains 13 tandemly repeated peptides of 14 amino acids with a prototype of PPPPGGPQPRPPQG. The first 8 amino acids and last 3 amino acids of these two prototype repeats are identical. The only differences are Q and G in positions 9 and 10 of RP15 compared to P, R, and P in positions 9, 10, and 11 of MP2, respectively. Detailed alignments of the nucleotide sequences of the repeat prototypes from RP15 and MP2 indicate that the observed amino acid differences were the result of two single base substitutions and an insertion of CCC (Fig. 6). Even the transition region (residues 17-51) and carboxyl-terminal region, which are the most divergent sections of all PRPs (Carlson et al., 1986), are highly conserved between RP15 and MP2 (Fig. 4). On the basis of these comparisons, we conclude that RP15 is the rat homolog of the mouse PRP gene MP2.

Nucleic acid homologies between RP13 and other PRP genes are discussed below.
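The identity figures implied by the RP4/pRP25 comparison, and the relationship between the RP15 and MP2 repeat prototypes, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the two prototype strings are written out from the description in the text (first 8 and last 3 residues identical, Q-G in RP15 versus P-R-P in MP2 in between), so they should be checked against Fig. 6 rather than taken as authoritative.

```python
def percent_identity(substitutions, compared):
    """Percentage of compared positions that are unchanged."""
    if compared <= 0:
        raise ValueError("compared must be positive")
    return 100.0 * (compared - substitutions) / compared


def common_prefix_len(a, b):
    """Length of the identical run at the start of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a, b):
    """Length of the identical run at the end of two sequences."""
    return common_prefix_len(a[::-1], b[::-1])


# Counts taken from the text: RP4 versus pRP25.
nt_identity = percent_identity(32, 532)   # nucleotides
aa_identity = percent_identity(17, 172)   # amino acids

# Prototype repeats as inferred from the description in the text
# (assumption: single-letter sequences reconstructed, not copied from Fig. 6).
RP15_REPEAT = "PPPPGGPQQGPQG"    # 13 residues
MP2_REPEAT = "PPPPGGPQPRPPQG"    # 14 residues

if __name__ == "__main__":
    print(f"nucleotide identity: {nt_identity:.1f}%")
    print(f"amino acid identity: {aa_identity:.1f}%")
    print("shared N-terminal residues:",
          common_prefix_len(RP15_REPEAT, MP2_REPEAT))
    print("shared C-terminal residues:",
          common_suffix_len(RP15_REPEAT, MP2_REPEAT))
```

With the counts given, the nucleotide identity is about 94% and the amino acid identity about 90%, and the inferred prototypes share 8 leading and 3 trailing residues, in line with the statements in the text.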

Comparative Analyses of PRP Genes

We have previously shown that PRPs are largely composed of a multiply repeated motif in a tandem arrangement that codes for a highly conserved proline-rich peptide (Carlson et al., 1986). The available gene structures for rat PRP genes RP4, RP13, and RP15 and mouse PRP gene MP2 make possible the analysis of the genomic organization of their repeats, evolution, and the possible factors that led to their structural divergences. Figure 5 shows intersequence comparison matrices of these four genes. Scaling of


the plots is proportional to the actual sizes of the genes. The same statistical thresholds were used for all comparisons (Lawrence and Goldman, 1988). In this analysis, if a given sequence is compared with itself, a diagonal at 135° is produced. Any additional lines parallel to the diagonal indicate internal homologies between segments whose 5' and 3' ends can be determined from upper left and lower right ends of the corresponding lines, respectively. This is most clearly demonstrated by the plot of RP4 versus RP13 in Fig. 5A. This plot shows virtually no spurious background. The displacement of the diagonal line indicates (i) the insertion of 29 nucleotides in intron I of RP13, (ii) fractional differences in the simple repetitive sequences (TAGA) between the RP13 and RP4 genes, and (iii) two fewer 57-nucleotide internal repeats in RP13 than in RP4. However, the sequences upstream from approximately -930 bp of RP4 and RP13 show little homology. Even the exons III of these two genes are not homologous, which is in contrast with that of RP15 versus MP2 (Fig. 5B). In Fig. 5B, it is demonstrated that the homologies between RP15 and MP2 extend across exon III with multiple displacements of the diagonal line. The existence of more than one major diagonal line in this plot (Fig. 5B) is due to the slippage caused by the presence of simple repetitive sequences (TAGA and TTA) in introns I and II of the MP2 gene (Ann and Carlson, 1985). The matrix comparison of RP15 versus RP4 is illustrated in Fig. 5C. This diagonal line contains several "blank zones" located essentially in the 5'-upstream region, intron I, and the area 3'-downstream from exon II. The appearance of blank zones indicates more extensive sequence substitutions between RP15 and RP4. However, when the pattern fades, it always reappears on the same diagonal line, indicating that the overall homology between these two genes is intact (i.e., uninterrupted by multiple deletions and/or insertions).
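The way a comparison matrix reveals insertions as displaced diagonals can be illustrated with a small windowed dot plot. This is a generic sketch and not the DOTMATRIX variant used for Fig. 5: the window length, the exact-match criterion, the toy sequences, and all function names are assumptions chosen for illustration. The `sd_score` helper only restates the thresholding formula given in the Fig. 5 legend (similarity score minus expected score, divided by the standard deviation).

```python
def dot_matrix(seq_a, seq_b, window=8, min_matches=8):
    """Return (i, j) pairs where the windows starting at seq_a[i] and
    seq_b[j] share at least min_matches identical positions."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= min_matches:
                hits.append((i, j))
    return hits


def diagonals(hits):
    """Offsets (j - i) of the diagonals on which hits fall."""
    return sorted({j - i for i, j in hits})


def sd_score(similarity, expected, sd):
    """Score in standard-deviation units, as defined in the Fig. 5 legend."""
    return (similarity - expected) / sd


# Toy sequences (hypothetical): seq_b is seq_a with a 3-nucleotide insertion.
seq_a = "GATTACAGGCTTAACGTCCA"
seq_b = seq_a[:10] + "CCC" + seq_a[10:]

if __name__ == "__main__":
    print("self comparison diagonals:", diagonals(dot_matrix(seq_a, seq_a)))
    print("a versus b diagonals:", diagonals(dot_matrix(seq_a, seq_b)))
    # Legend values: expected score 4.99, standard deviation 0.43,
    # threshold 3.0 SD units, i.e. a raw similarity score of about 6.28.
    print("raw score at threshold:", 4.99 + 3.0 * 0.43)
```

In the self comparison every hit lies on the main diagonal (offset 0). With the insertion, hits downstream of the insertion point move to a parallel diagonal offset by the insertion length, which is the same effect the text describes for the 29-nucleotide insertion in intron I of RP13.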
FIG. 4. Composite comparisons of nucleotide and amino acid sequences of RP15 and MP2. The nucleotide sequence and encoded peptide sequence of exonic regions of MP2 (GP66sm (22)) are given. Differences in the exonic sequences and derived amino acids observed in RP15 (dashes) are indicated. Lowercase letters are unaligned bases in RP15. Gaps are introduced as dots (.). Asterisks (***) represent the translation termination codon TAA or TGA. Arrows indicate exon/exon junctions. Polyadenylation signals and ATG are boxed. Residues 94-107 are overlined to represent a typical 14-amino-acid repeat of MP2, and residues 92-104 are underlined to represent a typical 13-amino-acid repeat of RP15. The triangles (▲ or ▼) indicate each tandemly repeated motif. The transition region and carboxyl-terminal region are bracketed.

FIG. 5. Intersequence comparison matrices. The program used (25) is a variant of DOTMATRIX. The exons are shown as cross-hatched boxes. The positions of two LIRn inserts in RP13 are indicated. The direction of transcription of the gene or the inserted LIRn is shown by an arrow. The program searches for similarity between pairs of nucleotide sequences. A threshold SD score ((similarity score - expected score)/standard deviation) of 3.0 was used for plotting (the expected similarity score for two random sequences is 4.99, the standard deviation is 0.43). This threshold SD score was approximately equivalent to a minimum length of 15 nucleotides for a homology domain. Each scale along the left ordinates and lower abscissas represents 1000 bp.

It is worthwhile to note that sequences of RP15 and RP4 located in the region 5'-upstream from -930 bp still remain homologous in several segments, which is in contrast to the comparison of RP13 and RP4 (Fig. 5A versus Fig. 5C). The most surprising results were observed with exon III of RP4. Although the rat RP4 gene is closely related to other PRP genes when considered in their entireties, exon III of RP4 shares less than 50% homology with the corresponding exon III of MP2, RP13, and RP15 (Figs. 5A, C, and E versus Figs. 5B and F). The lack of conservation of exon III and the 3'-downstream regions of RP4 implies that these segments may not play an important role in regulating PRP gene expression. Because of the frequent correlation of exons with functional domains in proteins (Gilbert, 1985), we analyzed individually the similarity of exons and introns in the rat and mouse PRP genes. The results are shown in Table 1. Exon I of the PRP genes, representing the 5'-untranslated region and signal peptide, is the most significantly conserved. This result was not unexpected because a great degree of homology between exons I of all PRPs from different species and the closely related GRPs has been documented previously by Ann et al. (1988). Analysis of exon II (which is the largest exon and contains most of the highly conserved tandem repeats) confirmed our earlier conclusion that the repeated motifs in RP13, RP4, and RP15 are related to that in the mouse PRP gene, MP2. Optimal alignments for the prototype repeat of these four genes are shown in Fig. 6. Although exons II from RP13 and RP4 share the greatest sequence homology, exons II from RP15 and MP2 are more closely related in length. Sequence conservation of this region might imply some essential function of these proteins. It is likely that careful comparative analyses of other individual PRP and GRP exons may

also reveal some unexpected homologies of functional or evolutionary significance.

TABLE 1
Percentage Identity of Nucleotide Sequences of 5'-Flanking Regions, Exons, and Introns of PRP Genes(a)

Number of nucleotide substitutions
Region                 A      B      C      D      E      F
5'-Flanking region     294    271    289    278    97     310
Exon I                 13     13     13     15     2      15
Intron I               864    777    899    806    103    834
Exon II(b)             125    121    96     180    9      100
Intron II(c)           279    143    425    209    381    558
Exon III               32     33     63     26     61     63

Number of nucleotides compared
Region                 A      B      C      D      E      F
5'-Flanking region     787    738    762    966    935    973
Exon I                 97     97     97     97     97     97
Intron I               1731   1813   1783   1849   1528   1867
Exon II(b)             495    743    539    484    455    563
Intron II(c)           583    464    744    444    730    928
Exon III               124    125    120    120    120    120

Percentage identity
Region                 A      B      C      D      E      F
5'-Flanking region     63     63     62     71     90     62
Exon I                 87     87     87     85     98     85
Intron I               50     57     50     56     93     55
Exon II(b)             75     84     82     63     98     82
Intron II(c)           52     69     43     53     48     40
Exon III               74     74     48     78     49     48

(a) Comparisons: A, RP13 versus MP2; B, RP15 versus MP2; C, RP4 versus MP2; D, RP13 versus RP15; E, RP13 versus RP4; F, RP15 versus RP4.
(b) Only corresponding repeats of each pair are taken into account (see text) since major part of exon II comprises tandem repeats.
(c) The LIRn-PRP2 insert of RP13 is not included.

RP13 and RP4 Originated from a Common Ancestral Gene

Alignment of RP13 and RP4 reveals considerably higher similarity than the other comparisons (Table 1 and Fig. 5). The similarity of the first two exons is approximately 98%, and the promoter and 5'-upstream regions exhibit 89.6% identity (Table 1). Surprisingly, the average similarity of intron I between RP13 and RP4 is 93.3%, which is higher than that of

exon I among other pairs of PRPs compared (Table 1). Although interpretation of these values is difficult inasmuch as gene conversion-like mechanisms may obscure evolutionary distances, the remarkable similarity of RP13 and RP4 suggests that these PRP genes evolved from a common ancestor. On the assumption that selective pressures operating on the introns of the PRP genes are insignificant and that the accumulation of mutations occurs with similar frequencies in the genes, the nucleotide sequence differences provide an estimate of the time elapsed since the genes diverged. It has been estimated that introns accumulate 0.3% nucleotide dif-

FIG. 6. Comparison of consensus repeat sequences of rat and mouse PRP genes. (A) Nucleotide sequences of consensus repeats from RP15 (nucleotides 307-345, Fig. 4), RP4 (nucleotides 302-358, Fig. 3), MP2 (nucleotides 313-354, Fig. 4), and RP13 (corresponding to nucleotides 302-358 of RP4) are aligned. Gaps introduced in the alignment are shown as (.). Vertical lines indicate identical nucleotides and asterisks indicate silent substitutions. The number of nucleotides in each consensus repeat sequence is shown in parentheses. The codons for different amino acids of RP15 and MP2 are underlined. (B) The derived amino acid sequences from each consensus repeat in A are aligned to show maximal homology. Gaps are represented as (.). Vertical lines indicate identical amino acids and asterisks indicate conservative substitutions. The number of amino acids in each repeat is shown in parentheses.

FIG. 7. Multiple rearrangements in 3'-regions of RP13 and RP4. Identical bases (dashes), aligned nonidentical bases (uppercase), or unaligned bases (lowercase) of RP13 (upper row) and RP4 (lower row) are indicated. Gaps introduced for proper alignment are shown as dots. The two 12-bp perfect repeats in RP13 and one repeat with a G/C mismatch counterpart in RP4 are boxed. The intron II/exon III junction is indicated. The polyadenylation signal of RP13 and RP4 is boxed and underlined.

110

LIN

AND

ANN

w13 MP 2 Fe 4 RP15

l-7491 (-684) C-748) (-75.9)

: : : :

CTTAAATAGGACAATTGAMAGT' ---T-----A------CT---A ------G-M------A-G--AG ---------A-TG---T----TA(

RP13 MP 2 RP 4 RP15

f-6891 (-626) t-688) (-698)

: : : :

CAATGAATCATCAAGGAATTAGAGATT.CATAATACTGACTTGTATAAATCAAATATGGG A--CA-C---C---,------CG----,--AG---T-----CA-G--TGC-TC-------T--------A-----T------C---.--,---G-A----------C-----T-C------A-GCA-C-T------------~-----~-T-G-----A-----A--A~GG------CC-C-C-T-

Rp13 MP 2 RP 4 P.Pl5

t-630) (-568) (-629) (-638)

: TT.....TCCAGGTTCTTTTTACTTGGATTTTAG.TGAAATCTTAGAACATATGT -----AT-GG-CT--A---a-----...---GT-TC---C : --gtgac----.... ----A-CCA------------------,--T---AC---G------A: .-..... : --.....-TG--AC-ACA---G--G--------T.---TG-AA---GT--C----

RP13 MP 2 RI 4 RP15

(-576) (-515) (-576) (-584)

: : : :

RP13 MP 2 W 4 RP15

(-526) f-465) (-526) (-524)

: : : :

RP13 MP 2 RP 4 w15

(-484) (-409) (-484) (-484)

: : : :

RP13 MP 2 w 4 P.Pl5

(-424) (-349) (-424) (-434)

: : : :

RF13 MP 2 F'.P 4 RP15

(-366) l-290) (-366) (-375)

: : : :

RP13 MP 2 MP 4 W15

l-306) (-231) t-306) (-316)

: : : :

AGTATTATACAGAGAGTCTCAT -T-GA--C---A-A-A-G---G -------------C----------GA--C-----A-A-A-T-G

F'S13 MP 2 W 4 RP15

(-255) (-180) (-255) (-256)

: : : :

. . . ..GATCCTATGACTTCAATATCTGGTCTATTCTCTTGAACATT . . . ..AG--..--T---G-----A----GT---..........------A-A-AGG-G-. .._.~---___--______--____________________------------------caatcA--TG----,TA-------A---G----A--------

RP13 MP 2 RP 4 RPl5

f-200) (-138) (-2001 (-197)

: : : :

WULTATRATCCTTGTTGTACCATCGAGRAAGGCAGCACTGT . . . . . . . . . . . --T--C--GT-GG--C-CTA-----...,.T-G--C---C--------_______________-________________________-------------------T-C-C-----A------A-----TT-----A-GCAT----G--A--T---C---------

Fe13 MP 2 Fe 4 P.Pl5

(-140) I- 94) (-140) (-137)

: : : :

CCTGCTGGGCAAATATCCCAGTGTGGAGTCAGGGATGCAAT ----A--------AG---------.................................... --------------G----------------------T---------------A-------A----------AG--------------A-A----------------------------

RP13 MP 2 RP 4 w15

( ( ( (

-80) -70) -80) -77)

: : : :

RF13 MP 2 RP 4 Fe15

( ( ( (

-21) -21) -21) -20)

: : : _--__--____-___ : -T-.-T----C---

RP13 MP 2 RP 4 Fe15

( ( ( (

+40) t4uj +401 +401

: : : :

d I

TTTTGAGGG.TTAGC............. . . ..TGACTGGGTGTAATAATTC -AC-----C.-----atgttagggattccaagG-T--T----...-----G --------. ----. . . . . . . . . . . . . . . . . --G-A--------C---C-A-G--Tt-G-AG.................---...T--T-C--G---a TGGACACCATCAATATTAAGTAAGTACAGTATAGGG -CA---------C-C----C---T--T--A-T-----A-A---T-------CA----------------C---C-G-------------------------T-----T-C-.... ---CA-T--A-------M-T----C-GAGAGACAGTG.CCT.GGTTCAATCTTGTTA -A---A-T-T.AT-aCC----G-T-CM-G

FIG. 8. Comparison of the 5’-flanking and exon I sequences of rat and mouse PRP taken from Ann and Carlson (1). Identical nucleotides (dashes), aligned nonidentical

CTATACTGAT......... ----T--T-......... ----------._....._. A----AC---tgtatttat

genes. The nucleotides

sequences of mouse PRP gene MP2 are (uppercase), or unaligned nucleotides

SALIVARY

GLAND-SPECIFIC

ference in one million years (Maeda et al., 1983). Given this figure it is evident the primordial RP13 and RP4 duplicated 21.5 million years ago. This event may have occurred after the time of mammalian speciation. Accordingly, RP13 and RP4 are the analogs of the two very closely related mouse PRP genes, MP2 and M14, that we reported previously (Ann et al., 1988). Excluding the TAGA simple repetitive sequences in RP4 and RP13 and two 57-nucleotide repeats of exons II in RP13, RP13 and RP4 share 93.7% identity extending from 930 nucleotides upstream from the cap site to 238 nucleotides after the second exon/intron boundary (Fig. 5A). However, nucleotide sequences further upstream and downstream from this region, except for one 39-nucleotide stretch in intron II, are not homologous (>70% divergence, Fig. 5A). Thus, the comparisons reveal that these two regions in RP13 are essentially unrelated to the corresponding regions from RP4. To find a possible source for these two DNA fragments, a computer search through the GenBank library (version .62) for sequences homologous to the sequences from these two regions in RP13 was performed. The results of this search suggested that the 5’-upstream segment of RP13 (LIRnPRPl) was a member of the rat long interspersed repeat DNA (LIRn) family. Comparison of the first 1409 sequenced nucleotides (-931 to -2339) of RP13 and the rat LINE3 (d’Ambrosio et al., 1986) clearly showed that rat LINE3 nucleotide sequences from 5705 to 7023 had been transposed into the 5’-upstream region of RP13 with the same transcriptional direction (data not shown). This LIRn-PRPl, like most rodent LINE elements, contains the typical polyadenylation signal AATAAA and an A-rich homopurine stretch (d’Ambrosio et al., 1986; Soures et al., 1985). Several features of the nucleotide sequences present in RP13 and RP4 suggest that the intron II of RP13 and the exon III of RP4 have undergone multiple rearrangements. 
The first of these features is that RP13 contains an insertion of 334 bp flanked by 12-bp direct repeats (AGCCCTTTTCTT) (Fig. 7). A search of the GenBank library indicated that the 334-bp inserted DNA fragment (L1Rn-PRP2) was also a member of the L1Rn family (Soares et al., 1985; d’Ambrosio and Furano, 1987). Detailed alignment suggested that nucleotide sequences 51 to 283 of RLl1 (d’Ambrosio and Furano, 1987), along with an A/G-rich homopurine segment, had been transposed into the second intron in the transcriptional direction opposite to that of RP13. The 12-bp repeat sequence (AGCCCTTTTCTT) is present only once, with one mismatch, in RP4 (Fig. 7). The presence of this direct repeat sequence, apparently generated by duplication of the single target sequence, supports the hypothesis that L1Rn-PRP2 entered the RP13 locus via transposition.

Second, the 39-bp segment immediately following the second copy of the 12-bp direct repeat in intron II of RP13 is nearly identical, with the exception of one A/T mismatch, to the analogous region of RP4 (Fig. 7). However, the DNA sequences from RP4 and RP13 that follow this 39-bp stretch are unrelated (Fig. 7). As shown in Table 1, RP13 and RP4 share less than 50% identity in their exon III, whereas exon III of RP13 shares more than 74% identity with exon III from either RP15 or MP2. Thus, exon III and regions further downstream in RP4 are almost completely unrelated to those of other PRP genes, except for the functionally conserved sequences for polyadenylation. Nucleotide sequences of exon III of RP4 do not display any similarity with other known sequences reported in GenBank to date.

A complete history of the rearrangements of the 5’ and 3’ regions of RP13 and RP4 cannot be deduced from our analyses. However, the data indicate that several interdependent and/or independent rearrangements have occurred in these two regions. For example, the results suggest that after RP13 and RP4 evolved from their common ancestral gene, at least one rearrangement resulted in replacement of the region 5’-upstream from -930 in RP13 with a member of L1Rn. A separate rearrangement could have resulted in the introduction of L1Rn-PRP2 into intron II of RP13 in the transcriptional direction opposite to those of RP13 and L1Rn-PRP1. A quite recent substitution in RP4 immediately follows the 39-bp conserved segment (Fig. 7), spans intron II and exon III, and includes the AATAAA polyadenylation signal.
This substitution accounts for the divergence of sequences between exon III of RP4 and that of RP13 and other PRPs.
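
The flanking-repeat argument in this section can be illustrated with a short Python sketch. It is not the procedure used in this study (which relied on a GenBank search and sequence alignment); it only shows, under stated assumptions, how one might check that the 12-mers immediately bordering an inserted segment are direct repeats, as expected for a target-site duplication. The function names `hamming` and `flanking_repeats`, the toy flanking sequences, and the 20-bp placeholder insert are hypothetical; only the 12-bp repeat AGCCCTTTTCTT is taken from the text.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def flanking_repeats(seq: str, ins_start: int, ins_end: int, k: int = 12):
    """Return the k-mers directly left and right of seq[ins_start:ins_end]
    and the number of mismatches between them."""
    if ins_start < k or ins_end + k > len(seq):
        raise ValueError("not enough flanking sequence")
    left = seq[ins_start - k:ins_start]   # copy preceding the insert
    right = seq[ins_end:ins_end + k]      # copy following the insert
    return left, right, hamming(left, right)


TSD = "AGCCCTTTTCTT"        # 12-bp direct repeat reported in the text
toy_insert = "ACGT" * 5     # hypothetical placeholder, NOT the real 334-bp element
toy_locus = "GATTACA" + TSD + toy_insert + TSD + "CCGGAA"  # made-up flanks

start = len("GATTACA") + len(TSD)   # first base of the placeholder insert
end = start + len(toy_insert)       # one past its last base
left, right, mismatches = flanking_repeats(toy_locus, start, end)
print(left, right, mismatches)
```

Reporting a mismatch count rather than a yes/no answer is a deliberate choice: a repeat copy can diverge over time, and the text notes that the single copy of this sequence in RP4 differs from the RP13 repeat at one position.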

Concerted Evolution

of PRP Genes

As shown in Table 1, the percentage identity between the corresponding exons of the PRP genes is significantly higher than that in the corresponding introns, indicating that selection and/or concerted evolution has occurred over an extended time scale. The repetitious exon II shows a percentage difference between RP13 and RP15 higher than that between other exons (Table 1). Of special interest is the observation that substitutions and insertions that occurred in the repeated motif of exon II have been propagated to all corresponding positions within exon II. Thus the tandemly repeated motif appears to have undergone concerted, or horizontal, evolution. This observation suggests that intragenic unequal crossing over within the tandemly repetitious exon II of the PRP genes occurs more frequently than intergenic exchange, thus allowing the rapid spread of base substitutions to other repeats within the same gene. This analysis also predicts that exchanges between the RP13 and RP15 genes are very rare relative to the intragenic exchange, resulting in a rapid concerted evolution within the repetitious exon. This is also consistent with the observation of Lendahl et al. (1987) with regard to the Balbiani ring genes. Concerted evolution of a tandemly repeated motif via intragenic unequal crossing over may explain why repetitive DNA seems to be subject to evolutionary changes more frequently than nonrepetitive DNA. Concerted evolution in exon II is of note because it occurs within a single gene, rather than in an array of genes.

(lowercase) of RP13, MP2, RP4, and RP15 are indicated. Gaps introduced for proper alignment are indicated as dots. Transcription initiation sites are indicated by arrows and designated as +1. The first translated ATG is boxed and overlined, and the TATAAA is boxed and double overlined. The exon/intron junction is indicated. Five conserved segments with possible involvement in salivary-specific expression (see text) are highlighted.

One of the purposes of this study was to compare the promoter sequences of the rat and mouse PRP genes, with the expectation that sequences required for tissue-specific or inducible regulation should be conserved. This was made more difficult by the fact that little information is available in the literature regarding functional elements of parotid-specific promoters compared with those of other tissues such as pancreas or liver. The sequence comparisons of exon I and the 5’-flanking region of RP4, RP13, RP15, and MP2 are presented in Fig. 8. There is at least 82% sequence conservation among the rat PRP promoters in the region from -250 to +1. The mouse gene is distinguished from the rat PRP genes by a 45-bp deletion corresponding to -72 to -116 of RP13 (Fig.
8). In the regions upstream from -250 bp, all the rodent PRP genes share at least 62% homology. However, five segments in the 5’-flanking region (corresponding to -726 to -710, -581 to -534, -493 to -477, -411 to -396, and -287 to -268 of RP13, Fig. 8) are greater than 80% identical among RP13, RP4, RP15, and MP2. These conserved segments in the 5’-flanking region may contain the element(s) controlling salivary-specific and inducible gene expression. Gumucio et al. (1988) also proposed, from an analysis of mouse salivary amylase genes and mouse and human PRP genes, that several short sequence elements are associated with parotid-specific expression. However, the elements identified by Gumucio et al. (1988) are not evident in the rat PRP genes; likewise, the elements that we have identified do not agree with their sequences. This may be due to different species-specific regulatory mechanisms. Recently, Cockell et al. (1989) attempted to pinpoint pancreas-specific promoter element(s) by examining the 5’-flanking sequences of pancreas-specific α-amylase 2, trypsin a and d, elastase 1 and 2, and chymotrypsin A and B genes. The authors were unsuccessful in this attempt and suggested that the search for tissue-specific promoter elements may be complicated by the bipartite organization of these elements as well as the degenerate nature of the recognition motifs. Another reason it may be difficult to locate pancreas-specific promoter elements is that the pancreas-specific transcription factor (PTF1) contains two different subunits, each capable of interacting with a separate DNA domain (Cockell et al., 1989; Roux et al., 1989). We are currently assessing the role of 5’-flanking sequences of the rat PRP genes by direct biological tests and DNA-protein interaction assays.

It is apparent that gene duplications played an important role in the evolution of the super multigene family of PRP and GRP. The similarities in exon/intron structure, tandemly repeated motifs, and nucleotide sequences among family members all support this point. Although the possible regulatory mechanism(s) of salivary-specific expression has not been resolved, the analysis of the rat PRP genes and their sequence comparisons point to specific examples of concerted evolution in the PRP genes. Similar events are likely to have occurred throughout the evolutionary development of the mammalian genome.

ACKNOWLEDGMENTS

Some preliminary observations were initiated when D.A. was in Dr. Don M. Carlson’s laboratory at the University of California, Davis, and supported by the Research Grant DK36812 (to D.M.C.) from the National Institute of Health. The authors thank Drs. Don M. Carlson and Reen Wu for many stimulating discussions and suggestions. The assistance from Dr. Ernest Retzel from the Department of Microbiology, University of Minnesota, with the computer sequencing data analysis is appreciated. This research is supported by grants from the National Institute of Dental Research, National Institute of Health (R29-DE09175), and Minnesota Medical Foundation to D.A.

REFERENCES

1. ANN, D. K., AND CARLSON, D. M. (1985). The structure and organization of a proline-rich protein gene of a mouse multigene family. J. Biol. Chem. 260: 15863-15872.
2. ANN, D. K., CLEMENTS, S., JOHNSTONE, E. M., AND CARLSON, D. M. (1987a). Induction of tissue-specific proline-rich protein multigene families in rat and mouse parotid glands by isoproterenol: Unusual strain differences of proline-rich protein mRNAs. J. Biol. Chem. 262: 899-904.
3. ANN, D. K., GADBOIS, D., AND CARLSON, D. M. (1987b). Structure, organization, and regulation of a hamster proline-rich protein gene: A multigene family. J. Biol. Chem. 262: 3958-3963.
4. ANN, D. K., SMITH, M. K., AND CARLSON, D. M. (1988). Molecular evolution of the mouse proline-rich protein multigene family: Insertion of a long interspersed repeated DNA element. J. Biol. Chem. 263: 10887-10893.
5. BENNICK, A. (1982). Salivary proline-rich proteins. Mol. Cell. Biochem. 45: 83-99.
6. BIRNBOIM, H. C. (1983). A rapid alkaline extraction method for the isolation of plasmid DNA. In “Methods in Enzymology” (R. Wu, L. Grossman, and K. Moldave, Eds.), Vol. 100, pp. 243-255, Academic Press, New York.
7. BREATHNACH, R., AND CHAMBON, P. (1981). Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50: 349-383.
8. BROWN-GRANT, K. (1961). Enlargement of salivary gland in mice treated with isopropylnoradrenaline. Nature (London) 191: 1076-1078.
9. CARLSON, D. M., ANN, D. K., AND MEHANSHO, H. (1986). Proline-rich proteins: Expressions of salivary multigene families. In “Microbiology” (American Society for Microbiology), pp. 303-306.
10. CLEMENTS, S., MEHANSHO, H., AND CARLSON, D. M. (1985). Novel multigene families encoding highly repetitive peptide sequences: Sequence analyses of rat and mouse proline-rich protein cDNAs. J. Biol. Chem. 260: 13471-13477.
11. COCKELL, M., STEVENSON, B. J., STRUBIN, M., HAGENBÜCHLE, O., AND WELLAUER, P. K. (1989). Identification of a cell-specific DNA-binding activity that interacts with a transcriptional activator of genes expressed in the acinar pancreas. Mol. Cell. Biol. 9: 2464-2476.
12. D’AMBROSIO, E., WAITZKIN, S. D., WITNEY, F. R., SALEMME, A., AND FURANO, A. V. (1986). Structure of the highly repeated, long interspersed DNA family (LINE or L1Rn) of the rat. Mol. Cell. Biol. 6: 411-424.
13. D’AMBROSIO, E., AND FURANO, A. V. (1987). DNA synthesis arrest sites at the right terminus of rat long interspersed repeated (LINE or L1Rn) DNA family members. Nucleic Acids Res. 15: 3155-3175.
14. DILELLA, A. G., AND WOO, S. L. C. (1987). Cloning large segments of genomic DNA using cosmid vectors. In “Methods in Enzymology” (S. L. Berger and A. R. Kimmel, Eds.), Vol. 152, pp. 199-212, Academic Press, New York.
15. GILBERT, W. (1985). Genes-in-pieces revisited. Science 228: 823-824.
16. GROSVELD, F. G., LUND, T., MURRAY, E. J., MELLOR, A. L., DAHL, H. H. M., AND FLAVELL, R. A. (1982). The construction of cosmid libraries which can be used to transform eukaryotic cells. Nucleic Acids Res. 10: 6715-6732.
17. GUMUCIO, D. L., WIEBAUER, K., CALDWELL, R. M., SAMUELSON, L. C., AND MEISLER, M. H. (1988). Concerted evolution of human amylase genes. Mol. Cell. Biol. 8: 1197-1205.
18. HANAHAN, D., AND MESELSON, M. (1983). Plasmid screening at high colony density. In “Methods in Enzymology” (R. Wu, L. Grossman, and K. Moldave, Eds.), Vol. 100, pp. 333-342, Academic Press, New York.
19. HEINRICH, G., AND HABENER, J. J. (1987). Genes encoding proteins with homologous contiguous repeat sequences are highly expressed in the serous cells of the rat submandibular gland. J. Biol. Chem. 262: 5262-5270.
20. HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28: 351-359.
21. KIM, H.-S., AND MAEDA, N. (1986). Structures of two HaeIII-type genes in the human salivary proline-rich protein multigene family. J. Biol. Chem. 261: 6712-6718.
22. LAWRENCE, C. B., AND GOLDMAN, D. A. (1988). Definition and identification of homology domains. Comput. Appl. Biosci. 4: 25-33.
23. LENDAHL, U., SAIGA, H., HÖÖG, C., EDSTRÖM, J.-E., AND WIESLANDER, L. (1987). Rapid and concerted evolution of repeat units in a Balbiani ring gene. Genetics 117: 43-49.
24. MAEDA, N., BLISKA, J. B., AND SMITHIES, O. (1983). Recombination and balanced chromosome polymorphism suggested by DNA sequences 5’ to the human δ-globin gene. Proc. Natl. Acad. Sci. USA 80: 5012-5016.
25. Manual of Molecular Biology Information Resource (1989). Department of Cell Biology, Baylor College of Medicine, Houston, TX.
26. MAXAM, A. M., AND GILBERT, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavage. In “Methods in Enzymology” (L. Grossman and K. Moldave, Eds.), Vol. 65, pp. 499-560, Academic Press, New York.
27. MEHANSHO, H., HAGERMAN, A., CLEMENTS, S., BUTLER, L., ROGLER, J., AND CARLSON, D. M. (1983). Modulation of proline-rich protein biosynthesis in rat parotid glands by sorghums with high tannin levels. Proc. Natl. Acad. Sci. USA 80: 3948-3952.
28. MEHANSHO, H., CLEMENTS, S., SHEARES, B. T., SMITH, S., AND CARLSON, D. M. (1985). Induction of proline-rich glycoprotein synthesis in mouse salivary glands by isoproterenol and by tannins. J. Biol. Chem. 260: 4418-4423.
29. MEHANSHO, H., ANN, D. K., BUTLER, L. G., ROGLER, J., AND CARLSON, D. M. (1987). Induction of proline-rich proteins in hamster salivary glands by isoproterenol treatment and an unusual growth inhibition by tannins. J. Biol. Chem. 262: 12344-12350.
30. MEHANSHO, H., BUTLER, L. G., AND CARLSON, D. M. (1987). Dietary tannins and salivary proline-rich proteins: Interactions, induction, and defense mechanisms. Annu. Rev. Nutr. 7: 423-440.
31. MIRELS, L., BEDI, G. S., DICKINSON, D. P., GROSS, K. W., AND TABAK, L. A. (1987). Molecular characterization of glutamic acid/glutamine-rich secretory proteins from rat submandibular glands. J. Biol. Chem. 262: 7289-7297.
32. MOUNT, S. M. (1982). A catalogue of splice junction sequences. Nucleic Acids Res. 10: 459-472.
33. ROSENBERG, S. (1985). EcoK restriction during in vitro packaging of coliphage lambda DNA. Gene 39: 313-319.
34. ROUX, E., STRUBIN, M., HAGENBÜCHLE, O., AND WELLAUER, P. K. (1989). The cell-specific transcription factor PTF1 contains two different subunits that interact with the DNA. Genes Dev. 3: 1613-1624.
35. SANGER, F., NICKLEN, S., AND COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.
36. SCHNEYER, C. A. (1972). Regulation of salivary gland size. In “Regulation of Organ and Tissue Growth” (R. J. Cross, Ed.), pp. 211-232, Academic Press, New York.
37. SOARES, M. B., SCHON, E., AND EFSTRATIADIS, A. (1985). Rat LINE1: The origin and evolution of a family of long interspersed middle repetitive DNA elements. J. Mol. Evol. 22: 117-133.
38. SOUTHERN, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98: 503-517.
39. ZIEMER, M. S., SWAIN, W. F., RUTTER, W. J., CLEMENTS, S., ANN, D. K., AND CARLSON, D. M. (1984). Nucleotide sequence analysis of a proline-rich protein cDNA and peptide homologies of rat and human proline-rich proteins. J. Biol. Chem. 259: 10475-10480.
