Plant Molecular Biology 18: 581-584, 1992. © 1992 Kluwer Academic Publishers. Printed in Belgium.
Update section Short communication
Molecular identification of a soybean protein kinase gene family by using PCR
Xin-Hua Feng 1,2, Paul J. Bottino 2,3 and Shain-dow Kung 1,2,*
1 Center for Agricultural Biotechnology (* author for correspondence); 2 Department of Botany and 3 Maryland Agricultural Experiment Station, University of Maryland, College Park, MD 20742-5815, USA
Received 30 August 1991; accepted 2 September 1991
Key words: gene family, polymerase chain reaction, protein kinases, signal transduction, soybean (Glycine max L.)
Abstract In this study we report the identification of six members of a protein kinase gene family from soybean (Glycine max L.). Two fully degenerate oligonucleotide primers corresponding to two conserved motifs (DLKPENV and GTHEYLAPE) in the catalytic domains of eukaryotic protein serine/threonine kinases were used in a polymerase chain reaction (PCR) to amplify soybean cDNA. Sequence analysis showed that 28 of the PCR sequences represented six different putative protein serine/threonine kinases. These results not only demonstrate that catalytic domains of protein kinases are highly conserved between plants and other eukaryotes but also suggest that there are multiple genes encoding protein kinases in plants.
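To give a sense of what "fully degenerate" implies for the two motifs, the following minimal sketch counts how many distinct nucleotide sequences can encode each peptide under the standard genetic code. The function name `degeneracy` is hypothetical, and the sketch assumes that every codon of each motif is covered in full; the actual primer sequences are described in [5] and are not reproduced here, so the numbers are at most an upper bound on primer complexity.

```python
# Number of synonymous codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode `peptide`.

    Assumption (not from the paper): all codon positions of the motif
    are included in the primer, so this is an upper bound.
    """
    total = 1
    for residue in peptide:
        total *= CODON_COUNTS[residue]
    return total


if __name__ == "__main__":
    for motif in ("DLKPENV", "GTHEYLAPE"):
        print(motif, degeneracy(motif))
```

Under these assumptions the heptapeptide DLKPENV corresponds to 1536 possible coding sequences and GTHEYLAPE to 24576, which illustrates why primer design for conserved motifs usually has to trade off degeneracy against specificity.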
Phosphorylation and dephosphorylation of proteins play a pivotal role in regulating key cellular functions in eukaryotic organisms. In animals and lower eukaryotes, protein kinases have been characterized in detail both at the biochemical and genetic levels [6, 9]. However, the corresponding pathways in plants where protein kinases are involved have not been clearly delineated. In order to understand the molecular basis of signal transduction in plants, it is necessary to identify the molecular components and intracellular targets that are responsive to environmental signals. A necessary step towards this goal is to study the role of protein kinases. Based on the
observation that protein kinases with analogous functions are structurally conserved [6], molecular cloning techniques can be used to obtain genes encoding protein kinases and to dissect these genes for their specific functions. In particular, PCR has become a popular method for isolating additional members of protein kinase gene families [5, 7, 11].
In this study we applied PCR to isolate cDNA sequences encoding protein kinases from soybean. We used fully degenerate oligonucleotide primers in PCRs on soybean cDNA. The primers PK-A and PK-B correspond to two closely spaced amino acid sequences that are highly conserved in the catalytic domains of eukaryotic protein serine/threonine kinases [5, 6]. They represent two of the eleven subdomains that are thought to be involved in phosphorylation activity and, therefore, would be expected to be present in all protein serine/threonine kinase sequences. The sequence of the forward primer PK-A was derived from the amino acid sequence DLKPENV and that of the reverse primer PK-B from the amino acid sequence GTHEYLAPE [5]. The distance between the PK-A and PK-B primer pair in a cDNA would be ca. 140 to 370 bp [5]. Amplification of soybean cDNA with these primers indeed produced several visible bands at and between these sizes (Fig. 1). Cloning of these PCR products into an M13 vector and subsequent sequence analysis of the inserts yielded 28 PCR cDNA sequences, from which 6 amino acid sequences characteristic of eukaryotic protein kinases were identified (Fig. 2). Specifically, these sequences (designated GmPK1, GmPK2, GmPK3, GmPK4, GmPK5, and GmPK6 in order of decreasing length) contained a highly conserved tripeptide DFD or DFG (single-letter amino acid code) found in all protein kinases. The choice of primers and the sequence similarity to protein serine/threonine kinases suggest that these six clones represent six members of the family.
Fig. 1. Fractionation of PCR products. Poly(A) RNA was prepared from entire 7-day-old seedlings of soybean (Glycine max L. cv. Essex) with oligo(dT) cellulose by using a FastTrack mRNA isolation kit (Invitrogen). Reverse transcription-PCR (RT-PCR) was performed using the protein kinase primer pair PK-A and PK-B exactly as described [5]. PCR products (lane 1) were separated by electrophoresis on a 1.5% NuSieve/1.0% SeaPlaque agarose gel and visualized by staining with ethidium bromide. Lane 2: φX174 RF DNA/Hae III marker (BRL).
Among our cloned kinase sequences, GmPK1, GmPK2, and GmPK3 contain similar sequences between the primer regions. The sequences are conserved and contain a highly basic region and a cysteine-rich region in the form of tandem repeats of CX2-3P (X for any amino acid residue) (Fig. 2). We assign the term 'BCR (basic and cysteine-rich) region' to this unique sequence. Similar BCR regions are found in five protein kinase sequences of rice [5, 10], one of bean [10], maize [2], and pea [11]. One noticeable feature is that these BCR regions share similarities to sequences necessary for DNA-protein interactions: a cysteine-rich metal-binding motif [4, 14] (also found for protein-protein interactions [1]) and a nuclear localization sequence consisting of basic amino acid residues [3, 13]. Whether the BCR regions in these protein kinases are involved in similar functions needs further investigation. Although similar kinases from most plant species have not yet been isolated, it is likely that they represent a new subfamily of protein kinases that exists across the plant kingdom, if not in other eukaryotes. GmPK5 is identical to the region between subdomains VI and VIII in the catalytic domain of the soybean CDPK, a calcium-dependent protein kinase with a calmodulin-like regulatory domain [8], which may function to regulate the activity and structure of the plant cytoskeleton [12]. We have obtained the full-length sequence of GmPK6, which allows us to characterize this protein kinase sequence further (data not shown). GmPK6 shows absolute conservation of the eleven subdomains of the catalytic domain known to be involved in kinase activity, strongly suggesting that GmPK6 is indeed a protein kinase ([6]; data not shown).
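The DFD/DFG tripeptide and the CX2-3P cysteine repeats described above can be located in a deduced sequence with simple pattern matching. The sketch below is illustrative only: the helper `find_motifs` is a hypothetical name, and the test fragment is a hand transcription of part of the GmPK1 sequence from Fig. 2 (downstream of primer PK-A), so it may contain transcription errors and should not be treated as reference data.

```python
import re

# Part of GmPK1 downstream of primer PK-A, hand-transcribed from Fig. 2.
# Assumption: the transcription is accurate; treat it as illustrative only.
GMPK1_FRAGMENT = "IMLSDFDLSLRCAVSPTLVKTSSTDSEPLRKNSAYCVQPAC"

# Conserved kinase tripeptide: DFD or DFG.
KINASE_TRIPEPTIDE = re.compile(r"DF[DG]")
# Cysteine repeat unit CX2-3P: C, then 2 or 3 residues of any kind, then P.
CYS_REPEAT = re.compile(r"C.{2,3}P")


def find_motifs(pattern, sequence):
    """Return (1-based start position, matched text) for each match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    print(find_motifs(KINASE_TRIPEPTIDE, GMPK1_FRAGMENT))
    print(find_motifs(CYS_REPEAT, GMPK1_FRAGMENT))
```

On this fragment the scan reports one DFD tripeptide and two CX2-3P units (CAVSP and CVQP). The same patterns could be run over any candidate kinase sequence, although a match by itself says nothing about function.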
In summary, we have isolated six members of the protein kinase gene family in soybean by using PCR; five of these represent novel members of the family. It should also be noted, however, that the cDNA clones we have isolated by PCR may not represent the full complement of the protein kinase family. The isolation of cDNAs relies heavily on the use of primers from subdomains VI and VIII and on the preparation of cDNA from 7-day-old light-grown seedlings. It thus biases our clones towards homology with these regions in the protein kinases encoded by expressed mRNA. Therefore, estimates of gene number should be considered minimal. As the multiplicity and complexity of protein kinase genes also exist in other plants such as rice [5] and pea [11], we anticipate that further characterization of cDNA clones using different primers will identify additional members of the protein kinase multigene family in plants and that plant cells will exhibit significant diversity at the level of expression of these proteins and at the level of regulation of protein kinase activity. Thus, the molecular identification of the genes likely to encode a large family of protein kinases in plants and further research aimed at studying the biological functions of the protein kinases should provide initial insight into the underlying mechanisms of signal transduction or metabolic regulation in plant cells.
[Fig. 2 alignment: deduced amino acid sequences of GmPK1 to GmPK6 aligned between the primer regions PK-A and PK-B, with a consensus line and subdomains VI, VII and VIII marked below; residue numbering ends at 124, 119, 118, 101, 48 and 45 for GmPK1 to GmPK6, respectively.]
Fig. 2. Alignment of deduced amino acid sequences of soybean protein kinases. Amplified DNA was ligated into the M13mp10 vector and used to transfect Escherichia coli strain DH5αF'. Individual clones (GmPK1 to 6) were sequenced using the Sequenase kit (United States Biochemicals). The deduced amino acid sequences of GmPK1 to 6 were aligned to maximize homology. A consensus sequence containing the same residues at corresponding positions in at least 4 of the 6 sequences is shown at the bottom of the aligned sequences. Subdomains, as designated [6], are shown below the aligned kinase sequences. The arrow shows the position corresponding to the putative autophosphorylation site of cAMP-dependent protein kinase [6]. Asterisks indicate conserved cysteine residues and the top double line indicates the basic amino acid-rich region appearing in the first three sequences. The sequences of the primers (PK-A and PK-B) are shown but not included in the consensus sequences.
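The consensus rule stated in the Fig. 2 legend (the same residue in at least 4 of the 6 sequences) can be expressed as a short column-wise count. The sketch below is a hedged illustration: the function name `consensus` is hypothetical, and the aligned fragments (the residues immediately following primer PK-A) were reassembled by hand from the printed alignment, so both the residues and the column placement, especially for GmPK5 and GmPK6, are assumptions. Gaps are ignored when counting, and positions without a qualifying residue are written as a dot.

```python
from collections import Counter

# Aligned fragments just downstream of primer PK-A, reassembled by hand
# from Fig. 2. Assumption: residues and gap placement are as transcribed.
ALIGNED = {
    "GmPK1": "L---VRDDGHIMLSDFD",
    "GmPK2": "L---VRDDGHIMLSDFD",
    "GmPK3": "L---VRDDGHIMLSDFD",
    "GmPK4": "L---VRSDGHIMLSDFD",
    "GmPK5": "LFDTIDEDAKLKATDFG",
    "GmPK6": "L---INEDNHLKIADFG",
}


def consensus(sequences, min_count=4):
    """Column-wise consensus of equal-length aligned sequences.

    A residue is reported when it occurs in at least `min_count`
    sequences at that column; otherwise '.' is written. Gap
    characters ('-') are never counted as residues.
    """
    result = []
    for column in zip(*sequences):
        counts = Counter(residue for residue in column if residue != "-")
        if counts:
            residue, n = counts.most_common(1)[0]
            result.append(residue if n >= min_count else ".")
        else:
            result.append(".")
    return "".join(result)


if __name__ == "__main__":
    print(consensus(list(ALIGNED.values())))
```

With these fragments the function returns `L...VR.DGHIMLSDFD`, which agrees with the start of the consensus line printed in Fig. 2.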
Acknowledgements We would like to thank John Watson and Steve Wolniak for stimulating discussion.
References
1. Berg JM: Zinc fingers and other metal-binding domains. J Biol Chem 265: 6513-6516 (1990).
2. Biermann B, Johnson EM, Feldman LJ: Characterization and distribution of a maize cDNA encoding a peptide similar to the catalytic region of second messenger dependent protein kinases. Plant Physiol 94: 1609-1615 (1990).
3. Chelsky D, Ralph R, Jonak G: Sequence requirements for synthetic peptide-mediated translocation to the nucleus. Mol Cell Biol 9: 2487-2492 (1989).
4. Evans RM, Hollenberg SM: Zinc fingers: Gilt by association. Cell 52: 1-3 (1988).
5. Feng XH, Kung SD: Diversity of the protein kinase gene family in rice. FEBS Lett 282: 98-102 (1991).
6. Hanks SK, Quinn AM, Hunter T: The protein kinase family: Conserved features and deduced phylogeny of the catalytic domains. Science 241: 42-52 (1989).
7. Haribabu B, Dottin RP: Identification of a protein kinase multigene family of Dictyostelium discoideum: molecular cloning and expression of a cDNA encoding a developmentally regulated protein kinase. Proc Natl Acad Sci USA 88: 1115-1119 (1991).
8. Harper JF, Sussman MR, Schaller GE, Putnam-Evans C, Charbonneau H, Harmon AC: A calcium-dependent protein kinase with a regulatory domain similar to calmodulin. Science 252: 951-954 (1991).
9. Hunter T: A thousand and one protein kinases. Cell 50: 823-829 (1987).
10. Lawton MA, Yamamoto RT, Hanks SK, Lamb CJ: Molecular cloning of plant transcripts encoding protein kinase homologs. Proc Natl Acad Sci USA 86: 3140-3144 (1989).
11. Lin X, Feng XH, Watson JC: Differential accumulation of transcripts encoding protein kinase homologs in greening pea seedlings. Proc Natl Acad Sci USA 88: 6951-6955 (1991).
12. Putnam-Evans CL, Harmon AC, Palevitz BA, Fechheimer M, Cormier MJ: Calcium-dependent protein kinase is localized with F-actin in plant cells. Cell Motil Cytoskel 12: 12-22 (1989).
13. Silver PA: How proteins enter the nucleus. Cell 64: 489-497 (1991).
14. Vallee BL, Coleman JE, Auld DS: Zinc fingers, zinc clusters, and zinc twists in DNA-binding protein domains. Proc Natl Acad Sci USA 88: 999-1003 (1991).