Nucleic Acids Research, Vol. 18, No. 18 5559
©-:) 1990 Oxford University Press
Nucleotide sequence of the cellobiohydrolase Trichoderma viride
from
gene
Cheng Cheng, Norihiro Tsukagoshi and Shigezo Udaka* Faculty of Agriculture, Nagoya University, Chikusa, Nagoya 464, Japan Submitted August 9, 1990
EMBL accession
The filamentous fungus, Trichoderma viride, secretes large amounts of several cellulolytic enzymes (1). We report here the nucleotide sequence of the gene coding for exo-cellobiohydrolase (CBH), which accounts for more than 65 % of the total cellulase content. This gene was isolated from a T. viride genomic library using a 30mer oligonucleotide based on the sequence of the reported CBHI gene of T. reesei (2). The amino acid sequence deduced from the nucleotide sequence is highly homologous (95%) to that of the T. reesei CBHI gene. In contrast to the high homology between the enzyme coding regions of the two CBH genes, 67% and 30% homology was found between the 5'-non
GTAGCATTGAGGAACCACTGCTTTTGCCAGGAGCTAGCGTATGCTGTAGGCAAAGCTCTA
AGATAGCCTCATTGAGCGGAAGTCGGCGAACAGTGAAGACCAGAATATCACAT
180 ATATATA 240
TGGCCCAAACGCCGTGTCCCCTTCTCCCTTTCCCCATCTACTCATCAACTCAGATCCTCC
AGAAGACTTCTACATCATCTTTTCGCCCATACCATTCTAGTCGACTACGGACTGCGCATC ATGTATCAAAAGTTGGCCCTCATCTCGGCCTTCTTGGCTACTGCTCGTGCTCAGTCGGCC -17 4
24
44 64 84
104 124 138 142 162
182 202 222
300
REFERENCES 1.
242
262
360 282
TcCACCCTCCAGcGCGAAACTCACCCGCCTCTGACATGGCAGAAATGCTCATCG(iC S S G K C W P L
480
302
ACTTGCACCCAACAGACAGCCTCCCTGGTCATFCGACUCGAACTCGCTGGACTCACGCC H A W T W R A N I D T C T Q Q T G S ACCAACAGCAGCACCAACTGCTACCACGGCAATACTTGCAGCTCAACCCTCTGCCCTCAC D
540
322
600
342
AATGAGACTTGCGCCAAGAACTGCTCCTTCGACGGTGCTGCCTACGCCGTCCACGTACGCA D L C C G A A Y A S T Y G E T C A K GTCACCACGAGCGCTGACAGCCTCTCCATTGCCTTCGTCACTCAGTCTGCGCAAAAGAAC I S S L D G F V T Q S A Q K N A V T T S GTCGGCGCTCGTCTCTACTTGATCGCGAGTCACACGACCTATCAAGAATTCACCCTGCTT
660
362
720
370
M
Y
r
T
T
Q
S
N
L
A
0
A
E T
S
T
N
C
S
I
F
V
V
D
G
N
T
T
A
R
A
QSA
Q
T
P
Y
A
W
S
C
S
T
L
C
G
A
R
L
Y
L
M
A
S
D
T
T
Y
Q
E
F
T
L
E
N
F
S
F
D
V
D
V
S
Q
L
840 900
421
960
441
C
H
1020 461
K
1080 481
V
G
G
I
G
H
G
S
P
H
C
C
S
E
M
D
V
G
I
W
E
A
N
S
TCTGAGGCTCTTACTCCTCATCCTTGCACGACCGTCGGGCAGCAAATTTGCGACGGTGAC D S
E
A
L
T
P
C
T
T
Q
E
I
C
E
G
1140
I
1200
et
on
Enzymc Hydrolysis of Cellulose
81-109.
pp.
al. (1983) BiolTechnology 1, 691-696.
TCTTGCGGTGGAACCTACTCGGGTGACCGATATGGCGGTACTTGCGACCCTGATGGCTGC G C
1260
GATTGGAACCCATATCCCTTGGGCAACACCAGCTTCTATGGCCCCGGCTCCAGCTTCACG P G S S F T N T S F Y D W N P Y R L CTTGACACCACCAAGAACTTCACCCTCGTCACTCAGTTCGAGACTTCGGGTGCCATCAAC T S F E G A I N T Q K L T V T K D T L CCATACTATGTCCAGAATGCCGTCACTTTCCAGCACCCCAACCCCGACCTCCCTCATTAC Y
1320
S
G
C
T
Y
S
R
D
C
Y
G
C
T
D
C
D
P
G
C
C
1380
V
G V T F Q Q P N A E L C D TCTGGCAACTCGCTCCACCATCACTACTGCGCGGCTGAACAGCCCGAGTTTGGCCCCTCC R
Y
Y
V
Q
N
S
C
N
S
L
D
D
D
C
Y
A
A
E
E
E
A
F
C
C
K
C
1500 1560
V
C
C
1440
S
C
401
C
V
al., eds),
CTCCGCTCTTCCTGCTCACCTCGAGTCCAACTCTCCCAACGCCAACGTCGTATACTCCAA G S
P
gATGTGTTGAAC cttgatgccattctcgtattgatttctcgtgactagcttatttaa N L C GGACCTCTTTACTTCGTGTCCATCGACGCGGATCGTGGCCTGACGAAGTATCCCACCAAC T N Y P V T K G D A D S L Y F G A ACTGCCCGTGCCAAGTACGGCACGGCGCTACTCTGACAGCCAGTGCCCTCCTGATCTCAAG T A G A K Y G T G Y C D S Q C P R D L TTCATCAACGGCCAGGCCAATGTTGAGGGCTGGGAGCCGTCCTCTAACAATGCAAACACG E G W E P S S N N A N T F I N G Q A N GGCATTGGCGGACATGGAAGCTGCTGCTCTCAGATGGATATCTGGGAGCCCAATTCCATC
al., (1975) in Symposium
et
381
L
GGCAACGAGTTCTCTTTCGATGATGTTTCGCAGCTGCCgtaagtgaccaactacact G
780
et
(Bailey,M.
TCTTTCTCGCACAAGGGCCCCCCTTACTCAATTCAACAAGGCTACTTCCGGTCCCATCGTC M T S K A Q F L T D K F S S CTGCTCATGACCCTG TGGGATGACgtgagtttcaagaat taacattcacattgtcaacng V M S L W D L aatgactagactgatggagagacgatagTACTACGCCAACATGCTGTGCCTGGACTCTAC N Y A Y M L W L D S CTACCCGACGCACGAGACCTCCTCCACCCCCCGTCCCTCCCTGCAACCTCCTCCACCAG S T G S S
P
N
N
V
L
K
L
Mandels,M.
2. Shoemaker,S.
420
A
X53931
coding and 3'-non coding regions, respectively. Southern-blot and Northern-blot analysis showed the presence of at least one CBH gene on the T. viride chromosome which was transcribed only at the culture condition that cellulase production was induced.
120
GGTGCCACTGCATTTCTGTCGAACATAATGTGATGCTTCGGCAGGCATAATAGCCGCCAA
no.
1620
D
1680
T
Y
P
T
V
S
D
P
E
A
T
Q
S
L
S
E
T S
P
N
C
S
A
P
V N
S
R
A
K
V
V
C
C
C
1860
C
1920
P
C
TACTCACACGCACTATGGCCAGTCCGGTGGAATTGGGTACATCGGGCCCACCGTCTGCGC C G G I G Y I G P T V C Q H Y Q T T GACTCCCACCACTTGCCAGGTCCTCAACCCCTACTACTCTCAGTGCTTCTAAGGTACTCT Q C Y S P Y L N L *1* Q S T C S GGCAACTCCACATACCTAGCTCGAAAGCTTGAGCATCTGCTGGCTTATGGATGAGTTCAT CTCATTATGCACTAGATGGACCATTTACTTTGCTGTATCTACTTCTCACGCTTCCAATAT
1980
A
C.
C
1800
N
Y
CATCAACTTCCCCCCCATCCCCACCACCGGCAACCCTAGCGGCGGAAACCCTCCTGGCGG G T G N P P S P I I F G N P S AAACCCTCCCCCCACCACAACACCGCGCCCGCCTACCTCCACTCCAAGCTCTCCCGGCCC S S P G T G T S P A P R T T T P N P K
1740
2040
V
ATACGCTTATTTCACCTTTGCTCCAATGCTCCCTACCTTGGCAAGCACGGCTTTCGAGAG ACACGCACTGATTCTCTCCTAACTATGCATTATATAAGACTGAAATAGACCAAAAAACGA AAACCTTCCCCACTCCAATTATCTCACGGTGTTCATTTATATGTAGGCCA
2100
2160 2220
2280
2330
of CBH. The deduced amino acid sequence is shown below the coding region of the gene and numbered, on the left, from the amino-terminus (I and II) are shown in lower-case letters. The putative TATA box sequence is underlined. The amino acid sequence of the amino terminal region of the purified enzyme, which was determined chemically, is double underlined. The stop codon is indicated by asterisks.
Nucleotide
sequence
(Gln) of the
*
mature enzyme. Introns
To whom correspondence should be addressed