BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS
Vol. 178, No. 3, 1991, August 15, 1991, Pages 1072-1077

PARTIAL STRUCTURE OF THE HUMAN H-PROTEIN GENE

HIROHISA KOYATA AND KOICHI HIRAGA

THE DEPARTMENT OF BIOCHEMISTRY, TOYAMA MEDICAL AND PHARMACEUTICAL UNIVERSITY SCHOOL OF MEDICINE, TOYAMA 930-01, JAPAN

Received June 18, 1991

SUMMARY: In a span of approximately 13 kb, two genomic fragments showing an obvious overlap encoded the 1,192 base pair (bp) cDNA sequence for human H-protein. Because of the close similarity in the genomic organization comprised of five exons to that for the chicken H-protein gene, this region was assigned to the true H-protein gene in human. Primer extension analysis suggested and S1 protection analysis confirmed that at least one additional exon for the 14 bases-long 5' untranslated region in H-protein mRNA exists as the presumable first exon. None of the 5.0- and 5.5-kb SacI and the 5.2-kb EcoRI fragments which were undetectable in the genomes of patients with nonketotic hyperglycinemia was included in the true H-protein gene. © 1991 Academic Press, Inc.

Human hydrogen of the

glycine

of the

intermediary

the

stimulatory

this

enzyme

protein

carrier

cleavage

system

product protein

system

as well

and is

as those

nonketotic

hyperglycinemia hyperglycinemia,

(5),

two -EcoRI has demonstrated

the from

the

more

human H-protein

gene.

This

this paper

fact

to the aberrant

that

the

H-protein

by the

issue,

loci

for

the

we attempted

reports gene


found

includes in the

none patient

and as of

of H-

results

in

of patients

with

24L and 24s by us

undetectable

5.5-

copies prevented

aberrations

to characterize

on the organization

fragments

activity

multiple genome

above

carrier

component

cDNA cloned

However, human haploid

the

hyperglycinemia.

0006-291X/91 $1.50 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

genomes both

(2)

another

(5).

in

this

is

and T-protein

identified

fragments

components

as the

defective

using

longest

four

and synthesis that

In the

of genomic

into

functions

(3,4),

analysis

existing

verification insight

of its cleavage

structures

EcoRI

one of the

decarboxylase

of the

aberrant

is

decarboxylase

Southern fragments

gene and on the

corresponding

glycine by itself

of glycine

cDNA sequence

accurate

To gain protein

inactive

Sac1 or 5.2-kb

H-protein

the

glycine

Because

in man (1,5,6).

the

and 5.0-kb

(H-protein)

(1).

in

for

nonketotic probes,

protein

(5). the

of the

true

of fragments with

of us

nonketotic

H-

EXPERIMENTAL PROCEDURES

Isolation of Genomic Clones

The two H-protein cDNA fragments, the 5' 800-bp 24L and the 3' 300-bp 24S, which are produced at the internal EcoRI site (5), were nick-translated and used as probes either simultaneously or separately. Aliquots of a human genomic library constructed by using the λDASH vector (Stratagene Cloning Systems) (7,8) were screened through a couple of rounds of selection for clones containing the human H-protein cDNA sequence, by the method described previously (5). This library contained a strongly amplified genomic clone that has a processed H-protein gene. To avoid selecting this clone, and to obtain a clone encoding the 5' region of the cDNA, a fragment produced at the 5' EcoRI and *I sites of the 24L cDNA (see Fig. 1 of Ref. 5) was also employed as a probe for the screening. Isolated genomic fragments were subcloned by using Bluescript plasmid vectors (Stratagene Cloning Systems).

DNA Sequencing

Nucleotide sequences were determined on the subcloned plasmids by the method of Sanger et al. (9) using 7-deaza-dGTP instead of dGTP (10). Promoter sequences for T7 and T3 RNA polymerases on the vector DNA were used as priming sites. Oligodeoxynucleotides (17-mers) identical or complementary to segments of the human H-protein cDNA sequence were synthesized by using a DNA synthesizer, model 381A (Applied Biosystems Inc. Japan, Tokyo), and were also employed as primers.

Primer Extension Analysis

An oligonucleotide complementary to nucleotides 4-20 of the cDNA sequence was synthesized, 5' end-labeled, and annealed with human liver total RNA. The H-protein mRNA sequence was reverse-transcribed by using reverse transcriptase from Moloney murine leukemia virus (Bethesda Research Laboratories) essentially according to the method of Calzone et al. (11).

S1 Protection Analysis

A genomic subclone from λHHG102, pHHG102EcoRI/PvuII (see Figs. 1 and 2), which codes for nucleotides 1-188 of the cDNA with a preceding 500-bp genomic sequence, was linearized at the EcoRI site (5' end of this insert). Part of this genomic fragment was replicated by using E. coli DNA polymerase I Klenow fragment and a 5'-end-labeled synthetic DNA primer complementary to nucleotides 90-106 of the cDNA sequence. DNAs were denatured with 0.2 N NaOH and then subjected to 3.5% polyacrylamide gel electrophoresis in the absence of urea. The single-stranded radioactive product (about 600 bases) was recovered by diffusion into 1 ml of 10 mM Tris-HCl buffer (pH 7.5) containing 0.1 mM EDTA. The phenol-extracted DNA probe was annealed with human liver total RNA and treated with 5 units/ml of S1 nuclease. Sizes of DNA fragments were determined by electrophoresis on a polyacrylamide gel containing 7 M urea, followed by autoradiography.
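The primer and probe coordinates given in these procedures can be checked with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the sequence is an invented placeholder, not the H-protein cDNA. It shows how a 17-mer complementary to cDNA nucleotides 4-20 would be derived, and why a primer ending at nucleotide 106 extended across a preceding genomic segment of roughly 500 bp gives a probe of about 600 bases.

```python
# Illustrative sketch; names and the sequence are assumptions, not from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_primer(cdna: str, start: int, end: int) -> str:
    """Primer complementary to cDNA nucleotides start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(cdna)):
        raise ValueError("coordinates fall outside the sequence")
    return reverse_complement(cdna[start - 1:end])


def probe_length(upstream_genomic_bp: int, primer_5prime_position: int) -> int:
    """Length of a strand extended from a primer whose 5' end pairs with the
    given cDNA position, through the cDNA 5' end and the upstream genomic DNA."""
    return upstream_genomic_bp + primer_5prime_position


# Invented 27-nt placeholder standing in for the start of a cDNA.
placeholder_cdna = "ATGGCGCTGCGAGTGGTGCGGAGCGTG"
primer = antisense_primer(placeholder_cdna, 4, 20)
print(len(primer), primer)

# Approximate upstream length (about 500 bp) plus primer position 106.
print(probe_length(500, 106))
```

With the approximate 500-bp upstream length, the computed probe size is consistent with the single-stranded product of about 600 bases described for the S1 probe.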

RESULTS

Organization

of

cDNA, we have partially

obtained

to the

&HG9

appeared kb in

hybridized

to

with

size)

hybridizes

to both alone

involve

(5). by that

clones

5 '-end

the

segments

This

to 24L.

differ

the

tiHG102

6.4-kb

and 1.3-kb

fragments

6.4-

and 1.3-kb

fragments,

Although

the

7.8-kb

analysis


fragment in

the

sites, genomic

two

fragment is

are

and

4 kb (Fig.

fragments

(7.8

was also

a polymorphic produced.

and 24s to the seems to have previous

insert

These

of approximately

1.3-kb there

by their

and ApaI

cDNA.

AHHG9 by SacI,

of H-protein

which

__ EcoRI

of the

overlap

from

human genome,

Southern

segments

Among them,

an obvious

In the the

between

formed

different

of genomic maps.

downstream

hybridized

245.

consecutive

underestimated

types

fragments

at which fragment

several

to more

Using

Gene

restriction

Among several

and 1.3

H-Protein

cDNA fragment

insert

fragments 1).

True

characterized

hybridized the

The

study,

-Sac1 site The 24L probe 1.3-kb been the SacI

site


[Figure 1: panel A, restriction map with kb scale; panel B, exon distribution (exons A to E)]

Fig. 1. Organization of the human H-protein gene. A. Recognition sites for restriction endonucleases in the genomic region assigned to the true H-protein gene. The recognition sites for EcoRI, E; HindIII, H; *I, S; SacI, Sa; SmaI, Sm; and =I, X; are shown for the λHHG102 (upper row) and λHHG9 (lower row) inserts. Filled circle denotes the polymorphic SacI site suggested in a previous paper (5). B. The distribution of exons in the genomic region between the EcoRI and &I sites shown with asterisks in A. Exons illustrated with closed boxes are tentatively designated A to E, because the 14-bp unknown untranslated region is predicted for H-protein mRNA (cf. Fig. 2). Pv and Hc indicate the recognition sites for PvuII and HincII, respectively. The regions upstream and downstream from the EcoRI site in exon E are separately hybridized with 24L and 24S, respectively.

Fig. 2. The sizes of both the primer extended and the S1 protected products. By using two 17-mer primers, each complementary to nucleotides 4 to 20 of the cDNA (lane 1) and -23 to -7 of the genomic sequence (cf. Fig. 3) (lane 2), and liver total RNA (5), H-protein mRNA was examined for the unknown 5' region. The single-stranded DNA illustrated under the autoradiogram was prepared as described in EXPERIMENTAL PROCEDURES. Products from reactions for 7 (lane 5) and 15 min (lane 6) with S1 nuclease (5 units/ml) were determined for their sizes together with that of the probe after annealing with the RNA (zero time control, lane 4). Lane 3 is for the end-labeled WI-treated pBR322 DNA.

between

the

assigned

7.8-

to the

hHHG102

a 12-kb

insert,

even when

However, hybridization

As would

inserts

exhibited

whereas

the

reported

base

not

a genomic

remaining

sites cDNA are of the

separate

yielded from

tested

boundaries, 3). in the replaced 21st

each

insert

facts,

both

comprised carried

cDNA sequence

(Fig.

in the

segments

in

these

organization

clones

substitutions

154T of the substitution

were

exon/intron

and acceptor

by using

be expected

observed

present

study

was

(Fig. 1). In the -SacI-treated -Sac1 site was detected only by a 5' region of the

fragment

human H-protein

At all

fragments

probed

signals

shown).

four

and 1.3-kb

above polymorphic

the

(not

Compared

with

with

Leu in the


&IHGlOZ

of five

exons

sequences

clones

(not

and AHHG9 (Fig. similar

1 B), to

the

shown). the

is

conserved

for

cDNA sequence,

elucidated.

T and C, respectively, deduced

cDNA, multiple

of remaining the

processed

consensus

exon sequence

of the

cDNA.

mitochondrial

splice there

Nucleotides resulting presequence

donor are

only

69C and in the to Ser


in the acid


genomic

sequence.

change.

Unknown

a 96-bases

was reverse-transcribed

product

of

to nucleotides

mRNA.

the Fig.

with

size

unknown

total

than

the

In this

shorter the

context,

than

consensus

for

(Fig.

3, -2 and -1,

bases

in

size)

splice

and -46

are

further

-23

in the

sequence,

(Fig.

ply(A) this

2, lane

(96 bases)

Considering

from

normal tail

upstream

2).

This 28-bases bases

H-protein

long

5'

mRNA (1.4

to -1,

the

100-200

in the

400-bases

H-protein

genomic

segment

exon was assigned

which

value

is

in (5))

to the

far

downstream

long

poly(A)

addition in

A

z

A

231 E

:i::

311 x

Exon B TTTrGcAcAG&attggattatatt----2:; :: 8

375 Hu Ch

Exon c AAcAAAcAAGgtgagtgttcttagg----95N K Q D 86N K D D

-Y+

::

taagcgcggcgggc-----

Exon D 507 TATGAAGI4 taagctgttgctag----Hu 139Y E D l-s Ch 130Y Q D G 1121

-GCAAAATMT-TA

Exon/intron Fig. 3. and a-case letters both 5' and 3' regions. corresponding to the is referred with minus end of exon E is the consensus for splice

5.3

2.3

2.5

1.5

kb

kb

kb

kb

probe

fragment l-106

suitable the

for

genomic

the

longer

the

a predictable

of the

the

-----tcttgttttattta

-----tttttttccaCttxA2AG;?G

-----tcttttgttcggc~~~T

exceeds

5' end of the

5' end of the

CTCCGGCCGCGAAC

-----ggtcttctgtttt~~Tf.A~

-33

39-bp

mRNA having

cDNA sequence

A

Bxon

B

R

K

20 250 F

BXcm c GAA;CGf2 E A I, Rmn

D

E

F

Exon

E

w

I,

cDNA

330 G 334 G 526 I

Bxon E ~aatcatgtttgttttgatgttaatatttcatttagta

Upperboundary sequences of the human H-protein gene. and intron and flanking sequences at indicate exon, Numbering of nucleotides begins at the position toward the downstream, and 5' end of the cDNA for exons The 3' to the 5' upstream from the 5' end of exon A. The site at which !&y(A) region is joined in the mRNA. sites and an AG dinucleotide (-46, -45) are underlined.


DNA

mRNA sequence.

in mRNA, and the

of the

and 152

of about

size

of the possible

far,

the

two AG

a size

than

of

sequence

a synthetic

to 1,192-bp

2 bp upstream

contained

a primer,

ttaagctctgtcccgcccccgcggcaccgcctccgcgcctccatccaaccGGcTcc Bxon

the

a cDNA with

Thus

HWhen

(108

for

(5),

RNA, in

two fragments

EXO" -48

in

in

region

size.

poly(A)+

probe

are between

produced

cDNA sequence

kilobases

present

When used

dinucleotides

A

exon A as nucleotides

Therefore,

examined.

1).

RNA,

analysis.

two AG dinucleotides are

total primer

By the incubation fragments (lanes 5 and

included

in

amino

exists

Sl protection

sequence

and -45).

to nucleotides

400 bases

from

several

site

complementary

product

human liver of 76 bases

produced

Moreover,

presumable

2, lane

(Fig.

from

region

consecutively

2).

without

sequence.

RNA, the single stranded-DNA 200 bases (lane 4).

nucleotide

exists

(Fig.

the

C, but

approximately the

106 bases

cDNA sequence

formed


From human liver by using the 17-mer

cDNA

result

7 and 15 min at 37OC, Sl nuclease

6).

genomic

of H-protein

shows

human liver

C in the

5' untranslated

2 also

shorter

with

mRNA

was also


replaced

H-Protein

4-20

an identical

no fragment for

cDNA is

long

that

annealed

the

A is

Region

with

protein

G in


Untranslated

complementary suggesting


The 428th

The 771th

5'

product


most


sequence


reported

previously


suggesting

(5),


the

occurrence

exon for the 74-bases long exon was tentatively The most 5'

region 5' untranslated named A, and subsequent

Structural

Homology

of

and

assignment

of the

currently

first

gene, the above chicken H-protein located

genomic 49th

chicken

gene,

formed

boundaries

sizes

the

is

of introns

protein

genes,

the

genomic

the

true

H-protein

in both

are quite region

genes.

different

analyzed

gene

in

the

in the

H-protein

are

Moreover,

From this

H-

(5,12).

unknown

structures.

the

chicken

residues still

human and chicken

primary

for are

Ser for

acid

A which

valid

H-protein

reported H-protein

40th

of 125 amino

in their

positions

the for

the

5' end of exon

true

A and B, C and D, and D and E produced

exons

nucleotide

although codes

comprised for

that

mature

and at the

are

of exons

to the

with

of the

presumable

For the

Genes

region

was compared

human H-protein

positions

between

at comparable

genomic

of the

in H-protein mRNA. exons B to E.

H-Protein

Amino-termini

3, except

at identical

boundaries

(12).

proteins

in Fig.

Chicken

characterized

Ser for

and both

As shown

Human

organization

gene

at the

protein,

The


the

split

codons

conservativeness,

in the

chicken

present

study

and human Hdoubtlessly

in human.

DISCUSSION

In a span of about entire

H-protein

of five is

Only

exons.

closely

similar

Remaining

cDNA (Koyata

the

as basal

component transcription

that

is

appeared

to contribute activity

that

because this

is is

liver,

specified

of the

gene

chicken

glycine

respects chicken

(1,5,6,8). In the and human H-proteins

similarity could

further be regulated

cleavage

supports

system

occurs

in

the

glycine

locus

at present

by conserved

idea

that

The chicken

mechanisms

cloned. the

Both to resemble

H-protein

glycine

H-protein However,

organization

of

the human and in many

of the conserved. gene

the gene

has

of the

(12).

reported

the

coordination

has been

present study, the organization has been shown to be highly the

exhibiting

decarboxylase

extents

(13).

of cDNAs now available,

have been

tissues

This

tissue-specific and brain

region

manners

The tissue-specific

with

genomic

length 5'

among multiple

by two different

tissue-specific.

by a single at the

to H-protein

AHHGlO2 and XHHG9 inserts

expressed

transcription

kidney,

insufficient

unknown

is

and coordinates to determining

in

in the

that

by us (12).

similar

Therefore,

transcription.

exclusively

structure

reported

the

comprised

gene.

gene

gene

activity

cleavage gene

H-protein

exon

gene

sequences

included

H-protein

H-protein

cleavage

H-protein

encoded

structure

in the

observation).

region

true

the

substituted

chicken

clones

exon/intron

processed

and tissue-specific

of the

glycine

were

unpublished

to the

human genomic

an organized

of the

genomic

chicken,

in

included

and Hiraga,

can be assigned such

two of several

bases

isolated

copies,

In the

four

to that

clones

genomic

13 kb,

cDNA sequence

genes This

transcription

in human and chicken.

In this

for


context,

Sakakibara

decarboxylase exon

(7).

their

5'

Further flanking

and 1.3

from

both

the

probes

in the

that

(56 bp)

for

the

of the

the

human glycine

as the presumable

unknown

glycine

structure

first

of the

decarboxylase

gene

true

and

structures

Implication remains

which are 12, -Sac1 fragments, with either or both of 24L and 24.9,

gene.

In patients

Sac1 fragment

was undetectable.

the

not

however,

genomic

is

aberrations

existing of the

from

with

nonketotic

24s or the

5.5-kb

The 5.2-kb

of patients

fragments,

display

three

hybridized

5.0-kb

genomes

that

hyperglycinemia

required

H-protein

gene, suggesting abnormal

sequence

as that

and are

true

None of these

sequence.

is


indicated

RESULTS section,

either

24L was absent

(5,6).

the

the

hyperglycinemia, fragment

previously

a short

as well

kb in size in


sequence. in

involved

have

analysis

gene

As detailed 7.8,

--et al. includes

gene

human H-protein

are


probably

aberrations

Sac1 from

-EcoRI fragment having the 5.5-kb -Sac1 fragment included in the true H-protein observed

in the

near

presumable

in

the

pathogenesis

previous

study

processed

of nonketotic

to be examined.

Acknowledgment: The authors are grateful to Dr. Eiji Tsukamoto for his kind encouragement and continuous support of our work.

REFERENCES

1. Hiraga, K., Kochi, H., Hayasaka, K., Kikuchi, G., and Nyhan, W.L. (1981) J. Clin. Invest. 68, 525-534
2. Motokawa, Y., and Kikuchi, G. (1974) Arch. Biochem. Biophys. 164, 624-633
3. Hiraga, K., and Kikuchi, G. (1980) J. Biol. Chem. 255, 11664-11670
4. Hiraga, K., and Kikuchi, G. (1980) J. Biol. Chem. 255, 11671-11676
5. Koyata, H., and Hiraga, K. (1991) Am. J. Hum. Genet. 48, 351-361
6. Hiraga, K., Koyata, H., Sakakibara, T., Ishiguro, Y., and Matsui, C. (1991) Mol. Biol. Med. in press
7. Sakakibara, T., Koyata, H., Ishiguro, Y., Kure, S., Kume, A., Tada, K., and Hiraga, K. (1990) Biochem. Biophys. Res. Commun. 173, 801-806
8. Kume, A., Koyata, H., Sakakibara, T., Ishiguro, Y., Kure, S., and Hiraga, K. (1991) J. Biol. Chem. 266, 3323-3329
9. Sanger, F., Nicklen, S., and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467
10. Mizusawa, S., Nishimura, S., and Seela, F. (1986) Nucleic Acids Res. 14, 1319-1324
11. Calzone, F.J., Britten, R.J., and Davidson, E.H. (1987) Methods Enzymol. 152, 611-632
12. Yamamoto, M., Koyata, H., Matsui, C., and Hiraga, K. (1991) J. Biol. Chem. 266, 3317-3322
13. Kure, S., Koyata, H., Kume, A., Ishiguro, Y., and Hiraga, K. (1991) J. Biol. Chem. 266, 3330-3334
