Proc. Natl. Acad. Sci. USA Vol. 89, pp. 38-42, January 1992 Medical Sciences

Molecular characterization of NSCL, a gene encoding a helix-loophelix protein expressed in the developing nervous system (hemopoiesis/neurogenesis/transcription factor/genetic linkage)

C. GLENN BEGLEY*t, STAN LIPKOWITZf, VERENA GOBEL§, KATHLEEN A. MAHON¶, VIRGINIA BERTNESSt, ANTHONY R. GREEN*, NICHOLAS M. GOUGH*, AND ILAN R. KIRSCHt *The Walter and Eliza Hall Institute of Medical Research and tDepartment of Diagnostic Haematology, Royal Melbourne Hospital, Melbourne, Victoria 3050, Australia; and tThe Navy Medical Oncology Branch and §The Metabolism Branch, National Cancer Institute, and 1The Laboratory of Mammalian Genes and Development, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20814

Communicated by Donald Metcalf, September 17, 1991 (received for review August 14, 1991)

We report here the molecular cloning and ABSTRACT chromosomal localization of an additional member of the helix-oop-helix (HLH) family of transcription factors, NSCL. The NSCL gene was identified based on its hybridization to the previously described hemopoietic HLH gene, SCL. Murine NSCL cDNA clones were obtained from a day 11.5 mouse embryo cDNA library. The coding region is 399 base pairs and encodes a predicted protein of 14.8 kDa. The nucleotide sequence shows 71% identity and the amino acid sequence shows 61% identity to murine SCL in the HLH domain. The NSCL protein-coding region terminates six amino acids beyond the second amphipathic helix of the HLH domain. Expression of NSCL was detected in RNA from mouse embryos between 9.5 and 14.5 days postcoitus, with maximum levels of expression at 10.5-12 days. Examination of 12- and 13-day mouse embryos by in situ hybridization revealed expression of NSCL in the developing nervous system. The NSCL gene was mapped to murine chromosome 1. The very restricted pattern of NSCL expression suggests an important role for this HLH protein in neurological development.

MATERIALS AND METHODS Southern and Northern Blots. DNA and RNA blot hybridization analyses were performed by standard techniques. Genomic DNA (10 ,ug) was digested with restriction endonucleases, electrophoresed in 0.8% agarose gels, and transferred to nitrocellulose. Filters were prehybridized and hybridized in a hybridization mix containing either 6x or 2x SSC ( x SSC = 0.15 M NaCl/0.015 M sodium citrate, pH 7) (14). Filters were washed in 6x, 2x, or 0.2x SSC containing 0.1% NaDodSO4 as indicated. Northern blots were performed with 2 ,ug of poly(A)+ mRNA or 10 tug of total mRNA size fractionated on 1% formaldehyde agarose gels as described (15). Library Screening. A human genomic SCL probe (1.OXX) (16), encompassing the HLH region, was initially used to screen a murine genomic library at low stringency (washing at 450C in 0.1 x SSC/0. 1% NaDodSO4). An additional murine genomic library (Clontech no. ML 1030j) was subsequently screened using murine NSCL probes (washing at 650C in 0.2x SSC/0.1% NaDodSO4). A murine cDNA library constructed from 11.5-day mouse embryo RNA (Clontech no. ML 1027a) was screened with murine genomic NSCL probes. Plasmid subclones were generated by ligating fragments into pBluescript vector (Stratagene) and were sequenced in both directions by using the dideoxy chain-termination method with Sequenase polymerase (United States Biochemical) and oligonucleotide primers. In Situ Hybridization. Paraffin sections of 12- and 13-day C57BL/6J mouse embryos were examined as described (17). Probes were [35S]UTP-labeled antisense RNA transcripts synthesized from two different NSCL cDNA clones in pBluescript by using either T7 or T3 RNA polymerase (BRL) according to the manufacturer's instructions. One cDNA clone [1.5 kilobases (kb)] extended to the EcoRI site in the 3' untranslated region (UTR) and included the entire NSCL coding region and 5' UTR. A second cDNA clone encompassed 600 base pairs (bp) of 3' UTR sequence beginning at the EcoRI site. Both probes gave identical results. Hybridization of irrelevant sense probes showed no specific signal. Hybridized slides were exposed to Kodak NTB-2 emulsion for 10-30 days.

The helix-loop-helix (HLH) proteins are a family of putative transcription factors, some of which have been shown to play an important role in growth and development of a wide variety of tissues and species. The family includes proteins critical to pigment determination in maize, neurological development in Drosophila, the centromere binding protein of yeast, enhancer binding proteins of lymphocytes, and genes critical to muscle differentiation (1-3). These proteins are able to form homodimers or heterodimers by virtue of their HLH motif (1, 4). Many members have a domain of basic amino acids, immediately upstream of the HLH domain, that determines DNA binding specificity (5). Four members of this family have been clearly implicated in tumorigenesis via their involvement in chromosomal translocations in lymphoid tumors: MYC, LYLI, E2A, and SCL (8). We identified SCL because of its involvement in a 1;14 translocation in a stem cell leukemia (9, 10). Subsequently the involvement of SCL (also known as TAL) in translocation events has been confirmed by others (11, 12), and the frequency of SCL disruption in T-cell acute lymphoblastic leukemia is estimated at 25% (13). In this paper we report the molecular cloning of an additional HLH gene, NSCL 11, which was identified because ofits homology to SCL. This molecule is closely related to SCL in its HLH domain and is expressed in the developing nervous system (thus Neurological SCL).

RESULTS Detection of the SCL-Related Locus NSCL. While screening for murine SCL with a human SCL HLH probe under conditions of reduced hybridization stringency, one particular phage clone was identified that appeared to originate Abbreviations: HLH, helix-loop-helix; UTR, untranslated region; CNS, central nervous system. "The sequence reported in this paper has been deposited in the GenBank data base (accession no. M82874).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. 38

Medical Sciences:

Begley et al. NSCL Ly9 Sap Spnal

2q

18q

1q

FIG. 1. Genetic linkage map of mouse chromosome 1. The location of genes is taken from refs. 19 and 20. Regions of synteny with human chromosomes are indicated below (double lines). The

most likely order is Ly-9-NSCL/Sap-Spna-1 locus on distal chromosome 1.

from a different gene. The region from this clone that hybridized to the SCL HLH domain also demonstrated a unique pattern when used to analyze Southern blots of murine and human genomic DNA (data not shown). Sequence analysis showed a related but distinct HLH region when compared to SCL (V.B. and I.R.K., unpublished observations). Moreover, chromosomal localization confirmed that this gene (NSCL) was localized to chromosome 1, in contrast to SCL, which is localized to chromosome 4 (18). This localization of NSCL was determined by genetic linkage using recombinant inbred mouse strains (data not shown). Comparison of the strain distribution pattern for alleles of the NSCL gene in the AKXL, BXD, SWXL, and BXH panels with that of other murine genes mapped using these recombinant inbred strains suggested that NSCL is located on the distal portion of chromosome 1 (Fig. 1), closely linked to the Sap locus (0 of 51 recombinants) (19, 20). Cloning of a Murine NSCL cDNA. Expression of NSCL was detected in embryonic tissue (see below), and cDNA clones were therefore obtained from a day 11.5 embryo cDNA library screened by using a genomic NSCL HLH probe. A total of 29 clones containing inserts of between 0.6 and 1.5 kb were obtained from screening 1.5 x 106 recombinant plaques from the amplified library. Five overlapping classes ofNSCL cDNA clones were identified and sequenced in both directions. No clones crossed the internal EcoRI restriction endonuclease site in the 3' UTR, and analysis of genomic clones was used to confirm the sequence in this region.

Proc. Natl. Acad. Sci. USA 89 (1992) NSCL

TAKYRTAHATRERIRVEAFNLAFAELRKLLPTLPPDKKLSKIEILRLAICYISYLNHVLDV

SCL KVVRRIFTNSRERWRQQNVNGAFAELRKLIPTHPPDKKLSKNEILRLAMKYINFLAKLLND LYL-1 KVARRVFTNSRERWRQQHVNGAFAELRKLLPTHPPDRKLSKNEVLRLAMKYIGFLVRLLRD TWIST LOTQRVMANVRERQRTQSLNEAFAALRKIIPTLPSDK. LSKIQTLKLAARYIDFLYQVLQS

FIG. 3. Comparison ofthe amino acid sequence ofmurine NSCL, SCL, LYL1 (LYL-1) and twist proteins in the region of the HLH domain and upstream DNA-binding (basic) domain. Underlined residues indicate sites of consensus amino acids defining the HLH and basic domains.

A composite nucleotide sequence for murine NSCL cDNA (2458 bp) and the predicted amino acid sequence are shown in Fig. 2. A long open reading frame begins at the CTG codon at 318 bp. This CTG codon is in a poor environment with respect to the consensus sequence required for translational initiation (21) and is unlikely to represent the translational start site. Within the long open reading frame, there are three methionine residues: at nucleotides 450, 453, and 471. The first and especially the third of these residues are good candidates for the translational initiation site (21). The presence of the stop codon (position 849) was also confirmed by sequence analysis of murine and human genomic NSCL clones. Although there is a potential open reading frame of 333 bp immediately beyond position 849, utilization of this region would require a frame shift. This is unlikely based on confirmatory sequence analysis of murine and human genomic NSCL clones (data not shown) and given the predicted amino acid sequence that defines the NSCL reading frame (see below). The 3' UTR is 1.6 kb with multiple stop codons. The polyadenylylation signal sequence ATTAAA is a wellrecognized natural variant of the canonical motif (22). Twenty-two base pairs beyond this sequence the cDNA diverged from the genomic sequence and differed by the addition of a poly(A) tail, which delineated the 3' extent of the gene. The 3' UTR is relatively A+T-rich. This is particularly evident in the region commencing at 1813 bp, where the motif ATTTA occurs repeatedly. This sequence is associated with instability of mRNA and occurs in the 3' UTR of a number ofgrowth factors and oncogenes (23).

TAAGTAAGAGCATCCCTTTTGCCTAGGCCTTACAAATCCTCATCATGT GTACCCTTCTCATCTGTCCTGGATCCTGTTCAGCCACAAGCTGCCCCTAAGAGCTCTAGAiAACTGCTCACTAACTTTGCAGACAGATGACCCTTGAGCAGCGAGAAGA CCAGAGCTCAGTGGCAGACCCCGGCACTCCAGGCCCTTGCTGCTTTCCTCCTAACCCTCTTTGAAGCTCCTAAAGGGAGGGGTGGGGAGAGGATCCGCATTTCCCTAG Met Met Leu Asn Ser Asp Thr Met Glu Leu Asp Leu Pro Pro Thr His Ser Glu Thr Glu Ser Gly GTGGGGGGTTTTCCACC ATG ATG CTC AAC TCC GAT ACC ATG GAG CTG GAC CTG CCT CCC ACC CAC TCG GAG ACC GAG TCG GGC Phe Ser Asp Cys Gly Gly Gly Pro Gly Pro Asp Gly Ala Gly Ser Gly Asp Pro Gly Val Val Gln Val Arg Ser Ser Glu TTT AGC GAC TGT GGG GGC GGA CCG GGC CCC GAT GGT GCT GGA TCC GGG GAT CCA GGA GTG GTC CAG GTC CGG AGC TCA GAG Leu Gly Glu Ser Gly Arg Lys Asp Leu Gln His Leu Ser Arg Glu Glu Arg Arg Arg Arg Arg Arg Ala Thr Ala Lys Tyr CTT GGA GAG TCC GGC CGC AAA GAC CTG CAG CAC TTG AGT CGT GAG GAG CGC AGG CGC CGG CGC CGC GCC ACG GCC AAG TAC Arg Thr Ala His Ala Thr Arg Glu Arg Ile Arg Val Glu Ala Phe Asn Leu Ala Phe Ala Glu Leu Arg Lys Leu Leu Pro CGC ACG GCA CAC GCC ACG CGG GAG CGC ATC CGC GTG GAA GCC TTC AAC CTA GCC TTC GCC GAG CTG CGC AAG CTG CTG CCC Thr Leu Pro Pro Asp Lys Lys Leu Ser Lys Ile Glu Ile Leu Arg Leu Ala Ile Cys Tyr Ile Ser Tyr Leu Asn His Val ACT CTG CCC CCG GAC AAG AAG CTC TCT AAG ATT GAG ATC CTA CGC CTG GCC ATC TGC TAT ATC TCC TAC CTG AAC CAT GTG Leu Asp Val *** CTG GAC GTC TGA ACTCAGCCCGCATCCCACCCTGGATTTTCCCCATTTCCCTGGGCCTCTCCAGAGCCCCCTGTCACCATACATTACTTAGAATGGCCGGCCC CCTCCCATCCCAGAGGACCAGGCTCACATCGCTAGTCCTGAAGGCGGTTTCTTTTCATTGGCCCAGGAATGTGAAGGATGTTCTTATGGGTTCCTTCAGAGAGTTGTT

ATTATAGGAAAATTCTGCCTTTCTTTTGTCTCTCTCAGTCCTGAGGCTTAGGATATACAGACCTCAGATCTAACTTGGTAGTGAGTGCCTTGCCTTCTTGGAGCTGTC CGGCCGCTCCCGGTCCCCTCCCACCCCTGCCACCCAGAATTCTCCCTTTCCCTGTGCCGGCTAGAAAAAGGAAAACCTTAGTCCTGGGATAAGGATGACACCCCCAAA CAGCCCAGGGCACCTTGCAGAAGGCTCAGGCCCTGGGTGGGGCCGCTCCACAGCAGCTCCTTCCATCCCAGGGGACCCTTGAAAACATGCAAACCCCATCAGCTCCTC CGCCCATGCTGCTGAGCCCCACCCACCTGCCCATTGTAGCCTTGCGACCCAGAGTCTATGGGCCTTAAATTCCCTGTAGAAGAACTCACTGCTTTTCTCTGCTCCATC CATCCTCACACCAAACTCCTAGCTCTACAAGGATATTTATTTATGTATTTATTTATTCACTTATTTATTTATGTATTTATTTATTTATAAATATTACTATTTATTACC GAGTTATGCACTTTGGGGTAGGGTGAGGGGGGCTCCTTGCAGCTTGCTTTAGCTGAGGTCCTCTTGCTCTCTCCCGGGTCACTTCTCTTCTTCTTTCTCTAGTGCAAG TATGTGTGGGGATCCCTTCACCCACCATACTGGTACCCCTTCTTCAGCTCCATCTGTCCCTATCCCCTTTCCTCTGGAATAGTGTCCCTTCTCAGCCTCCCAGCTTCT AGGGGGTCCTTTCTCAATCTCTCCCTGCTGCCCCCGCCCCGACAATCCCCCTCCAGTCCTGTGTACTCCATTCCTCTTACCAGCCCTGCTTTTCCCCAAACCACCAAG AGCAGACCCTGGGATCTGTGTCTGGTGTCCTGTGTGTCTTTGTCTGGTTGCATTCCTAATTTCCTACAAAAAGAAAAATTAAAGTGACCTCGATCTAGTACCAGAAAA AAAA

39

108 216 324 432

515 596 677 758 839 942 1050

1158 1266 1374 1482 1590 1698 1806 1914 2022 2130 2238 2346 2454 2458

FIG. 2. Nucleotide and amino acid sequence for a murine NSCL cDNA. Overlapping clones from a mouse embryo cDNA library were sequenced with oligonucleotide primers, and the nucleotide sequence is shown. A long open reading frame begins at position 318. The first in-frame methionine is at position 450. The open reading frame terminates at position 849. The polyadenylylation signal is underlined.

40

oSiA.r'

Molecular characterization of NSCL, a gene encoding a helix-loop-helix protein expressed in the developing nervous system.

We report here the molecular cloning and chromosomal localization of an additional member of the helix-loop-helix (HLH) family of transcription factor...
2MB Sizes 0 Downloads 0 Views