REVIEWS

Studies of Drosophila embryogenesis suggest that a sequential activation of a hierarchy of regulatory genes occurs during the development of multicellular organisms 1. These genes regulate the transformation of genetic information into morphological structure by orchestrating a precise temporal and spatial expression of effector genes, which in turn govern cell type specificities and eventually organ identity. Numerous developmental control genes have been isolated from Drosophila and Caenorhabditis elegans by genetic means. A similar approach has not been possible in vertebrates, since too few developmental mutants are available and they are hard to generate. Instead, a variety of mouse genes have been identified from their sequence similarity to Drosophila regulatory genes. Many of these, such as the Hox and Pax genes, are probably involved in pattern formation during or after gastrulation 2,3.

An alternative approach to isolating regulatory genes is to use promoter and enhancer elements to screen for trans-activating factors playing a role in developmental processes. By this method a novel family of transcription factors, the POU family, was discovered; certain members of this family are expressed at pregastrulation stages. The POU family was defined by the sequence homology of three mammalian transcription factors and one nematode regulatory protein (Pit-1/GHF-1, Oct-1, Oct-2 and Unc-86) 4. Subsequently, several other members of this family have been identified and characterized. These proteins have two conserved domains: a homeodomain distantly related to the prototype Antennapedia homeodomain 5 and, nearby on the amino-terminal side, a domain designated the POU-specific domain. The regions outside these two domains are highly divergent and contain domains required for transcriptional activation.

Octamer-binding proteins and the POU family

The octamer motif (ATGCAAAT) is a DNA sequence required for both ubiquitous and tissue-specific expression of various genes 6. It was first identified in the promoters of the histone H2B and the light and heavy chain immunoglobulin genes. The apparent paradox that the same element is required both for ubiquitous and B-cell-specific gene expression was resolved when two different proteins interacting with this sequence were characterized and cloned. Oct-1 was present in all cell types tested, while Oct-2 was detected only in B lymphocytes. Subsequently, several additional proteins that bind to the octamer motif have been identified at various developmental stages and in a variety of organs and tissues of the mouse (Fig. 1). Those proteins characterized so far all belong to the POU family (Table 1). cDNAs corresponding to several other POU genes have been cloned 7 but the proteins encoded by them have not been tested for DNA binding.

Octamania: The POU factors in murine development

HANS R. SCHÖLER

Much effort has been directed towards the investigation of regulatory processes in the early mouse embryo. Several multigene families of developmental control genes have been identified. The POU family is a group of related transcription factors containing a particular type of bipartite DNA-binding domain. Members of this family show distinct expression patterns during embryonic development. Two members, Oct-4 and Oct-6, are expressed as early as in the preimplantation embryo and thus may regulate early events of murine development.

[Fig. 1 band labels: Oct-1; N-Oct-2; N-Oct-3; Oct-4A; Oct-4B; Oct-5; Oct-6; Oct-7; Oct-8; Oct-9; Oct-10; MiniOct-2]

Structure of the POU region

The conserved POU-specific domain and POU homeodomain are 81 and 60 amino acids long, respectively, and are separated by 14-26 variable amino acids. In the POU-specific domain, clusters of acidic amino acids are located near both ends. The POU-specific domain contains 28 invariant residues clustered in two subdomains of particularly high sequence homology (Fig. 2A). Each subdomain has two features in common: a cluster of basic amino acids in the center and a predicted α-helix at the carboxy-terminal end. The homeodomain is a DNA-binding domain that plays a central role in eukaryotic gene regulation 5. It contains three well-defined helices, with helices 2 and 3 forming a helix-turn-helix motif similar to that in a number of prokaryotic DNA-binding proteins 8. However, the homeodomain helices are longer, and binding to DNA differs in several respects 9. In the prokaryotic protein-DNA complex the second helix of the helix-turn-helix motif, often referred to as the recognition helix, makes critical

FIG. 1. Survey of murine octamer-binding proteins with the electrophoretic mobility shift assay. Extracts of newborn mouse cerebellum and embryonic stem cells (D3) were incubated with a DNA fragment containing the octamer motif and subjected to polyacrylamide gel electrophoresis. The gel was dried and exposed to X-ray film. These two sources represent all known octamer-binding proteins. All except Oct-4 and Oct-5 appear to be expressed in brain. The proteins were numbered according to their mobility, which reflects the size, charge and shape of the protein-DNA complexes.

TIG OCTOBER 1991 VOL. 7 NO. 10
©1991 Elsevier Science Publishers Ltd (UK)

TABLE 1. Properties of murine Oct factors

Oct-1
  Additional names: Human: OTF-1; NF-A1; NF-III; OBP100
  Size of protein (a): 743
  Gene (b): Oct-1 (1)
  Expression, embryo: Ubiquitous
  Expression, adult: Ubiquitous
  Binding sites: ATATGATAATGA; TAATGARAT; CTCATGA (heptamer)
  Refs: 6, 24

Oct-2
  Additional names: Human: OTF-2; NF-A2
  Size of protein (a): Oct-2A: 451, 463; Oct-2B: 583
  Gene (b): Oct-2 (7)
  Expression, embryo: Neural tube; in entire brain except telencephalon
  Expression, adult: Lymphoid cells; nervous system; intestine; testis; kidney
  Binding sites: CTCATGA (heptamer)
  Refs: 6, 24, 28, 42

N-Oct-2
  Additional names: Human: N-Oct2
  Expression, embryo: Nervous system
  Expression, adult: Nervous system (astrocytes, certain glioblastoma and neuroblastoma cell lines)
  Refs: 24, 25, 28, 42

MiniOct-2
  Size of protein (a): 232
  Gene (b): Oct-2 (7)
  Expression, embryo: Nervous system (strong expression in developing nasal neuroepithelium)
  Expression, adult: Nervous system; primary spermatids
  Refs: c

N-Oct-3
  Expression, embryo: Nervous system
  Expression, adult: Nervous system (certain glioblastoma and neuroblastoma cell lines)
  Refs: 24, 25, 42

Oct-4A, Oct-4B, Oct-5
  Additional names: NF-A3; Oct-3
  Size of protein (a): 352; 324
  Gene (b): Oct-4 (17)
  Expression, embryo: Totipotent and pluripotent stem cells of the pregastrulation embryo, embryonic ectoderm, primordial germ cells; testis, ovary
  Expression, adult: Oocytes
  Binding sites: TTAAAATTCA
  Refs: 17-21, 23, 24

Oct-6
  Additional names: Rat: Tst-1; SCIP
  Size of protein (a): 448, 449
  Gene (b): Oct-6 (4)
  Expression, embryo: Blastocyst; ES and EC cells; brain
  Expression, adult: Nervous system; testis
  Binding sites: TAATGARAT; TTAAAATTCA; CTCATGA (heptamer)
  Refs: 21, 24-27, d

Oct-7
  Additional names: Human: N-Oct4
  Expression, embryo: Nervous system
  Expression, adult: Nervous system
  Refs: 24, 25, 42

Oct-8, Oct-9, Oct-10
  Additional names: Human: N-Oct5a; N-Oct5b
  Expression, embryo: Nervous system
  Expression, adult: Nervous system (astrocytes)
  Refs: 24, 25, 42

(a) Number of amino acids, determined from cDNA.
(b) Chromosome location in parentheses.
(c) A. Stoykova and P. Gruss, pers. commun.
(d) H. Rohdewohld and P. Gruss, pers. commun.
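The binding sites quoted in the text and in Table 1 (the octamer ATGCAAAT, TAATGARAT and the heptamer CTCATGA) are short sequence patterns, one of which contains the degenerate symbol R (purine, A or G). As a minimal sketch of how such motifs could be located in a DNA sequence, the Python below scans both strands with regular expressions. The function names, the `MOTIFS` dictionary and the test sequence are hypothetical illustrations and do not come from the review; a sequence match says nothing about whether a given Oct factor actually binds there.

```python
import re

# IUPAC symbols used here; R (purine) is the only degenerate symbol
# that occurs in the motifs quoted in the text, Y and N are included
# so that reverse complements can be expressed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")

# Motifs taken from the text and Table 1 (names are illustrative).
MOTIFS = {
    "octamer": "ATGCAAAT",
    "TAATGARAT": "TAATGARAT",
    "heptamer": "CTCATGA",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA or IUPAC motif string."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a motif into a regex; the lookahead reports overlapping hits."""
    return re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")


def find_sites(seq, motifs=MOTIFS):
    """Return (name, strand, start, matched_text) tuples sorted by start."""
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        patterns = [("+", motif)]
        rc = reverse_complement(motif)
        if rc != motif:  # avoid reporting palindromic motifs twice
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            for match in motif_regex(pattern).finditer(seq):
                hits.append((name, strand, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Hypothetical test sequence: heptamer, octamer, then an octamer
    # on the opposite strand.
    toy = "GGCTCATGAATATGCAAATCCATTTGCATG"
    for hit in find_sites(toy):
        print(hit)
```

Start positions are 0-based and refer to the strand as given; a hit on the minus strand is reported at the position of its reverse complement in the input.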

contacts via residues near its amino-terminal end. In contrast, such residues in the homeodomain of the Drosophila engrailed protein are near the center of the recognition helix 9.

POU homeodomains contain 22 invariant residues, ten of which are identical to those in the 'classical' Antennapedia homeodomain (Fig. 2B). Clusters of basic amino acids are present at the amino- and carboxy-terminal boundaries. POU homeodomains probably also contain three α-helices 9,10. The recognition helix is the most conserved part of the POU homeodomain, containing 11 invariant residues. The RVWFCN motif therein harbors a cysteine residue, at position nine of the helix, that is specific for the POU family, whereas the other residues are extremely well conserved in all homeodomain proteins. A serine residue at the same position is typical of the paired-type homeodomain, whereas the corresponding residue in most other homeodomains is glutamine. When the cysteine residue in the recognition helix of Pit-1 is changed to a glutamine, specificity and affinity for the Pit-1-binding site are unaffected, indicating that other amino acid residues outside this region are critical for recognition 11. Thus, for the POU family the term 'recognition helix' is not entirely valid for helix 3.

Pairwise comparison of all POU homeodomain and POU subdomain sequences divides the POU family into five classes (Fig. 2C) 7. This classification is supported by the similarity of the linker sequence between the POU-specific domain and POU homeodomain among members of classes II and III. The rat genes Brn-1 and Brn-2 probably encode proteins that bind to the octamer motif. Both are widely expressed in the developing and adult brain and are probably rat homologs of mouse Oct proteins. They are grouped in class III, with a recognition helix identical to that of Oct-6 and a POU-specific domain that differs at only one position 7.
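The residue at position nine of helix 3, discussed above, acts as a family signature: cysteine in POU homeodomains, serine in paired-type homeodomains and glutamine in most others. The following Python is a minimal sketch of that lookup. The function name and the toy helix strings are hypothetical; the `X` characters are placeholders for residues the text does not specify, and only the RVWFCN motif is taken from the review. As the Pit-1 mutation result indicates, this residue marks family membership and should not be read as a predictor of DNA-binding specificity.

```python
# Signature residue at position nine of homeodomain helix 3,
# as described in the text.
SIGNATURE_AT_POSITION_9 = {
    "C": "POU",
    "S": "paired-type",
    "Q": "most other homeodomains",
}


def classify_recognition_helix(helix):
    """Classify a helix 3 sequence by its residue at position nine (1-based)."""
    if len(helix) < 9:
        raise ValueError("need at least nine residues of helix 3")
    residue = helix[8].upper()  # index 8 is position nine
    return SIGNATURE_AT_POSITION_9.get(residue, "unclassified")


if __name__ == "__main__":
    # Toy inputs: X marks unspecified residues; RVWFCN spans positions 5-10.
    print(classify_recognition_helix("XXXXRVWFCN"))  # cysteine at position nine
    print(classify_recognition_helix("XXXXRVWFSN"))  # serine at position nine
```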


[Fig. 2 panels: (A) POU-specific domain, with subdomains A and B; (B) POU homeodomain; (C) POU classes: class I, Pit-1; class II, Oct-1, Oct-2; class III, Oct-6, Brn-1, Brn-2, Cf1a, Ceh-6; class IV, Brn-3, Unc-86; class V, Oct-4.]

FIG. 2. Consensus sequence of the POU region and highlighted structural properties. The POU region consists of two well-conserved domains: the POU-specific domain (A) and POU homeodomain (B), connected by a variable linker sequence of 14-26 amino acids. The POU-specific domain contains subdomains A and B, two regions of particularly high homology. Each subdomain has a predicted α-helix at the carboxy-terminal boundary. The three helices indicated in the POU homeodomain correspond to those established for Antennapedia and engrailed homeodomains; positions identical to the 'classical' Antennapedia homeodomain are underlined with thick bars. Large letters represent residues found in all known POU proteins; two small letters represent two possible amino acids at that position, the upper letter representing the one found more frequently; a single small letter represents the predominant amino acid when more than two are found at that position. Hyphens indicate variable positions. Clusters of acidic (-) and basic (+) amino acid residues are indicated. According to the homology between the POU domains and POU subdomains, POU proteins can be subdivided into five classes (C). The classification holds for the linker sequences of classes II and III, but not for class IV.
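The notation described in the legend to Fig. 2 can be read as a small set of rules applied to each column of a sequence alignment. The Python below is a minimal sketch of those rules under stated assumptions: the legend does not say when a position with more than two residues counts as 'predominant' rather than 'variable', so the `predominant_fraction` threshold of 0.5 is an assumption, as are the function name, the `a/b` rendering of the stacked small letters and the toy alignment, which is not real POU sequence data.

```python
from collections import Counter


def consensus(alignment, predominant_fraction=0.5):
    """Summarize aligned sequences column by column.

    Invariant residue            -> upper-case letter
    Exactly two residues         -> 'a/b', more frequent residue first
    More than two, one dominant  -> lower-case letter of the dominant residue
    Otherwise                    -> '-' (variable position)
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    symbols = []
    for column in zip(*alignment):
        counts = Counter(column).most_common()  # sorted by frequency
        top_residue, top_count = counts[0]
        if len(counts) == 1:
            symbols.append(top_residue.upper())
        elif len(counts) == 2:
            symbols.append("/".join(residue.lower() for residue, _ in counts))
        elif top_count / len(column) >= predominant_fraction:  # assumed cutoff
            symbols.append(top_residue.lower())
        else:
            symbols.append("-")
    return symbols


if __name__ == "__main__":
    # Hypothetical five-sequence alignment, for illustration only.
    toy_alignment = ["RVWALC", "RVWSLC", "KVWTLC", "RIWGMC", "RVWAIC"]
    print(consensus(toy_alignment))
```

Returning a list of per-column symbols instead of a single string keeps the two-residue positions unambiguous in plain text.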

Function of the POU region

Whereas an intact homeodomain is required for DNA binding of all POU proteins, the contribution of the POU-specific domain varies, depending both on the DNA-binding site and on the POU protein. The POU homeodomain of Pit-1 is sufficient for low-affinity binding to AT-rich DNA sequences, but addition of the POU-specific domain increases both the affinity and specificity of binding to DNA 11. The recognition helix and the predicted helix A must be intact for efficient DNA binding, whereas (in contrast to the engrailed homeodomain 9) the other helices play only a minor role.

The contribution of the Oct-1 homeodomain and POU-specific domain to DNA binding has been studied in detail for the Ad2 octamer motif 12,13. The POU-specific domain and the homeodomain each make half of the base contacts with the octamer sequence, whereas the majority of the backbone contacts are made by the homeodomain. Since base contacts determine the sequence specificity of a DNA-binding protein, while the binding energy stems mainly from electrostatic interactions between the protein and the DNA backbone 14, the homeodomain of Oct-1 contributes most of the binding energy, whereas the POU-specific domain provides half of the specificity for the octamer motif. The Oct-1 homeodomain binds with significantly lower affinity to the octamer motif than does the complete POU region. However, with other Oct-1-binding sites (Table 1), addition of the POU-specific domain of Oct-1 has no effect or can decrease affinity 12.

The POU-specific domain is also required for protein-protein interactions 11, which can confer additional regulatory specificity. An inhibitory effect of one POU factor on another has recently been described for two Drosophila proteins, I-POU and Cf1-a (Ref. 15). Cf1-a is monomeric in solution but dimeric when it binds to the Cf1-a regulatory site in the dopa decarboxylase gene. I-POU, which is closely related to Unc-86, lacks two of the five basic residues that are present in the amino-terminal portion of other POU homeodomains. Since these residues are critical for binding to DNA, I-POU alone does not bind DNA and remains a monomer in solution. Coexpression of both factors leads to stable heterodimers, resulting in inhibition of DNA binding and suppression of transcription. Both POU regions are presumably required for dimerization, although this has only been shown for the POU region of Cf1-a.

The POU-specific domain does not contain a known DNA-binding structure, and attempts to show binding of an isolated POU-specific domain to certain DNA motifs have failed. However, this does not exclude the possibility that the POU-specific domain can bind by itself to some other DNA motif. In this respect, it may be helpful to compare POU proteins with members of the Pax family 16. The paired domain of these proteins is associated with a paired-type homeodomain in certain Drosophila and mouse proteins, perhaps constituting a domain analogous to the POU region (Fig. 3). Interestingly, several Pax proteins lacking a homeodomain bind via their paired domain to specific DNA sequences that are distinct from homeodomain-binding sites (G. Chalepakis and P. Gruss, pers. commun.). Perhaps homeodomain-less POU proteins also exist, although none has so far been reported.

Expression of POU factors in mouse development

Most if not all POU genes are differentially expressed throughout embryogenesis. However, unlike the Hox and Pax gene families, no unifying regularity of their expression patterns is obvious.

Oct-4 (also called Oct-3 and NF-A3) and Oct-6 are expressed in early embryogenesis 17-25. Oct-4 is found in the totipotent and pluripotent stem cells of the pregastrulation embryo and is downregulated during differentiation of these cells, eventually becoming confined to the germ-line lineage. The expression pattern of Oct-4 (detailed in the legend to Fig. 4) thus seems to correlate with an undifferentiated cell phenotype. In embryonic stem (ES) and embryonal carcinoma (EC) cell lines, both Oct-4 and Oct-6 are downregulated when the cells are induced to differentiate by retinoic acid.

In addition to its expression at early embryonic stages, Oct-6 is found in specific neurons of the developing and adult brain and also in testis. SCIP/tst-1, the rat homolog of Oct-6, is expressed postnatally by developing Schwann cells of the peripheral nervous system. Because SCIP/tst-1/Oct-6 is only transiently expressed during the period of rapid cell division separating the premyelinating and myelinating phases of Schwann cell differentiation, it might play a role in the progressive determination of these cells2

[Fig. 3 diagram: prototype structure, a specific domain linked by a variable sequence to a homeodomain (60); paired domain (128) with paired homeodomain (60): Pax-3, Pax-6, Pax-7; paired domain (128) only: Pax-1, Pax-2, Pax-8; POU-specific domain with POU homeodomain: Oct factors, Pit-1; POU-specific domain only: homeodomain-less POU protein (?).]

FIG. 3. Comparison of two different families containing a homeodomain and a specific domain. In the POU and in the Pax families (U. Deutsch, pers. commun. and Ref. 16) the homeodomain is linked via a variable sequence to a domain specific for each family. Certain members of the Pax family lack the homeodomain, whereas homeodomain-less POU proteins have not been described. Examples for each combination of homeodomain and specific domain are presented.

mutations prevent the differentiation of mother cells into daughter cells: one of the two daughter cells of a division fails to assume its normal fate and instead propagates the phenotype of the mother cell. Thus Unc-86 represents an important component of a mechanism linking cell identity to cell lineage 30.

The expression patterns of some POU genes are increased in complexity by differential splicing. Oct-2 splicing variants are found in distinct regions of the murine nervous system 28. One of these transcripts encodes a protein consisting almost entirely of the Oct-2 POU domain, hence its name MiniOct-2 (A. Stoykova and P. Gruss, pers. commun.). During development, MiniOct-2 RNA expression is very high in the olfactory neuroepithelium. In adult brain it is found in the mitral cell layer of the olfactory bulb. The olfactory system, including the mitral cells, is the only region in the mammalian nervous system where growth and differentiation of sensory neurons continue throughout adult life 31. As a result, the olfactory elements are constantly reinnervated and are kept in a proliferative state.

The Pit-1/GHF-1 gene is transiently expressed in the neural tube and later reappears in the pituitary. The development of this organ results in the temporally precise appearance of five distinct cell types that are derived from a common lineage 32. In the mouse, Pit-1/GHF-1 transcripts are detected within 24 h of the first observable events in anterior pituitary differentiation (13.5 days p.c.). Pit-1/GHF-1 protein is not detected until about three days later and correlates with the onset of growth hormone and prolactin expression 29. Therefore, Pit-1/GHF-1 expression appears to be controlled at both the transcriptional and post-transcriptional levels. Snell dwarf (dw) and dwarf Jackson (dwJ) mice, which contain disrupted Pit-1/GHF-1 genes, lack growth hormone, prolactin and thyroid-stimulating hormone, and show hypoplasia of the target cell types of these hormones 33. The Pit-1/GHF-1 protein of Snell dwarf mice cannot bind to its target sequence because the tryptophan
