J. theor. Biol. (1991) 153, 181-194

A Stochastic Model for Gene Induction

MINORU S. H. KO

Furusawa MorphoGene Project, Exploratory Research for Advanced Technology (ERATO), Research Development Corporation of Japan (JRDC), 5-9-6 Tohkohdai, Tsukuba 300-26, Japan

(Received on 12 October 1990, Accepted in revised form on 22 March 1991)

Expression levels of individual copies of an inducible gene have been presumed to be identical to the averaged level of many copies and to change in a smooth and predictable way according to the concentration of an inducing molecule. However, our recent experiments using a steroid-inducible system showed that the expression levels of individual copies are very heterogeneous and do not necessarily coincide with the averaged expression level of many copies (Ko et al., 1990, EMBO J. 9, 2835-2842). To explain this result, I present here a stochastic model for gene induction and its analysis by computer simulation. Stochasticity in the model derives from the random timing of molecular collisions and dissociations between transcription factors and a gene copy, since at any instant each copy is thought to be either "switched on" by having a transcription complex bound to it, or "switched off" by not having a transcription complex bound. This model can produce two types of gene induction that depend on the stability of the transcription complex on the regulatory region of the gene. An unstable transcription complex causes a homogeneous level of gene induction among individual copies, while a stable transcription complex causes a heterogeneous level. Since the recent consensus formed by in vitro transcription experiments is that the transcription complex is generally very stable, the latter case (the non-deterministic one) is highly likely.
Since typical eukaryotic cells have just two copies for any gene in a single cell, this possibility of heterogeneous gene induction indicates that the phenotypes of individual cells cannot be precisely determined by just environmental signals, such as inducers. This may prompt us to reconsider many problems related to gene induction, including morphogenesis.

1. Introduction

Induction or repression of a battery of genes by environmental signals plays a central role in determining the fate of cell differentiation. For a typical eukaryotic inducible gene, the expression level of an individual copy is thought to change in a smooth and predictable way according to the concentration of an inducing molecule (say a steroid hormone or a retinoid). This is a general basis for theories related to gene induction and for experiments using inducible gene expression systems. Since this notion is extrapolated from results obtained by measuring the averaged activity of many gene copies, it is essential to know whether the expression levels of the individual gene copies are really determined in a manner dependent on inducer concentration. Accordingly, development of both theoretical and experimental strategies to analyze the behavior of individual gene copies becomes essential.

0022-5193/91/220181 + 14 $03.00/0

© 1991 Academic Press Limited


Experimentally, we previously reported such attempts using a glucocorticoid-inducible gene expression system (Ko et al., 1990). In this paper, a theoretical approach to this problem is presented. Recent studies of gene expression regulation have elucidated many intriguing mechanisms and serve to illustrate how genes are switched on and off (for reviews see Johnson & McKnight, 1989; Saltzman & Weinmann, 1989; Ptashne & Gann, 1990). In brief, the cis-acting transcriptional regulatory regions, that is, the enhancer and promoter regions, are located upstream of the transcribed DNA sequences and are specific binding domains for the trans-acting factors, or transcription factors. These proteins bind to the enhancer/promoter region and form a transcription complex, presumably by protein-protein interactions. This complex is then recognized by RNA polymerase and transcription begins. Thus, at any instant each gene copy is either "switched on" by having a transcription complex bound to it, or "switched off" by not having a transcription complex bound. If we were to measure directly the transcription rate of each gene copy, we should find it fluctuating randomly between two quantized levels: the randomness corresponds to the random timing of molecular collisions and dissociations, and the quantized levels correspond to the fact that we are in effect observing the association of a single molecule with its ligand-activated transcription factor. Based on this concept, I have built a stochastic model and analyzed it in a quantitative fashion by computing the expression levels of individual gene copies. I have found two types of gene induction that depend on the half-life of the "on" state of the gene and, therefore, on the extent of transcription complex stability. During gene induction, one generally observes not the instantaneous rate of gene transcription, but the quantity of gene product accumulated over some characteristic time interval.
When the accumulation time is long compared with the half-life of the "on" state of the gene, the quantity of gene product represents a time-average of the state of the gene, and varies in a more or less smooth and predictable way with the concentration of the inducer. I call this a homogeneous or deterministic gene induction type. In contrast, when the accumulation time is relatively short, or equivalently when the half-life of the "on" state is long, we should observe random fluctuations. I call this a heterogeneous or non-deterministic gene induction type. To the best of my knowledge, there is no clear example of the former case, in which a homogeneous and concentration-dependent induction occurs at the level of a single gene copy. However, the existence of heterogeneous, or non-deterministic, gene induction was supported by our recent results in which the induction levels of individual gene copies were shown to be heterogeneous in a steroid hormone-inducible gene expression system (Ko et al., 1990). In the last part of this paper, the problems of gene induction prompted by previous and current results, especially as they relate to morphogenesis, are discussed.

2. The Model

For simplicity, the model is limited to consideration of genes transcribed by RNA polymerase II, although essentially the same argument can be applied to the transcription systems of RNA polymerase I and III. Recent in vitro transcription experiments show that the assembly of some trans-acting factors containing TATA-binding proteins on the enhancer/promoter region of the target gene switches on transcription by RNA polymerase II (for reviews, see Lillie & Green, 1989; Johnson & McKnight, 1989; Saltzman & Weinmann, 1989; Ptashne & Gann, 1990). At least part of this complex probably remains bound to the promoter region throughout multiple rounds of transcription (Hawley & Roeder, 1987; Van Dyke et al., 1989). Figure 1 shows a simplified illustration of this process. Although a single transcribed region usually has several cis-elements, including a TATA-box, a CAAT-box and enhancer sequences, the simplified model gene of Fig. 1 has only two such regulatory sequences (RS1, RS2). The system begins when a defined concentration of inducer is added. By binding an inducer molecule, an inactive trans-acting factor (designated as trans-acting factor 1) molecule is transformed into an active trans-acting factor 1 molecule (step a). The concentration of the active trans-acting factor 1 is proportional to the concentration of the inducer. In this model, activation of the trans-acting factor 1 is brought about by the inducer, although other mechanisms, such as phosphorylation or de novo synthesis, are also possible. In the next step, the active trans-acting factor 1 binds to RS1 (step b). This is the rate-limiting step for the overall process. A second trans-acting factor (designated as trans-acting

[Figure 1: model gene with two regulatory sequences (RS1, RS2) upstream of the transcribed region, showing the inducer, inactive and active trans-acting factor 1, the activation process of trans-acting factor 1, and RNA polymerase II.]

FIG. 1. A schematic representation of gene induction.


factor 2) then binds to RS2 (step c). This process may be facilitated by pre-existing bound trans-acting factor 1 through protein-protein interactions. RNA polymerase II recognizes this transcription complex, comprising the trans-acting factors bound to the enhancer/promoter (step d), and RNA synthesis begins (step e). While the transcription complex remains bound to the enhancer/promoter region (step f), RNA polymerase II repeatedly recognizes this transcription complex (step d) and RNA is synthesized (step e). Once the transcription complex dissociates from the enhancer/promoter sequences, the whole process is initialized (step g). There are several essential features required for the success of this simplified model: (i) the concentration of the active trans-acting factor 1 is proportionally determined by the concentration of the inducer; (ii) binding of one trans-acting factor activated by the inducer, whether the first one or the second one, to the enhancer/promoter region is a rate-limiting step for the whole process; (iii) when the transcription complex remains on the enhancer/promoter region, RNA polymerase II can recognize it and initiate multiple rounds of transcription; and (iv) during the existence of the transcription complex, the transcription rate by RNA polymerase II is constant. The first point (i) is valid under ordinary conditions where many molecules of inducer and trans-acting factor 1 are present. The second point (ii) seems to be a consensus. In inducible gene expression systems, trans-acting factors that already exist in a cell, yet are somehow masked in their activating function, are known to be rate-limiting for formation of the transcription complex on the enhancer/promoter region (for review see Johnson & McKnight, 1989). The third point (iii) is established by in vitro transcription experiments (Hawley & Roeder, 1987; Van Dyke et al., 1989).
The fourth point (iv) is a reasonable assumption, if the concentration of RNA polymerase II and other accessory factors for transcription is sufficient. To simulate the behavior of individual gene copies, we have to trace every step as a temporal sequence for each individual copy. Figure 2 shows the transition diagram for a single gene molecule based on the above model. There are two states (A and B) of the gene in this system, because it is supposed that the binding of trans-acting

[Figure 2: transition diagram with a Start node and arrows labeled P1, 1-P1, P2 and 1-P2.]

FIG. 2. Transition diagram of a single gene for computer simulation.


factor 2 (step c) and the recognition by RNA polymerase II (step d) occur immediately after the binding of active trans-acting factor 1 (step b). Thus, the time for steps c and d is negligible, since step b is the rate-limiting step for the overall process. The system starts at time 0 immediately after the addition of an inducer. The concentration of active trans-acting factor 1 is determined proportionally according to the concentration of the inducer. At every unit time, the state of the gene proceeds along an arrow of the transition diagram. A gene copy in state A produces no transcribed RNA molecules, while a gene copy in state B produces a defined number of RNA molecules. Since collisions and dissociations of molecules are stochastic, these processes are properly represented by probabilities. The probability of step b is referred to here as P1 and is a function of the number of available active trans-acting factor 1 molecules and the affinity between active trans-acting factor 1 and RS1. This affinity is assumed to be constant through the whole process, so that P1 becomes a function of the number of molecules of active trans-acting factor 1 alone. The probability of initialization (step g) is represented by P2 and is a function of the stability of the transcription complex. Since many proteins interact with each other in the transcription complex and stabilize the complex, P2 is not simply a function of the affinity between active trans-acting factor 1 and RS1. It is worth noting that the rate-limiting trans-acting factor (here the active trans-acting factor 1) is not necessarily involved in the transcription complex, but might just be a trigger for the formation of the transcription complex. In such a case, a trans-acting factor 3 should be included in the model for stabilizing the transcription complex by protein-protein interactions. When P1 and P2 are fixed, we can trace every transition of gene copies individually.
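The tracing of individual gene copies through the two-state diagram can be illustrated with a minimal Python sketch. It is not the program used for the simulations reported here; it is one possible reading of the diagram under stated assumptions. The function names (`simulate_gene_copy`, `simulate_population`), the choice to count RNA production in the same unit time in which the copy enters state B, the default of five RNA molecules per unit time, and the numerical values of P1 and P2 in the demonstration are all assumptions made for illustration.

```python
import random
import statistics


def simulate_gene_copy(p1, p2, n_steps, rna_per_step=5, rng=None):
    """Trace one gene copy for n_steps unit times; return accumulated RNA.

    p1: probability per unit time of the A -> B transition (step b).
    p2: probability per unit time of the B -> A transition (step g).
    rna_per_step: RNA molecules produced per unit time in state B
                  (illustrative value; degradation is ignored).
    """
    if rng is None:
        rng = random.Random()
    on = False  # every copy starts in state A ("switched off")
    total = 0
    for _ in range(n_steps):
        r = rng.random()  # uniform random number in [0, 1)
        if on:
            if r < p2:    # transcription complex dissociates
                on = False
        elif r < p1:      # active trans-acting factor 1 binds to RS1
            on = True
        if on:            # assumption: production counted after the transition
            total += rna_per_step
    return total


def simulate_population(p1, p2, n_steps, n_copies, seed=0):
    """Simulate n_copies independent gene copies with a seeded generator."""
    rng = random.Random(seed)
    return [simulate_gene_copy(p1, p2, n_steps, rng=rng) for _ in range(n_copies)]


if __name__ == "__main__":
    # Hypothetical parameter sets with the same P1/P2 ratio, so the long-run
    # fraction of time spent in state B is the same in both cases.
    for label, p1, p2 in [("unstable complex", 0.5, 0.5),
                          ("stable complex", 0.005, 0.005)]:
        levels = simulate_population(p1, p2, n_steps=200, n_copies=1000)
        print(label, "mean:", statistics.mean(levels),
              "s.d.:", statistics.pstdev(levels))
```

In this sketch the time a copy spends in state B before dissociation is geometrically distributed with mean 1/P2 unit times, and the long-run fraction of time spent in state B is P1/(P1 + P2). Comparing the spread of accumulated RNA between the two parameter sets gives a rough feel for how the stability of the complex, relative to the accumulation time, affects the variation among copies.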
When the system starts, the computer produces a random positive number (≤ 1). If that number is smaller than P1, the state of the gene proceeds to state B. In this state, RNA polymerase II produces a constant amount of transcripts per unit time (arbitrarily defined here as five molecules of RNA per unit time). For simplification, degradation of transcripts is not considered, and products accumulate. In the next step, the computer again produces a random positive number (-
