Molecular BioSystems View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

PAPER

Cite this: DOI: 10.1039/c3mb70483a

View Journal

Steady state analysis of the genetic regulatory network incorporating underlying molecular mechanisms for anaerobic metabolism in Escherichia coli† Sumana Srinivasan‡ and Kareenhalli Viswanath Venkatesh*‡ A Gene Regulatory Network (GRN) represents complex connections between genes in a cell which interact with each other through their RNA and protein expression products, thereby determining the expression levels of mRNA and proteins required for functioning of the cell. Microarray experiments yield the log fold change in mRNA abundance and quantify the expression levels for a GRN at the genome level. While Boolean or Bayesian modeling along with expression and location data are useful in analyzing microarray data, they lack underlying mechanistic details present in GRNs. Our objective is to understand the role of molecular mechanisms in quantifying a GRN. To that effect, we analyze under steady state, the complete GRN for the central metabolic pathway during anaerobiosis in Escherichia coli. We simulate the microarray experiments using a steady state gene expression simulator (SSGES) that models molecular mechanistic details such as dimerization, multiple-site binding, auto-regulation and feedback. Given a GRN, the SSGES provided the log fold change in mRNA expression values as the output, which can be compared to data from microarray experiments. We predict the log fold changes for mutants obtained by knocking out crucial transcriptional regulators such as FNR (F), ArcA (A), IHFA-B (I) and DpiA (D) and observe a high degree of correlation with previously reported experimental data. We also predict the microarray expression values for hitherto unknown combinations of deletion mutants. We hierarchically cluster the predicted log fold change values for these mutants and postulate that E. coli has evolved from a predominantly lactate secreting (FAID mutant) into a mixed acid secreting phenotype as seen in the wild type (WT) during anaerobiosis. Upon simulating a model without

Received 31st October 2013, Accepted 22nd November 2013

incorporating the mechanistic details, not only the correlation with the experimental data reduced

DOI: 10.1039/c3mb70483a

mutant FAID. This clearly demonstrates the significance of incorporating mechanistic data while

considerably, but also the clustering of expression data indicated WT to be closer to the quadruple quantifying the expression profile of a GRN which can help in predicting the effect of a gene mutant

www.rsc.org/molecularbiosystems

and understanding the evolution of transcriptional control.

Introduction Gene Regulatory Networks (hereafter referred to as GRNs) are responsible for timely and need based expression of genes involved in various cellular processes such as cell division and reproduction,1 cell development2 and disease progression.3 Data from microarray experiments are used to quantify a GRN at a specific phenotypic state relative to a standard condition. Quantification and analysis of microarray data can provide Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India. E-mail: [email protected]; Fax: +91-22-2572 6895; Tel: +91-22-2572 7223 † Electronic supplementary information (ESI) available. See DOI: 10.1039/ c3mb70483a ‡ SS and KVV contributed equally to this work.

This journal is © The Royal Society of Chemistry 2014

insights into the operation of GRNs. Various methods have been used to analyze GRNs and the microarray data. None of the methods utilize the mechanistic details of molecular interactions inherent in the GRN. This is mainly due to lack of such information or availability of a partial interaction map for a specific GRN. Microarray experiments have played a pivotal role in evaluating GRNs at the genome level.4–6 They track the comparative mRNA abundance ratio between a condition of interest and a reference condition. For example, gene expression data for Escherichia coli upon exposure to UV irradiation provide us the fold expression level of genes during the process of DNA damage repair as compared to the normal condition.7 Upon availability of such data, researchers may follow a data mining approach (e.g. clustering, PCA),8,9 a model based approach

Mol. BioSyst.

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Paper

(e.g. analysis based on Boolean and Bayesian approaches)10 or models combining other data such as transcription factor location with expression data.11,12 The data mining approaches provide insights into the relatedness and inferred functionality of genes whereas model based approaches provide insights into the underlying architecture and properties of GRNs. Boolean13 and Bayesian14 approaches have been useful in analyzing data for large networks. The major advantage of these models is that they do not require any parameter values but lack an explicit representation of the regulatory molecular mechanisms thus missing the operational characteristic of the GRN. Approaches using time dependent functions15 to study dynamic expression patterns have also been reported but they do not capture mechanistic details and therefore have minimum predictive capabilities. The differential equation approach16 does take into account the dynamics of the underlying mechanism of gene regulation but requires kinetic parameters which are hard to obtain. Recently, Cao and Zhao17 proposed a generalized profiling method to estimate ODE parameters in a gene regulation network from noisy microarray gene expression data using a technique called the penalized smoothing technique, a computational method that approximates ODE solutions. Stochastic approaches18 consider network dynamics and intrinsic noise while modeling the GRN. However, when the number of genes in the network is scaled, these models become computationally complex. At the level of single cell analysis, when noise is significant and cell to cell variability data are known, stochastic modeling is feasible. However, in GRNs with a large number of genes and with lack of knowledge of cell–cell variability, deterministic models have been known to provide valuable insights. A detailed review of existing techniques19 for modeling and analysis of GRN that includes a schematic comparison based on the ability to model reality faithfully, size of analyzed models, level of detail, amount of data required to be modeled etc. indicates that the choice of any given method depends on the size of the GRN (number of genes and interactions) and amount of information available. Steady state modeling is a strategy wherein mechanistic details can be incorporated, namely, protein–DNA binding, protein–protein binding, stoichiometry, multiple binding sites, cooperativity, autoregulation, feedback and translocation.20 The quantification of these mechanisms requires equilibrium dissociation constants and total effector protein concentrations. Such a methodology has been effectively used in analyzing small subnetworks of GRNs such as a l-bacteriophage,21 the tryptophan operon in E. coli,22 the lac operon in E. coli,23 the GAL regulatory system in S. cerivisae24,25 etc. Recent studies in establishing consistency between existing knowledge and microarray experiments indicate that when effector metabolites of regulatory proteins are not considered the statistical significance falls by 40%.26 Hence, there is a need for modeling the effect of regulatory proteins and the expression of those genes via important regulatory transcription factors. This will help in determining the connection between underlying regulatory interactions and the system behavior. The main challenges that arise are the complexity of interactions,

Mol. BioSyst.

Molecular BioSystems

lack of knowledge of regulatory interaction, lack of a suitable modeling strategy and supporting mathematical tools that can represent the complexity. Thus models incorporating mechanistic details have remained confined to small sized systems. The main focus of this work is therefore to build a steady state model representing detailed gene–protein and protein– protein interactions present in the central metabolic pathway of Escherichia coli under anaerobic respiration. The role of mechanistic details in the performance of GRN helps in directly linking the prediction of log fold changes in mRNA levels to microarray experiments enabling prediction of gene expression in hitherto unknown mutants. To the best of our knowledge, this is the first attempt to utilize the steady state modeling methodology to quantify a GRN at a scale of 74 plus genes and approximately 1000 interactions including those with feedback. By using this model, we ask questions like – what is the role of molecular mechanisms in quantifying the expression levels in a GRN? Can we predict the microarray data relative to the wild type for mutants generated by deleting multiple transcriptional regulators? This study has demonstrated the relevance of mechanistic details in accurately predicting microarray experiments. The methodology offers a schema to simulate existing microarray expression data for various mutants that are already characterized. Further, the methodology was used to simulate microarray data for uncharacterized mutants including double, triple and quadruple mutants of key transcriptional regulators. The study demonstrates the relevance of molecular mechanisms in predicting quantitative information regarding expression levels in a GRN with known topology. Therefore, the model can be further used to predict the expression for a mutant lacking one or more genes that are a part of the GRN. We distill interesting insights regarding the evolutionary pathway of key transcriptional regulators under anaerobic respiration in E. coli, by predicting the log fold changes in the transcription factor mutants and hierarchically clustering them into phenotypically similar clusters. The extent of gene expression in various mutants lacking key transcriptional regulators provided insights related to the evolution of the mixed acid fermentation pathway from a predominantly lactic acid secreting pathway. We validate our insights with phylogenetic studies on repeated sequences of these key transcriptional regulators. Thus, the steady state model offers a platform to generate several testable hypotheses regarding the working of a GRN.

Results Steady state modeling of anaerobic respiration in E. coli The central metabolic pathway in E. coli is well characterized comprising of the following branches namely, (i) glycolysis, (ii) pentose and ED pathways, (iii) mixed acid fermentation, (iv) the TCA cycle and (v) the glyoxylate shunt. Fig. 1 depicts various interactions and mechanisms in the GRN for all enzymatic genes regulated by the key transcriptional regulators. The metabolic reactions, respective enzymes that catalyze these reactions, the genes responsible for the production of these

This journal is © The Royal Society of Chemistry 2014

View Article Online

Paper

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Molecular BioSystems

Fig. 1 The gene regulatory network for the various branches in the central metabolic pathway of E. coli – (A) glycolysis, (B) the pentose phosphate pathway, (C) mixed acid fermentation and (D) the TCA cycle and the glyoxylate shunt. The enzymes are represented by rectangles and the abbreviations are elaborated in Table S2 (see ESI†), the genes by ovals and multiple binding sites as circular projections and the transcriptional regulators as colored ovals with the beige ovals representing dimerization of transcriptional activators before binding to the DNA.

enzymes and the transcriptional regulators that regulate the genes are enlisted (see Table S1 in ESI†). The model includes information on the protein–protein and gene–protein interactions gathered from EcoCyc,27 RegulonDB28 and BiGG29 databases. The Regulon database is used to factor out the key transcriptional regulators that regulate all the genes in the network and 13 such regulators were identified (namely, fnr, arcA, dpiA, ihfAB which are active and gpsa, fruR, fis, ascg,

This journal is © The Royal Society of Chemistry 2014

nsrr, modE, rob, soxR, fadr which are inactive in the wild type under anaerobic conditions). In this study, we simulate various combination mutants of active master transcriptional regulators, FNR (F – Fumarate Nitrate Reduction – acts as a oxygen sensor), A – ArcA (Aerobic Respiration Control – represses a wide variety of aerobic enzymes under anaerobic conditions), I – IHF (Integration Host Factor – an enhancer to initiate the transcription) and D – DpiA (Destabilizer plasmid inheritance – essential

Mol. BioSyst.

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Paper

for expression of citrate specific fermentation genes) to infer interesting insights into gene expression of important pathways in these mutants. Culling information from the databases mentioned above, we have constructed a GRN with 74 genes and 1000 interactions. The GRN consists of 19 autoregulatory reactions (inclusive of auto-activation and repression), 5 dimerization, 150 dimers binding to DNA, 100 multiple site binding interactions and the remaining interactions with transcriptional factors binding as monomers. The dissociation constants for each of these interactions and the maximum protein concentration for the genes in the GRN were input as parameters. The model parameters were obtained from literature when available and were tuned by varying specific parameters to fit the experimentally reported values for already characterized mutants. The parameters involved in the SSGES simulator are (i) total gene concentration, (ii) total protein concentration, (iii) protein–DNA binding constants and (iv) protein–protein binding constants constituting 901 equilibrium binding constants, 74 genes and their respective protein concentrations. The gene and protein concentrations were fixed from literature. Out of 901 equilibrium binding constants, 819 were either obtained from literature or assumed. In order to match the fold change prediction to microarray experiments for 74 genes, only 72 constants were tuned to fit the data. We make use of the parameter values first to predict the absolute probability of expression for the wild type strain and tune the model to obtain a match with microarray experiments for individual single mutants Darca and Dfnr. Once the model parameters are tuned, we predict the expression value for Darca–fnr and validate the prediction with microarray experiment values. The steady state model (SSGES) was then used to quantify the GRN by obtaining the fractional gene expression

Molecular BioSystems

levels for the wild type (WT) and various unknown transcription factor mutants (see Methods and ESI† for details). Quantification of the wild type using the SSGES and validating the trend in gene expression data The steady state fractional expression of all the genes in the GRN directly involved in the production of enzymes in the central metabolic pathway under anaerobic conditions for the wild type (referred to as WT) is shown in Fig. 2. The absolute expression for all the genes for a specific strain is not available through experiments to validate. As expected, we observe that the genes for glycolytic enzymes are completely expressed and the genes for the enzymes required for the conversion of pyruvate to mixed acids are expressed with high fractional values varying from 75% to 100%. Conversion of pyruvate to formate under anaerobic conditions is known to be high as predicted by our model wherein the tdce and pflb genes required for the enzyme pyruvate formate lyase is expressed fully. The gene ppc responsible for the branch point enzyme of phosphoenolpyruvate carboxylase at the PEP node is also highly expressed (75%) which enables the formation of oxaloacetate leading to succinate accumulation. The gene fumB responsible for the conversion of fumarate to malate is also expressed highly (by 80%) to enable the succinate conversion. The only enzyme amongst the mixed acids to be highly suppressed is the lactate dehydrogenase (gene ldha) which is also experimentally observed and reported (transcriptional induction of ldha is decreased five-fold in the presence of arcA-B).30 The gene expression profile reflects the observed phenotypic behavior in the wild type wherein it is a predominantly mixed acid secreting bacteria under anaerobic conditions. The key feature of SSGES is the incorporation of mechanistic details into the prediction of gene expression. To determine the

Fig. 2 Fractional expression of all genes corresponding to the enzymes that catalyze the reactions in the central metabolic pathway in the model for the wild type strain of E. coli. (A) The predictions by SSGES indicate genes responsible for glycolysis, mixed acid fermentation and PEP to oxaloacetate are expressed whereas the PPP pathway, the TCA cycle, lactate dehydrogenase gene (ldha), Pyruvate to ACCOA towards TCA cycle genes are suppressed. Amongst transcriptional regulators, fnr and dpiA are highly expressed where as arcA, ihfA-B are expressed slightly. (B) Comparison of the unmodified (solid bars) and modified GRN (hatched bars) with no mechanistic details. Transcriptional regulator arca is overexpressed and dpia is underexpressed. Genes corresponding to mixed acids are underexpressed and TCA cycle are overexpressed due to non-implementation of mechanistic details.

Mol. BioSyst.

This journal is © The Royal Society of Chemistry 2014

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Molecular BioSystems

effect of the prevailing mechanisms on gene expression, the GRN was modified by representing all interactions as a monomer effector binding to a single site for all the genes. Fig. 2 also shows the fractional gene expression for the modified GRN lacking mechanistic details. It is clear from the plot that while the glycolytical pathway remains unaltered, several genes show expressions different from the unmodified GRN. Specifically, the genes involved in the TCA cycle are overexpressed while the mixed acid genes are repressed in the GRN that lacks mechanistic details. An interesting observation is that the fractional expression values for the transcriptional regulators arca and dpia also show a mismatch in the modified GRN. The discrepancies emerging due to the non-inclusion of mechanistic details in the expression of transcriptional regulators was eight fold overexpression in arca and three fold underexpression in dpia. The error in prediction of fractional expression levels in the transcription factors due to the exclusion of mechanistic details can cascade to other genes thereby compounding the amplitude of error for other down-stream genes in the GRN. Thus, the incorporation of molecular mechanisms is vital in quantifying the expression levels of genes in a GRN. Steady state analysis yields high correlation with microarray experiments Microarray experiments yielding the fold change gene expression values relative to WT for Dfnr, Darca and Darca–fnr mutants under anaerobic conditions are reported in literature.31 The individual single mutants Darca and Dfnr were used to obtain the parameter values, the double mutant Darca–fnr was used to validate the SSGES model. The predicted fractional protein expression from SSGES for the mutants was used to obtain fold change gene expression akin to microarray experiments and was compared with experimental microarray data.31 Fig. 3 shows the comparison of the microarray data for the three mutants. It is clear that SSGES could predict the fold change in expression with high correlation (Pearson Correlation Coefficient Z 80%). For several genes the match was quantitative, however, for some, there were discrepancies albeit a match in the trend. This can be ascribed to a lack of precise knowledge of the basal expression of genes in the wild type. Fig. 3A–C indicate that the glycolytic enzymes including phosphoenolpyruvate carboxylase (ppc) are not affected by the deletion of the two transcription regulators. Further the genes responsible for the conversion of pyruvate to acetyl coenzymeA leading to the TCA cycle are all leaky with a suppression of genes corresponding to mixed acids. The overexpression of the ldha gene indicates the role of transcription factors in lactate formation. We computed the Pearson Correlation Coefficient (PCC) for all 74 genes in the GRN between SSGES prediction and experimental microarray data and they were 80%, 90% and 80% for Dfnr, Darca and Darca–fnr mutants respectively. This indicates a high degree of qualitative as well as quantitative match. We also compared the sum of log fold changes of Darca and Dfnr with that of Darca–fnr to see the effect of individual deletion mutants in the combined mutant. We found 81% linear correlation between the sum of fold values of Dfnr and

This journal is © The Royal Society of Chemistry 2014

Paper

Darca mutants to that of the Darca–fnr double mutant indicating that the individual deletions had an additive effect in the double mutant for 81% of the genes. For experimental microarray data, the correlation was even higher, 92%. This is an interesting testable hypothesis whether the additive property in the genotype actually reflects in the phenotype by measuring the yield of metabolites regulated by these genes in the mutants. Fig. 3D–F depict the comparison between the Spearman Correlation Coefficient (SRCC) rank of log fold change values as predicted by the SSGES model and microarray data observations. For Dfnr and Darca–fnr mutants SRCC was 0.98 and for the Darca mutant SRCC was 0.8 with some of the genes being outliers. The outlier genes were ldha, gpmm, pta, acka, pyka, fumb, frda and frdc. It is not clear whether this discrepancy is due to incomplete structural representation of the GRN or error in the experimental data. However the rank correlation for all genes in Dfnr and Darca–fnr mutants was very high (98%). This indicates that the model as well as experimental data exhibited a high degree of monotonicity. SSGES was next used to predict the behavior of the GRN for the case when mechanistic details were lacking as in the modified network. The fold change expression for the three mutants Darca, Dfnr and Darca–fnr in the modified GRN deviated significantly from experimental data (see Fig. 4). In Fig. 4A, we notice that the linear correlation of the modified GRN with respect to microarray experiments for the Darca mutant is reduced to 77% when compared to the original GRN. Similarly, the linear correlation reduced to 65% and 72% for Dfnr and Darca–fnr respectively. The glycolytic genes did not show any significant change since these genes are regulated by monomer binding. However, the genes for mixed acid, the TCA cycle and pentose phosphate pathways are regulated through several mechanisms and show a significant deviation from the microarray data. Fig. 4D–F depict the monotonic correlation between the modified GRN and microarray experiments. We observe from the plots that the number of outliers have significantly increased with a reduced quality of fit with low R2 values of 0.63, 0.50 and 0.54 for mutants Darca, Dfnr and Darca–fnr respectively. The variance with respect to the microarray data for the modified GRN was also determined (see Fig. 5). In the modified GRN, Darca and Dfnr showed a maximum variance of 12% and 28% respectively while the unmodified GRN depicted a maximum variance of only 5%. This further reiterates the fact that mechanistic details are necessary for accurate quantification of GRNs. We also observe that Darca and Dfnr showed a higher variance mainly in the TCA cycle genes with a smaller variance in the mixed acid genes. The double mutant Darca–fnr demonstrated a high variance in both the mixed acid as well as TCA cycle genes (results not shown). SSGES prediction of transcription factor mutants helps in understanding the evolution of transcriptional control Experimental microarray data under anaerobic conditions are available only for three mutants, Darca, Dfnr and Darca–fnr. However the steady state modeling strategy described here can be used to predict the expression profiles for other mutants

Mol. BioSyst.

View Article Online

Molecular BioSystems

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Paper

Fig. 3 Comparison of experimental31 (solid line) and predicted (dotted line) log fold values with respect to the wild type for mutants under anaerobic conditions. (A) Darca exhibits a 90% linear correlation; (B) Dfnr exhibits a linear correlation of 80%; (C) Darca–fnr exhibits a linear correlation of 80%. (D) Darca exhibits a monotonic correlation of 0.8 with some outliers; (E) Dfnr shows a monotonic correlation of 0.98; (F) Darca–fnr shows a monotonic correlation of 0.98.

lacking other combinations of transcription factors relevant for anaerobic respiration of glucose. We used SSGES to predict the microarray data representing the fold change value of the gene expression relative to the wild type for all 16 combinations of the key transcriptional factors (namely, fnr, arca, dpia, and ihfa-b). Fig. 6A depicts the predicted fold change for the sixteen possible mutants including single, double, triple and quadruple

Mol. BioSyst.

mutants as a heat map. It is clear that the genes for the glycolytic enzymes are robust and are not affected due to the knock out of the various transcription factors. The genes for pyruvate dehydrogenase and the TCA cycle enzymes are overexpressed in most mutants. However the changes in genes for the enzymes involved in the synthesis of mixed acid fermentation are specific to the mutations thus indicating that they are highly

This journal is © The Royal Society of Chemistry 2014

View Article Online

Paper

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Molecular BioSystems

Fig. 4 Comparison of experimental31 (solid line) and predicted log fold values for modified GRN lacking mechanistic details (dotted line) with respect to the wild type for mutants under anaerobic conditions. We observe that genes for mixed acids, the TCA cycle and pentose phosphate pathways deviated significantly from experimental data. (A) Darca exhibits 77% linear correlation; (B) Dfnr exhibits a linear correlation of 65%; (C) Darca–fnr exhibits a linear correlation of 74%. (D) Darca exhibits a reduced quality of fit R2 error value of 0.63 with several outliers; (E) Dfnr shows a poor fit with a low R2 value of 0.5; (F) Darca–fnr also shows a low quality fit with several outliers with a low R2 error of 0.54.

dependent on transcription factor regulation. Most of the mutants involving arca gene knock out exhibit increased expression of gene ldha responsible for enzyme lactate dehydrogenase. Also the quadruple mutant, FAID, without any key transcription factors demonstrated increased expression of ldha with a lowered expression of genes required for acetate synthesis. Thus, the steady state analysis yielded quantitative

This journal is © The Royal Society of Chemistry 2014

information regarding the effect of regulation through the various transcription factors. Fig. 6B shows a similar log fold change for the various mutants with respect to WT for the modified GRN which does not include mechanistic details. It is clear that the heat map is distinctly different from that shown in Fig. 6A. It is also clear from the panel for WT in Fig. 6B that the adhe (ethanol formation)

Mol. BioSyst.

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Paper

Molecular BioSystems

Fig. 5 Comparison of log fold changes of mutants with respect to microarray experiments for modified (hatched bars) and unmodified (solid bars) GRN. (A) The Darca GRN lacking mechanistic details depicts high variance for mixed acids genes. (B) The Dfnr modified GRN depicts high variance for TCA cycle genes.

Fig. 6 Microarray predictions for all mutants considering transcriptional activators FNR (F), ArcA (A), DpiA (D) and IHF (I) in the central metabolic pathway of E. coli under anaerobic conditions. Note that the nomenclature is set to indicate the lacking transcriptional activator. For example, F indicates the mutant lacking the fnr gene and FAID indicates the quadruple mutant lacking fnr, arca, ihfa-b and dpia genes. (A) The heat map of all predictions for the unmodified GRN. Changes in gene expression for the mixed acid pathway are mutant specific indicating that enzyme synthesis is transcription factor controlled. (B) The heat map of all predictions for modified GRN. There is down regulation of mixed acids in the WT whereas it is known that WT produces mixed acids under anaerobic conditions showing that molecular mechanisms are key for accurate prediction of the phenotype.

and pflb (formate synthesis) and frda-d family of genes (succinate production) are all down-regulated two to four fold and TCA cycle genes are up-regulated by four fold. Thus, a completely different expression profile for WT is observed when mechanistic details are neglected from the analysis. Further, in the quadruple mutant,

Mol. BioSyst.

the modified GRN predicts that the enzymes for acetate and lactate synthesis are not perturbed while the other mixed acids such as succinate, formate and ethanol are all down-regulated. This further strengthens the fact that molecular mechanisms do play an important role in governing expression profiles.

This journal is © The Royal Society of Chemistry 2014

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Molecular BioSystems

Paper

Fig. 7 Hierarchical clustering of log fold change of gene expression for all deletion mutants of key transcriptional regulators of the central metabolic pathway under anaerobic conditions for E. coli. (A). Unmodified GRN – I = {WT}, II = {A,D,AD}, III = {AID,AI,I}, IV = {AF,FAD,FD,F}, V = {FI,FAID,FD}, VI = {DI}, VII = {FAI}. FAID mutant is far from WT. (B). Modified GRN – I = {AF,AD}, II = {I,DI,AI}, III = {AID,FAI,F}, IV = {FAD,FD}, V = {WT,D,A}, VI = {FAID,FID}, VII = {FI}. FAID and WT mutants are near to each other.

The above fold change values for all the sixteen mutants were clustered using the hierarchical clustering algorithm based on the Jaccardian distance, which is the most commonly used amongst all other distance measures.32–34 The analysis yielded seven clusters with the ordering of closeness in expression profiles of various genes (see Fig. 7). It was observed that the quadruple mutant (FAID) was placed further away from WT, with single, double and triple mutants distributed in between. It was observed that the triple mutant Dfnr–arca–ihfab (FAI), has the maximum perturbation compared to the wild type. Based on this clustering, a path from FAID to WT linking through the triple, double and single mutants was hypothesized (see Fig. 8A). The solid lines indicate the viable paths based on the ordering of the expression profiles generated by clustering. The analysis indicated the following possible paths for the evolution of transcription factor regulation depicted as I/F - AD/AF A/D - WT. However if similar analysis is performed using the modified GRN, the ordering yields a completely different grouping with the WT close to the quadruple mutant FAID since the expression profiles for WT and FAID showed similarity (see Fig. 6B). Also, the double and triple mutants are ordered higher in the grouping instead of the WT. In order to independently validate the above predicted path, we performed a phylogenetic ordering of the transcription factors and the ldha gene involved in anaerobic respiration in E. coli (see Fig. 8B). The analysis indicated an ordering of I - F - D - A - WT which is one of the paths hypothesized

This journal is © The Royal Society of Chemistry 2014

by the clustering of the expression profile as predicted by SSGES. This indicates that the degree of similarity in the gene expression can help in understanding the evolution of transcriptional control. We further look at the expression profiles of various genes in mixed acid metabolism for the mutants in the evolutionary path as predicted above (see Fig. 9). It is evident that through evolution the expression of genes for lactate dehydrogenase, pyruvate dehydrogenase and TCA cycle enzymes are suppressed with an increase in the mixed acid enzymes by about eight fold. However the enzymes for genes fumb and ppc show increased expression by 1.6 and 1.2 fold respectively to enable succinate production (through fumarate to malate conversion). Thus the hypothesized ordering of transcriptional activators from a quadruple mutant FAID to the WT suggests an evolutionary shift from a primarily lactate producing strain to a mixed acid synthesizing strain.

Discussion The gene regulatory network consists of several mechanisms that play a pivotal role in the expression profile of genes. Boolean and Bayesian approaches are used to typically model GRNs and correlate with microarray experiments. They do not make use of mechanistic details, while the ODE and Stochastic type of models require kinetic parameters and are used for small subnetworks of a GRN. SSGES, the method presented in

Mol. BioSyst.

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Paper

Molecular BioSystems

Fig. 8 Prediction and phylogenetic validation of the evolutionary path of E. coli from a quadruple mutant FAID to the wild type WT. (A). Evolutionary ordering FAID - FAD/AID - A/D - WT and (B). Phylogenetic validation of evolution of transcriptional regulators in the taxonomy of E. coli.

Fig. 9 The heat map of mixed acid and TCA cycle genes for the mutants in the evolutionary path. Lactate (ldha) is repressed and mixed acid genes are expressed along the evolutionary path indicating that specific transcriptional regulators evolved to suppress the production of lactate and favoring the production of formate, ethanol and acetate.

this manuscript, is an attempt at quantifying a large GRN incorporating all known mechanistic details. This novel methodology helps in predicting microarray type experiments for various mutants. The steady state analysis of the GRN under anaerobic condition in E. coli has demonstrated the significance of the prevailing mechanisms in predicting the expression profile of genes. The relevance of including the mechanistic details was

Mol. BioSyst.

confirmed by validating the model using the experimental microarray data of fold change expression values of transcription factor mutants. Once validated, the model was used to predict the microarray fold change expression data for deletion mutants including the quadruple mutant FAID, where all the four key transcription factors ( fnr, arca, ihfa-b and dpia) were absent. Non-inclusion of mechanistic details reduced the quantitative correctness of the prediction. This indicates that

This journal is © The Royal Society of Chemistry 2014

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Molecular BioSystems

molecular mechanisms play a vital role in accurately predicting quantitative profiles in a GRN. The Boolean analysis of a similar metabolic network had predicted a higher arca expression in WT under anaerobic conditions,31 although experimental microarray data indicated a reduced expression in vitro. Our steady state model could predict the observation accurately by including the feedback between fnr and arca and the stoichiometry of the interaction. Phenotypic characterization of Darca, Dfnr and Darca–fnr has been reported in the literature by evaluating fluxes in the metabolic network of E. coli under micro-aerobic conditions.35 These studies report a lower flux through PPP, PDH (pyruvate dehydrogenase responsible for pyruvate to acetyl coenzymeA conversion) and TCA pathways in all the mutants. The exchange coefficients of the Ppc flux were negligible in all the experimental strains. While the SSGES predictions match that of the PP pathway and PPc flux, the gene expression for PDH and TCA cycles were higher than the wild type strain in the mutants as observed by Bennett et al.36 in spite of the low flux in these pathways. Further a higher lactate flux was observed in the Darca–fnr mutant with lowering of flux in the mixed acids especially in the formation of formate and acetate (PFL and PTa-Acka pathways). The above observation corroborates with our prediction based on fold change expression values. This indicates that the gene expression profile can be directly linked to the phenotype of the organism. Thus, using our methodology, the prediction of the fold change protein expression value for the mutants where microarray experimental data are lacking can be used to predict phenotypic behavior. Therefore, we hypothesize based on the predicted gene expression profile that in a FAID mutant strain, the flux towards lactate would be considerably higher than the WT with a significantly reduced flux in other acids. The prediction of fold expression for all possible mutants involving the key transcription regulators (sixteen mutants in all) were hierarchically clustered. The analysis demonstrated the possible evolutionary path that clearly indicates a shift from a predominantly lactic acid secreting E. coli to the mixed acid producing organism with a suppression of PDH, TCA and glyoxylate shunt. It would be interesting to check the effect of this shift on the growth rates of these mutants and further if the growth rate increases along the evolutionary path as predicted by the analysis. The evolutionary path predicted by the fold expression matched the phylogenetic ordering of the sequences of ldha and the key transcriptional regulators. The GRN with non-inclusion of mechanistic details provided an evolutionary path which neither matched the phylogenetic study nor the pattern capturing the shift in the expression profile. As a matter of fact, it indicated that the WT strain was one of the least evolved with the quadruple mutant showing higher up in the evolutionary order. This was mainly due to a substantial lowering of expression of genes for mixed acid synthesizing enzymes and an increase in expression of the genes corresponding to the TCA cycle enzymes resulting in a prediction of a completely different expression profile relative to the WT. This further endorses the prediction of the evolution

This journal is © The Royal Society of Chemistry 2014

Paper

of transcription factors demonstrating the power of steady state analysis including molecular mechanisms in the study of genotype–phenotype relationships. The steady state analysis mainly requires two types of parameters, namely, the dissociation constant (Kd) characterizing DNA–protein and protein–protein interactions and the maximum protein concentration (Pmax). The perturbation indicated that the fold change expression values are more sensitive to Kd than Pmax (see ESI†). This implied that the values of Kd for various interactions in the GRN need to be ascertained for an accurate prediction. This study also indicated that the expression for glycolytic genes are robust to parametric perturbation in the transcriptional factors while the mixed acid genes are found to be more sensitive. This was also reflected in the mutant studies, wherein due to structural perturbations, the expression levels for glycolytic pathway were robust but the expression levels for the mixed acid pathway were sensitive. In our study, we have focused mainly on anaerobic respiration of E. coli and currently microarray experiments have been reported under various conditions such as aerobic to anaerobic shift,37 nitrate conditions (for WT and arcA mutants)38 and under varying oxygen levels.39 Our methodology can be used to predict and rationalize the expression profiles obtained in these experiments. Specific to anaerobic growth of E. coli, the focus of this study, SSGES has been used to obtain several biological insights. Using the in silico expression levels for all the combinations of mutants of the transcription factors ( fnr, arca, ihfa-b and dpia), we have predicted that the genes involved in mixed acid synthesis are predominantly controlled by them and will therefore influence the phenotype of the organism (such as growth). Further, we have hypothesized that the absence of these TFs would make E. coli a predominantly lactic acid secreting organism. By taking account of molecular mechanistic details, SSGES improved its accuracy to predict gene expression changes upon gene mutation. The study can therefore help in predicting the effect of gene mutant on a phenotype and thereby understanding the evolution of transcriptional control.

Conclusion The ability to predict the phenotype of an organism is enhanced if we are able to accurately predict the gene expression responsible for the realization of the particular phenotype. In that regard, the steady state modeling technique has enabled us to mimic important structural features in a GRN thereby improving the accuracy of prediction. This study has also helped us to hypothesize a possible evolutionary path of transcriptional control by comparing the log fold changes of gene expression for crucial enzymes in the central metabolic pathway under anaerobic conditions. This analysis has generated new testable hypotheses such as, how does expression or suppression of genes as predicted by the model affect phenotypes such as growth rate and secretion rate? Does deletion of more than one transcriptional regulators have an additive effect on the phenotype? How does removal of only feedback from a transcriptional regulator affect the

Mol. BioSyst.

View Article Online

Paper

Molecular BioSystems

gene expression of all the genes regulated by it? This leads to the design of a whole new set of experiments to construct these mutants and observe their behavior and assess the importance of transcriptional control over them. Thus by taking account of molecular mechanistic details, SSGES improved its accuracy to predict gene expression changes upon gene mutation and can help in understanding the evolution of transcriptional control.

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Methods The steady state analysis of GRN to predict expression profiles requires the interaction map of various genes present in the GRN. Given the interaction map, the mechanisms prevalent in a specific interaction need to be known. Further, the mechanisms have to be quantified assuming equilibrium interactions with known total protein concentrations. The main regulatory mechanisms in GRN include (i) protein–DNA binding—a protein can bind to the operator site of a gene for either activation or repression of the gene, (ii) protein–protein interactions—a protein can dimerize or even tetramerize before binding to the operator site representing the effect of stoichiometry, (iii) multiple operator site binding—the transcriptional activator can bind at multiple sites representing the effect of stoichiometry, (iv) cooperativity—wherein the binding of the first site changes the binding constant for the second site as has been illustrated in the l-phage system40 and (v) auto-regulation of TRs—a TR can activate or repress itself. This requires equilibrium constants for all interactions and total concentrations of effector proteins. The steady state modeling methodology to simulate the fold change in transcription as reported in ref. 20 was applied to model the central metabolic pathway of E. coli under anaerobic conditions. A short overview is detailed below. Consider the following regulatory interaction. Transcription regulator A forms a dimer. The dimer interacts with an operator site D and forms the complex DA2 which leads to gene transcription. Kp and Kd represent the respective equilibrium interaction constants for these interactions. A + A " A2, Kp

(1)

D + A2 " DA2, Kd

(2)

Since the state DA2 leads to gene transcription, the fractional probability of gene expression can be represented as follows, f ¼

DA2 Dt

(3)

where Dt is the total concentration of the operator site in the cell. The molar balances for the component species for the above interaction are as follows: Dt = D + DA2

(4)

At = A + 2A2 + 2DA2

(5)

The above equations in conjunction with the equilibrium relationships (as shown in eqn (1) and (2)) will yield the fractional probability of gene expression f (shown in eqn (3)).

Mol. BioSyst.

The following parameters, namely, the dissociation constant (Kp and Kd) as well as the total concentration terms Dt and At are required to obtain the solution. Further, the steady state translational expression fp is related to the transcription expression f through the co-response coefficient n as given below.41 fp = f n

(6)

The value of n is close to 1 for prokaryotic systems indicating a linear relationship between fp and f. However, eukaryotic systems have a value of n o 142 indicating translational inefficiency.§ The protein concentration P of the regulated gene under a given TR concentration can be estimated by evaluating fp as given below. P = fp  Pmax

(7)

where Pmax is the maximum protein concentration in the cell. We assume that the overall effective transcription probability is the product of the individual transcription probabilities in case a specific operator has multiple binding sites and is regulated by different regulators. In this case, the TRs are assumed to act independently as given below, f = f1  f2      fn

(8)

where n is the number of TRs for a specific operator site and f1, f2,. . .fn denote the independent transcription probabilities. The SSGES simulator20 can be used with the information regarding the network structure and parameters to generate the transcription probability f of various genes for a given set of TR concentrations. Further, the transcription probability can be used to determine the log fold expression ratio by comparing the probability values for the condition of interest and a reference condition. If fir denotes the transcription probability for ith gene under a reference condition r and fic denotes the transcription probability of the same gene under a condition of interest c, then the log fold transcription ratio (LFT) for gene i is given by, LFTi ¼ log10

fic fir

(9)

The details of SSGES modeling of various structural features for the central metabolic pathway of E. coli under anaerobic conditions is given in the ESI† (see Table S3). The input to SSGES comprised all gene–TF and protein–protein interactions for the entire GRN along with the parameters equilibrium constants and maximum protein concentrations. The steady state analysis of 74 genes in the GRN yielded 148 sets of total molar balances and 901 sets of equilibrium relationships which need to be solved simultaneously (see files module1.m and module1_fun.m generated by SSGES based on the input GRN given in file network.txt). The initial values of the parameters were set as follows—(i) the § We also studied the sensitivity of the co-response coefficient on the predicted fold value of gene expression by varying n uniformly randomly in the ranges (0.5–2 and 0.5–5). For 1000 iterations, we computed the mean and standard deviation of correlation coefficients and plotted them (see Fig. S4 in ESI†). We also computed the relative change in correlation when n = 1 (our case). The results indicate that the change is less than 10% (trends are preserved) for the (0.5–2) range and less than 13.3% for the (0.5–5) range again maintaining the trends.

This journal is © The Royal Society of Chemistry 2014

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Molecular BioSystems

DNA–protein interaction constants Kd for all the sites were set to a typical value of 109 M as observed in the E. coli system;22 (ii) protein–protein equilibrium interaction constants Kp were set to 108 M for all the protein–protein interactions pairs as in the E. coli system,22 (iii) each of the operator site concentrations was set to 3.29  1011 M considering one molecule per cell22 and (iv) the maximum protein concentration Pmax values for the genes were obtained from literature wherever available and for those that were unknown it was set of 106 M. Under these conditions, the model provides the probability of gene expression as the output for a given total concentration of each TF. The simulator sets up steady state algebraic equations for the given input file containing the network structure and the parameter values. Molar balances are set up for all the species in the network including all the complexes arising from DNA–protein and protein–protein interactions. Further, all the concentrations of the complexes are related through equilibrium dissociation constants to the free species concentrations. The set of algebraic equations generated by SSGES are solved on the MATLAB platform using lsqnonlin to generate initial guesses and fsolve to rapidly converge to a final solution. The transcriptional and translational expressions and protein levels in the GRN forms the simulator output. For each of the mutants Dfnr, Darca and Darca–fnr, we generated a GRN by making the operator concentration of the corresponding gene to be zero. We vary Kd and Kp to obtain a high correlation match with actual microarray experiments for these mutants with respect to the wild type under anaerobic conditions.31 Out of 901 equilibrium binding constants, only 72 constants were tuned so as to match the fold change predictions to microarray experiments for the 74 genes while the others were assumed or fixed from literature. We make use of the parameter values to predict the absolute probability of expression for the wild type strain. Further, we make use of the relative change in the expression levels for a mutant relative to the wild type to obtain fold change expression akin to microarray data. Similar predictions were obtained for several mutants including the double mutant Darca–fnr where we could validate our prediction by matching it to experimental data. Thus, while individual single mutants Darca and Dfnr were used to obtain the parameter values, the double mutant Darca–fnr was used to validate the SSGES model. The GRN for the wild type obtained after tuning is given in ESI† (see file network.txt). Perturbation analysis was also performed by varying the Kd and Pmax values over a range for the key transcriptional regulators and the root mean square error (RMSE) between the tuned and the perturbed values for gene expression was computed for WT and RMSE between the perturbed and microarray experimental values for log fold changes with respect to the wild type was computed for the transcription factor mutants Dfnr, Darca and Darca–fnr (see ESI† for details). Comparison metrics for validation with microarray experiments The PCC (Pearson Correlation Coefficient) indicates the similarity in the model and experimental profiles. A PCC score of 1 indicates complete correlation, a score of 0 indicates no correlation whereas a score of 1 indicates anticorrelation. We find PCC correlation between the SSGES predicted fold

This journal is © The Royal Society of Chemistry 2014

Paper

change values and the reported microarray experiments. If X represents the log fold change values as predicted by SSGES and Y is the corresponding values reported by microarray experiments, then the Pearson Correlation Coefficient (rX,Y) is given by rX;Y ¼

CovðX; YÞ sX ; sY

(10)

where Cov(X,Y) is the covariance of X and Y and sX and sY are standard deviations of X and Y respectively. The SRCC (Spearman Rank Correlation Coefficient) is another metric that indicates correlation between any two data sets. It is a non-parametric measure of statistical dependence between two variables. It is defined as the PCC between the ranked variables. While PCC assumes linear variation, SRCC assumes monotonic relationship between the datasets. We measure the SRCC for those genes that directly regulate the reactions. For a sample of size n, if the n raw values Xi,Yi are converted to ranks xi,yi, then the Spearman Coefficient rS is given by P ðxi  xÞðyi  yÞ i (11) rS ¼ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi P ðxi  xÞ2 ðyi  yÞ2 i

where x% and y% are means of ranks. Computing variance in log fold changes If the log fold change for a particular gene and a specific mutant as predicted by the given condition (with or without structural features) is represented as LFCond and if the corresponding experimentally observed log fold change is depicted as LFExpt then the variance is given by, variance = (LFCond  LFExpt)2

(12)

Phylogenetic tree generation In order to generate the phylogenetic tree, we used the UNIPROT43 database to generate the sequences for proteins corresponding to genes ldha, fnr, ihfA-B (himA-D), arcA and dpiA for Escherichia coli. Further, we ran multiple sequence alignment (MSA), pairwise protein distance matrix (protDist) computation and NeighborJoining44,45 algorithms from the clustalw46 package to determine the phylogenetic tree.

Acknowledgements The authors would like to thank the Department of Science and Technology grant number SR/WOS-A/ET-73/2011 for funding this work. We also thank Dilip Durai for his help in generating the phylogenetic tree.

References 1 L. Chen, R. Wang, T. J. Kobayashi and K. Aihara, Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys., 2004, 1–13. 2 M. Levine and E. H. Davidson, Proc. Natl. Acad. Sci. U. S. A., 2005, 102, 4936–4942.

Mol. BioSyst.

View Article Online

Published on 25 November 2013. Downloaded by Aston University on 12/01/2014 00:12:26.

Paper

3 P. B. Madhamshettiwar, S. R. Maetschke, M. J. Davis and A. R. M. A. Ragan, Genome Med., 2012, 4, 4–41. 4 V. A. Rhodius and R. A. LaRossa, Curr. Opin. Microbiol., 2003, 114–119. 5 W. Chang, C. Li and B. Chen, BMC Bioinf., 2004, 44–62. 6 X. Li, S. Rao, W. Jiang, C. Li, Y. Xiao, Z. Guo, Q. Zhang, L. Wang, L. Du, J. Li, L. Li, T. Zhang and Q. K. Wang, BMC Bioinf., 2006, 26–46. 7 J. Courcelle, A. Khodursky, B. Peter, P. O. Brown and P. C. Hanawalt, Genetics, 2001, 41–64. 8 M. B. Eisen, P. T. Spellman, P. O. Brown and D. Botstein, Proc. Natl. Acad. Sci. U. S. A., 1998, 14863–14868. 9 O. Alter, P. O. Brown and D. Botstein, Proc. Natl. Acad. Sci. U. S. A., 2000, 8944–8949. 10 H. de Jong, J. Comput. Biol., 2002, 67–103. 11 X. Xu, L. Wang and D. Ding, FEBS Lett., 2004, 297–304. 12 H. Yu, M. N. Luscombe, J. Qian and M. Gertein, Trends Genet., 2003, 422–427. 13 T. Akutsu, S. Miyano and S. Kuhara, Pac. Symp. Biocomput., 1999, 17–28. 14 S. Imoto, T. Goto and S. Miyano, Pac. Symp. Biocomput., 2002, 175–186. 15 D. J. Michaud, A. G. Marsh and P. S. Dhurjati, Bioinformatics, 2003, 1140–1146. 16 T. Chen, L. H. Hongyu and G. M. Church, Pac. Symp. Biocomput., 1999, 29–40. 17 J. Cao and H. Zhao, Bioinformatics, 2008, 24, 1619–1624. 18 I. Shmulevich and J. D. Aitchison, Methods Enzymol., 2009, 467, 335–356. 19 G. Karlebach and R. Shamir, Nat. Rev. Mol. Cell Biol., 2008, 9, 770–780. 20 S. Rawool and K. V. Venkatesh, BioSystems, 2007, 636–655. 21 J. Hasty, D. McMillen, F. Isaacs and J. J. Collins, Nat. Rev. Genet., 2001, 2, 268–279. 22 S. Bhartiya, S. Rawool and K. V. Venkatesh, Eur. J. Biochem., 2003, 270, 2644–2651. 23 F. Jacob and J. Monod, J. Mol. Biol., 1961, 3, 318–356. 24 M. Verma, P. J. Bhat and K. V. Venkatesh, J. Biochem., 2003, 48764–48769. 25 S. Ostergaard, L. Olsson, M. Johnston and J. Nielsen, Nat. Biotechnol., 2000, 1283–1286. ´rrez-Rı´os, D. A. Rosenblueth, J. A. Loza, A. M. 26 R. M. Gutie Huerta, J. D. Glasner, F. R. Blattner and J. Collado-Vides, Genome Res., 2003, 13, 2435–2443. 27 I. M. Keseler, A. Mackie, M. Peralta-Gil, A. Santos-Zavaleta, S. Gama-Castro, C. Bonavides-Martinez, C. Fulcher, A. M. Huerta, A. Kothari, M. Krummenacker, M. Latendresse,

Mol. BioSyst.

Molecular BioSystems

28

29 30 31 32 33

34 35 36 37 38 39

40 41 42 43 44 45 46

˜iz-Rascado, Q. Ong, S. Paley, I. Schroder, A. Shearer, L. Mun P. Subhraveti, M. Travers, D. Weerasinghe, V. Weiss, J. Collado-Vides, R. P. Gunsalus, I. Paulsen and P. D. Karp, Nucleic Acids Res., 2013, 41, D605–D612. S. Gama-Castro, H. Salgado, M. Peralta-Gil, A. Santos-Zavaleta, L. Muiz-Rascado, H. Solano-Lira, V. Jimenez-Jacinto, V. Weiss, ´pez-Fuentes, L. Porro ´n-Sotelo, J. S. Garcı´a-Sotelo, A. Lo ´ndez, A. Medina-Rivera, I. Martı´nez-Flores, S. Alquicira-Herna ´ndez, R. Martı´nez-Adame, C. BonavidesK. Alquicira-Herna Martı´nez, J. Miranda-Rı´os, A. M. Huerta, A. Mendoza-Vargas, L. Collado-Torres, B. Taboada, L. Vega-Alvarado, M. Olvera, L. Olvera, R. Grande, E. Morett and J. Collado-Vides, Nucleic Acids Res., 2011, 39, D98–D105. J. Schellenberger, J. Park, T. Conrad and B. Ø. Palsson, BMC Bioinf., 2010, 11, 1–10. G. Jianga, S. Nikolova1 and D. P. Clark, Microbiology, 2001, 147, 2437–2446. M. W. Covert, E. M. Knight, J. L. Reed, M. J. Herrgard and B. O. Palsson, Nature, 2004, 429, 92–96. C. Forst, C. FLamm, I. L. Hofacker and P. F. Stradler, BMC Bioinf., 2006, 7, 67–78. L. Liao, S. Kim and J. Tomb, Proc. of the 6th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, 2002, pp. 469–476. H. Ma and A. Zeng, Mol. Phylogenet. Evol., 2004, 31, 204–213. J. Zhu, S. Shalel-Levanon, G. Bennett and K. Y. San, Metab. Eng., 2006, 8, 619–627. S. Shalel-Levanon, K. San and G. Bennett, Biotechnol. Bioeng., 2005, 92, 147–159. ´zsi, Z. N. Oltvai and J. Ernst, Q. K. Beg, K. A. Kay, G. Bala Z. Bar-Joseph, PLoS Comput. Biol., 2008, 4, e1000044. Y. Toya, K. Nakahigashi, M. Tomitaab and K. Shimizu, Mol. BioSyst., 2012, 8, 2593–2604. M. Rolfe, A. T. Beek, A. I. Graham, E. W. Trotter, H. M. Asif, G. Sanguinetti, J. T. de Mattos, R. K. Poole and J. Green, J. Biochem., 2011, 286, 10147–10154. G. Ackers, A. Johnson and M. Shea, Proc. Natl. Acad. Sci. U. S. A., 1982, 1129–1133. M. Verma, P. J. Bhat, S. Bhartiya and K. V. Venkatesh, Eur. J. Biochem., 2004, 4064–4074. D. A. Fell, Trends Genet., 2001, 17, 680–683. UniProt, Nucleic Acids Res., 2012, 40, 71–75. N. Saitou and M. Nei, Mol. Biol. Evol., 1987, 4, 406–425. O. Gascuel and M. Steel, Mol. Biol. Evol., 2006, 23, 1997–2000. J. Thompson, D. Higgins and T. J. Gibson, Nucleic Acids Res., 1994, 22, 4673–4680.

This journal is © The Royal Society of Chemistry 2014

Steady state analysis of the genetic regulatory network incorporating underlying molecular mechanisms for anaerobic metabolism in Escherichia coli.

A Gene Regulatory Network (GRN) represents complex connections between genes in a cell which interact with each other through their RNA and protein ex...
5MB Sizes 0 Downloads 0 Views