
DOI 10.1002/pmic.201300257

Proteomics 2014, 14, 338–352

REVIEW

Dynamic protein interaction network construction and applications

Jianxin Wang1, Xiaoqing Peng1, Wei Peng1 and Fang-Xiang Wu2

1 School of Information Science and Engineering, Central South University, Changsha, P. R. China
2 Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada

With more dynamic information available, researchers’ attention has recently shifted from the static properties to the dynamic properties of protein–protein interaction networks. To compensate for the limited ability of current technologies to detect dynamic protein–protein interactions, dynamic protein interaction networks (DPINs) can be constructed by integrating proteomic, genomic, and transcriptomic analyses. DPIN construction methods are classified into two groups according to the kind of dynamic information they extract from gene expression data. The dynamics of one kind of DPIN is reflected by changes in protein presence over time, while that of the other kind is reflected by differences in coexpression under different conditions. In this review, the applications of DPINs are discussed, including the analysis of protein complexes/functional modules and network organization, the detection of biomarkers for disease progression or prognosis, and network medicine. We also point out the challenges in DPIN construction and future directions for DPIN research at the end of this review.

Received: June 29, 2013 Revised: October 23, 2013 Accepted: November 27, 2013

Keywords: Bioinformatics / Biomarker / Dynamic protein interaction networks / Gene expression / Network medicine / Protein complex

1 Introduction

In response to a stimulus or a new condition, not only do the amounts and locations of proteins change [1], but protein–protein interactions (PPIs) change as well. For example, the interaction partners of FoxO3A change under different growth conditions [2], and PPIs change during the different steps of the RNA splicing process [3]. As shown in Fig. 1, a protein interacts with different proteins under

Correspondence: Dr. Jianxin Wang, School of Information Science and Engineering, Central South University, Changsha 410083, P. R. China
E-mail: [email protected]
Fax: 086-731-88830212

Abbreviations: DCPI, differently coexpressed protein interaction; DPIN, dynamic protein interaction networks; PCC, Pearson correlation coefficient; PIN, protein–protein interaction network; PPI, protein–protein interaction; SCC, Spearman correlation coefficient; SDEG, significantly differently expressed genes; TPPI, transient PPI; T2DM, type 2 diabetes mellitus; Y2H, yeast two-hybrid

 C 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

different conditions or at different time points. PPIs are of central importance in every biological process in a living cell. Many important molecular processes in the cell are carried out by large, sophisticated multimolecular machines [4], such as the anaphase-promoting complexes, the RNA splicing and polyadenylation machinery, and protein export and transport complexes, which are formed by a large number of interacting proteins. How the system in a cell behaves over time or under various conditions can be explained by the dynamics of the protein–protein interaction network (PIN). Depending on their lifetime, PPIs can be classified into permanent PPIs or transient PPIs (TPPIs) [5]. Permanent PPIs are usually very stable and irreversible, while TPPIs associate and dissociate quickly [5]. TPPIs provide a mechanism for the cell to respond quickly to extracellular stimuli. TPPIs are crucial for diverse biological processes, such as hormone-receptor binding, signal transduction, allostery of enzymes, inhibition of proteases, and correction of misfolded proteins by chaperones in the cell. TPPIs are mostly dynamic, since

Colour Online: See the article online to view Figs. 1 and 2 in colour.

www.proteomics-journal.com


Figure 1. Dynamics of protein–protein interactions.

the proteins of TPPIs can easily change interaction partners and the lifetime of these interactions is short. Posttranslational modifications (PTMs) also contribute to the dynamics of PINs: they can change the stability, conformation, and function of proteins, and they play crucial roles in regulating the diverse PPIs involved in essentially every cellular process. Some proteins are activated by proteolytic processing and conformational change, with cleavage at distinct sites in the protein. Additionally, proteins can be degraded by PTM-dependent proteolysis, for example, polyubiquitylation-mediated degradation. Alternatively, some PPIs, including a subset of functionally important TPPIs, require specific PTMs to recognize their binding partners and stabilize their conformation. For example, phosphorylation is important for the activity of SH2 domains and of the 14-3-3 protein family, which can bind only to phosphorylated domains [6, 7]. There have been some attempts to investigate PPIs mediated by stable PTMs, such as phosphorylation- and methylation-dependent PPIs [8]. Furthermore, reversible multisite PTMs are also involved in the assembly of protein complexes and behave as a switch that modulates the binding specificity of a protein so that it interacts with different partners in different processes, as in the p53–Mdm2–HAUSP complex [9]. Fig. 2 shows some PTM-dependent PPIs in which the PTMs serve as a binding recognition signal or modulator that allows a protein to interact with different or multiple partners. Based on the “just-in-time” mechanism of protein complex formation [10], Jensen et al. investigate the phosphorylation of static and dynamic proteins in protein complexes, and find that protein complexes involved in the cell cycle are similar for all eukaryotes, while they differ in

Figure 2. PTM-dependent PPIs.


PTMs and transcriptional control [11]. PTMs can occur at any step in the “life cycle” of a protein and are often transient, which provides a way to rapidly and dynamically regulate protein activities, the assembly–disassembly of macromolecular complexes, and the translocation of proteins between cellular compartments [12]. A large number of PPIs, including stable PPIs and TPPIs, have been identified by many technologies, such as the yeast two-hybrid assay (Y2H), MS experiments, and protein chip technology [13–16]. Since TPPIs occur instantaneously and are formed and broken easily, their identification poses technical challenges [14, 15, 17]. Some high-resolution analysis techniques [16, 18–24] that can be applied to detect TPPIs have limited throughput and strict constraints. Y2H is an indirect and the most popular way to identify TPPIs with high throughput, but it also has a high false-positive rate [13, 17, 25]. Different Y2H experiments show low overlap in the PPIs they detect [13], and it has been found that PPIs identified in a single experiment are transient in nature [26]. Consecutive Y2H screening can capture TPPIs that cannot be detected in a single Y2H experiment and can increase the overlap between different data sets [26, 27]. Some PPIs, including a subset of functionally important TPPIs, require specific PTMs to recognize their binding partners and stabilize their conformation. However, they are not easy to detect experimentally. A Y2H screen for the analysis of modification-specific binary PPIs has been developed [28]. Recently, a method has been proposed to identify PTM-dependent PPIs by combining a photo-cross-linking strategy with SILAC-based quantitative MS [29]. So far, PPIs have been studied from many perspectives, such as biochemistry, quantum chemistry, molecular dynamics, chemical biology, signal transduction, and metabolic or genetic/epigenetic networks [30].
Some researchers in the proteomics research field have initiated the Proteomics Specification in Time and Space consortium, which aims to investigate the human proteome in time and space by integrating existing technological platforms within MS, cryo-electron microscopy, and cell imaging and driving technology [31]. They propose a new “third-generation” proteomics strategy that provides a powerful and versatile set of assay systems for characterizing proteome dynamics and offers an indispensable tool for cell biology and molecular medicine. With multiple PPIs identified under different conditions or at different time points by different technologies, a PIN simply assembled from PPIs does not account for spatial and


J. Wang et al.

temporal aspects, and thus obviously does not reflect the actual situation in a cell. In a cell, a PIN is not just a stable and static assembly of proteins and their interactions; it also changes over time, across environments, and across different stages of the cell cycle [32]. Such a network is termed a dynamic PIN (DPIN). Constructing a DPIN in a cell, which involves the dynamics of proteins and PPIs, will improve our understanding of diseases and can provide the basis for new therapeutic approaches. Recent studies on the construction of DPINs have drawn much attention from researchers for clinical reasons. Disease progression is dynamic, involving differentially expressed genes and proteins during different periods. A DPIN can illustrate how the onset and progression of a disease are reflected in the form of differentially expressed genes and their associated PINs. The molecular signatures of a disease can then be identified against a time-averaged background. Ultimately, DPINs may allow the detection of a disease prior to the development of clinical symptoms, thus paving the way to prophylactic therapies. Additionally, drug development for complex diseases is shifting from targeting individual proteins or genes to systems-based attacks targeting dynamic network states. Recently, more and more researchers have focused on dynamic molecular networks as drug targets and have highlighted the importance of studying the dynamics of the system to be targeted, including the time dependency and the order of treatments. The age of “Network Medicine” has clearly begun [33]. To construct DPINs, many computational approaches have been proposed that integrate multiple types of quantitative data with advanced computational network modeling methods.
To date, the available PPIs are derived from studies of unmodified proteins. Gene expression data provide a way to investigate the dynamics of proteins on a large scale, while other data about protein dynamic properties are comparatively low throughput. The purpose of this review is to summarize the general approaches for constructing DPINs and to help readers keep up with recent and important developments in the field. The paper is organized as follows. First, two groups of methods for constructing DPINs, both based on gene expression, are introduced. After that, the state-of-the-art applications of DPINs are discussed, including identifying protein complexes and functional modules, mining biomarkers, and predicting drug targets in DPINs. With the development of technologies, more dynamic information about proteins and PPIs will become available, such as protein amount, protein localization, and PTMs. Challenges and directions for future research are discussed in light of the availability of these data.

2 Methods based on protein presence dynamics

Certain genes are expressed only at specific stages of the cell cycle or under certain conditions, and so are their translated


proteins. Thus, in a living cell, proteins and interactions appear and disappear in the PIN as protein presence changes. Consequently, a direct way to construct DPINs is based on protein presence dynamics. As shown in Fig. 3, the original PIN is constructed from the interactions identified at different time points or conditions in different experiments. The dynamic information on protein presence across time points or conditions can be derived from gene expression data. Mapping the proteins expressed at different time points or conditions to a static PIN generates a series of subnetworks that constitute a DPIN, and this type of DPIN provides a straightforward way to visualize the dynamic changes of proteins and interactions over time or across conditions. For each time point or condition, there is a subnetwork constituted by the interacting proteins that are expressed at that time point or condition. Different DPINs are constructed with different strategies for identifying the expressed proteins at each time point or condition, and the comparisons among different DPINs are shown in Table 1. de Lichtenberg et al. [10] first construct a DPIN over the yeast mitotic cell cycle with temporal information extracted from time series microarray experiments of the cell cycle [34, 35]. PPIs are sourced from Y2H screens [36, 37], complex pull-down screens [38, 39], and MIPS complexes [40]. To reduce the error rate, a topology-based confidence score is assigned to each PPI, and only high-confidence PPIs are selected. A further filter excludes PPIs between proteins with incompatible subcellular localizations. The remaining PPIs constitute a high-quality PIN. Then, based on the study in [41], a set of periodically expressed genes is derived from the time series gene expression data, and each gene is assigned to the time point at which its expression peaks in the cell cycle.
Therefore, proteins are classified into two groups: periodically expressed (dynamic) proteins and nonperiodically expressed (static) proteins. For each time point, the periodically expressed proteins assigned to that time point, all nonperiodically expressed proteins, and their interactions in the static PIN are retained. In the DPIN, each protein has only one color, with different colors representing different time points in the cell cycle, and for each dynamic protein, the appearance time is its time of peak expression. There are only 300 proteins in the DPIN, compared with nearly 5000 proteins in the yeast proteome. Obviously, the small high-quality PIN obtained by the error rate reduction and the strategy of identifying periodically expressed genes [41] both limit the scale of the DPIN. PPIs change with different growth or environmental conditions, and the dynamics of PPIs can provide insight into the cellular response. A DPIN of Escherichia coli, consisting of four conditional PINs, is constructed by Hegde et al. [42]. The gene expression experiments for the four conditions are obtained from the Stanford Microarray Database [43]. A chip in the Stanford Microarray Database contains 16 possible sectors. Considering that the environments of sectors within a chip differ, that the noise in gene expression is inversely proportional to the mean expression level [44], and that essential genes have lower


Figure 3. An example of a DPIN composed of time point subnetworks.

Table 1. DPIN construction methods based on protein presence dynamics

| Method | Gene expression data | Static PIN | The appearance time of proteins | Number of proteins in DPIN |
|---|---|---|---|---|
| de Lichtenberg et al. [10] | Yeast mitotic cell cycle [34, 35] | A filtered PIN with PPIs from Y2H screens [36, 37], complex pull-down screens [38, 39], and MIPS complexes [40] | For periodically expressed proteins, $T(p) = \{\, i \mid \max(\mathrm{Exp}(p, i)),\ i = 1, \ldots, t \,\}$, where $\mathrm{Exp}(p, i)$ is the gene expression value of gene/protein $p$ at time point $i$; for nonperiodically expressed proteins, $T(p) = \{\, i \mid i = 1, \ldots, t \,\}$ | Only 300 proteins in the whole DPIN |
| Hegde et al. [42] | Gene expression data of Escherichia coli under four conditions [52] | E. coli PIN with 3682 proteins and 78 048 interactions [46] | $T_{th}(s_{i,j}) = \frac{\sum_{p \in n_{s_{i,j}}} \mathrm{Exp}(p, i)}{\lvert n_{s_{i,j}} \rvert}$, where $s_{i,j}$ is a section $j$ in a chip under condition $i$, and $n_{s_{i,j}}$ is the gene set in section $j$; $T(p) = \{\, i \mid \mathrm{Exp}(p, i) \ge T_{th}(s_{i,j}),\ i = 1, \ldots, 4 \,\}$, where $p$ belongs to section $j$ in a chip under condition $i$ | Average in four conditional subnetworks: 1917 proteins and 33 746 interactions |
| Tang et al. [47] | GSE3431 (budding yeast metabolic cycle) [48] | Yeast PIN from DIP [53] (2010/10/10) with 4950 proteins and 21 788 interactions | $T_{th} = C$, where $C$ is a constant value; $T(p) = \{\, i \mid \mathrm{Exp}(p, i) \ge T_{th},\ i = 1, \ldots, t \,\}$ | Average in time point network of DPIN: 3520 proteins and 14 904 interactions |
| Wang et al. [50] | GSE3431 (budding yeast metabolic cycle) [48]; GSE4987 (the yeast cell cycle) [51] | Yeast PIN from DIP [53] (2012/02/28) with 5023 proteins and 22 570 interactions | $T_{th}(p) = \mu(p) + 3\sigma(p) \times \left( \frac{\sigma^2(p)}{1 + \sigma^2(p)} \right)$, where $\mu(p)$ and $\sigma(p)$ are the arithmetic mean and the SD of $p$’s expression values, respectively; $T(p) = \{\, i \mid \mathrm{Exp}(p, i) \ge T_{th}(p),\ i = 1, \ldots, t \,\}$ | GSE3431: average in time point subnetwork: 749 proteins and 1126 interactions. GSE4987: average in time point subnetwork: 1860 proteins and 5173 interactions |


noise in their expression [45], Hegde et al. [42] use the median signal intensity of spots within each sector as a cutoff. Thus, a gene is considered to be expressed if the net signal intensity of its spot is greater than or equal to the median signal intensity of spots within the sector. Each conditional PIN of the DPIN is constructed by mapping the proteins expressed under each condition onto an existing static PIN of E. coli [46]. About half of the proteins in the E. coli proteome appear in each conditional PIN of the DPIN. The DPIN is used to study the dynamic changes of PPIs in response to changing environmental conditions. Rather than assigning the time point of the expression peak as the presence time of each periodically expressed protein, Tang et al. [47] use a potential threshold to identify the expression time points of each protein, based on the observation that a large number of periodically expressed genes in the budding yeast metabolic cycle [48] have expression peaks greater than a constant value. The potential threshold is intended to filter the noise from gene expression profiles and retain only the most biologically significant gene products in a DPIN. All mRNA levels are compared with the potential threshold, and proteins with a low mRNA expression peak are filtered out. A protein is considered to be expressed and present at a time point only if its expression level at that time point is greater than the predefined threshold. For each time point, the interactions in the original yeast PIN are maintained only if both of the interacting proteins are expressed. Hence there is a subnetwork corresponding to each time point in the budding yeast metabolic cycle. The DPIN is constituted by these subnetworks over the metabolic cycle. The differences among these subnetworks are supposed to depict the dynamic changes of protein expression. The time point subnetworks of the DPIN have 3520 proteins and 14 904 interactions on average.
This indicates that about 70% of proteins are expressed at each time point. However, the potential threshold in [47] is hard to apply to arbitrary gene expression profiles, since it depends on the analysis of periodically expressed genes as in [48]. A definite relationship between mRNA expression and protein abundance values is not yet available [49]. Some mRNAs with a low transcription level can also be translated into proteins. Consequently, proteins with a low expression peak that are filtered out by a relatively high threshold, such as that in [47], will be improperly missing from the DPIN. To overcome this shortcoming, Wang et al. [50] propose a three-sigma principle to design a threshold for each gene by considering its own characteristic expression curve and the inevitable noise in gene expression arrays. An expressed protein might not interact with other proteins, because only proteins in their active form can interact with each other. Thus, their strategy aims to identify the time points at which a protein is expressed and active. First, the three-sigma principle is used to differentiate the inactive and active time points of a protein based on a time series gene expression profile. After identifying the active time points of each protein, they construct time point subnetworks by mapping the proteins active at each time point onto a PIN. Thus,


a DPIN consists of a series of time point subnetworks over the cell cycle. In [50], two DPINs of yeast are constructed based on two different gene expression data sets, GSE3431 (gene expression profiles over a budding yeast metabolic cycle with 12 time points) [48] and GSE4987 (gene expression profiles over a yeast cell cycle with 50 time points) [51]. The average number of proteins appearing in the time point subnetworks varies between the gene expression data sets. In the DPIN of the budding yeast metabolic cycle [48], about 15% of proteins appear in each time point subnetwork, while in the DPIN of the yeast cell cycle [51], about 37% of proteins appear in each time point subnetwork. It seems that the three-sigma principle performs differently on over- and undersampled time series gene expression data sets. Compared to the method in [10], the DPIN construction methods in [42, 47, 50] use a threshold to eliminate noisy expression and to identify the genes expressed over time or in response to different growth conditions, without distinguishing between periodically and nonperiodically expressed proteins. Even when based on the same time series gene expression profile and static PIN, DPINs can differ depending on the strategy used to identify the expressed proteins at each time point. The network scales and densities of the DPINs constructed by Tang et al. [47] and Wang et al. [50] are quite different, although they are based on the same time series gene expression profiles [48] and similar static PINs. Protein complex prediction on the DPIN constructed by Wang et al. [50] is considerably better than on that constructed by Tang et al. [47]. This might suggest that network scale and density can be used to measure the quality of different DPINs derived from the same data sets.
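The threshold-based constructions summarized in this section can be sketched in a few lines of Python. The sketch below is illustrative only and is not the code of any of the cited studies: the function names, the toy expression values, and the constant cutoff of 0.05 are hypothetical, and the SD is assumed to be the population SD because the text does not specify which estimator is used. The three-sigma cutoff follows the expression listed for Wang et al. [50] in Table 1.

```python
from statistics import mean, pstdev


def three_sigma_threshold(profile):
    """Per-gene cutoff mu + 3*sigma * sigma^2 / (1 + sigma^2), as listed in Table 1.

    Assumption: sigma is the population SD; the source only says "SD".
    """
    mu = mean(profile)
    sigma = pstdev(profile)
    return mu + 3 * sigma * (sigma ** 2 / (1 + sigma ** 2))


def active_time_points(profile, threshold):
    """T(p): indices of the time points whose expression is at least the cutoff."""
    return {i for i, value in enumerate(profile) if value >= threshold}


def time_point_subnetworks(edges, expression, threshold_fn):
    """Keep, for every time point, the PPIs whose two proteins are both present.

    edges: list of (protein_a, protein_b) pairs from the static PIN.
    expression: dict protein -> list of expression values, one per time point.
    threshold_fn: callable that maps an expression profile to its cutoff.
    """
    active = {p: active_time_points(prof, threshold_fn(prof))
              for p, prof in expression.items()}
    n_points = max(len(prof) for prof in expression.values())
    return [
        {(a, b) for a, b in edges
         if t in active.get(a, set()) and t in active.get(b, set())}
        for t in range(n_points)
    ]


# Toy data, invented for illustration only.
toy_expression = {
    "A": [0.0, 0.0, 0.0, 0.4],
    "B": [0.0, 0.0, 0.1, 0.5],
    "C": [1.0, 1.0, 1.0, 1.0],
}
toy_edges = [("A", "B"), ("B", "C")]

# Constant cutoff in the style of Tang et al. [47]; the value 0.05 is hypothetical.
constant_subnets = time_point_subnetworks(toy_edges, toy_expression, lambda prof: 0.05)
# Per-gene cutoff in the style of Wang et al. [50].
three_sigma_subnets = time_point_subnetworks(toy_edges, toy_expression, three_sigma_threshold)
```

On this toy input the constant cutoff keeps the interaction B–C at the third time point, whereas the per-gene cutoff does not, which mirrors the observation above that the two strategies can yield DPINs of different scale from the same data.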

3 Methods based on coexpression alteration

Changes in protein expression can reduce or increase interactions between proteins, leading to deviation from the normal physiological state. Therefore, another type of DPIN construction method is based on alterations in coexpression, which reflect the dynamic changes of PPIs between different physiological states. Coexpression can be measured by either normalized difference or correlation. Normalized difference is computed for absolute expression levels, while correlations are analyzed for expression profiles with relative expression levels. The normalized difference $D_i(x, y)$ of genes x and y is defined as follows [54]:

$$D_i(x, y) = \frac{\lvert \mathrm{Exp}(x, i) - \mathrm{Exp}(y, i) \rvert}{\mathrm{Exp}(x, i) + \mathrm{Exp}(y, i)}, \tag{1}$$

where Exp(x, i) and Exp(y, i) are the expression levels of genes x and y under condition i, respectively. Values for the normalized difference range from 0 to 1. A value near 0 indicates high similarity of the absolute mRNA levels of gene x and gene


y under condition i. However, it has limited applications: it can be applied to one snapshot of gene expression, but it is hard to apply to measuring the coexpression of two genes in expression profiles with many samples. The Pearson correlation coefficient (PCC) is a popular correlation measure for the coexpression of the genes encoding each pair of interacting proteins across expression profiles, as shown in Eq. (2):

$$\mathrm{PCC}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} \left( \frac{\mathrm{Exp}(x, i) - \overline{\mathrm{Exp}}(x)}{\sigma(x)} \right) \times \left( \frac{\mathrm{Exp}(y, i) - \overline{\mathrm{Exp}}(y)}{\sigma(y)} \right), \tag{2}$$

where n is the number of samples in a gene expression profile, Exp(x, i) and Exp(y, i) are the expression levels of genes x and y in sample i, respectively, $\overline{\mathrm{Exp}}(x)$ and $\overline{\mathrm{Exp}}(y)$ represent the average expression levels of genes x and y, respectively, and $\sigma(x)$ and $\sigma(y)$ are the SDs of the expression levels of genes x and y, respectively. The PCC has a value of 1 for perfect correlation and −1 for perfect anticorrelation. A value of 0 shows that there is no linear relationship between the two genes. Based on the observation that dynamic proteins tend to be in neighborhoods of highly coexpressed proteins, Komurov and White [55] construct a simple DPIN composed of interacting pairs of proteins with PCC ≥ 0.65. Expression variance (EV) is used to measure the dynamics of a protein, such that a low variance indicates that a protein is relatively static and a high variance indicates that a protein is relatively dynamic. Both the PCC of each interaction and the EV of each gene are calculated across 272 microarray time series experiments in six different data sets from the Saccharomyces Genome Database [56]. Based on gene expression data, Komurov and White [55] also classify yeast proteins into dynamic proteins (EV > 0.75) and static proteins (EV < 0.25). They find that the proteins in the DPIN are mainly dynamic proteins. Although in yeast the proteins of interacting pairs with high PCC tend to have high EV, not all the dynamic proteins are included in the DPIN. To investigate the dynamic features of the human PIN through changes in gene expression, Xia et al. [57] construct a DPIN using the expression profiles of 30 postmortem human brains from subjects ranging from 26 to 106 years old. The DPIN consists only of interactions between protein pairs whose gene expression is positively or negatively correlated, as measured by PCC. They use PCC values of 0.4 and −0.4 as cutoffs for positive and negative correlations, respectively.
Based on the work of Xia et al. [57], Xue et al. [58] extract a similar aging-related DPIN in their aging study of human and fruit fly, in order to study the dynamic modular structure of the PIN during aging based on gene expression profiles. These DPINs are related to a specific process, such as aging.
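As a rough illustration of the coexpression-based constructions described so far, the following Python sketch computes the normalized difference of Eq. (1) and the PCC of Eq. (2), and then keeps only the PPIs whose PCC passes a cutoff. It is a minimal sketch under stated assumptions rather than the implementation used in [55] or [57]: the function names and toy profiles are hypothetical, the SD is taken as the sample SD so that it matches the 1/(n − 1) factor in Eq. (2), and returning 0 for a constant profile is a convention chosen here.

```python
from math import sqrt


def normalized_difference(exp_x, exp_y):
    """Eq. (1): |Exp(x, i) - Exp(y, i)| / (Exp(x, i) + Exp(y, i)) for one condition."""
    return abs(exp_x - exp_y) / (exp_x + exp_y)


def pcc(profile_x, profile_y):
    """Eq. (2): Pearson correlation of two expression profiles of equal length."""
    n = len(profile_x)
    mean_x = sum(profile_x) / n
    mean_y = sum(profile_y) / n
    # Sample SD (divisor n - 1), chosen to match the 1/(n - 1) factor in Eq. (2).
    sd_x = sqrt(sum((v - mean_x) ** 2 for v in profile_x) / (n - 1))
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in profile_y) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # Convention assumed here: a constant profile has no correlation.
    total = sum(((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
                for a, b in zip(profile_x, profile_y))
    return total / (n - 1)


def coexpressed_dpin(edges, expression, pos_cutoff, neg_cutoff=None):
    """Keep PPIs with PCC >= pos_cutoff or, if neg_cutoff is given, PCC <= neg_cutoff."""
    kept = {}
    for a, b in edges:
        r = pcc(expression[a], expression[b])
        if r >= pos_cutoff or (neg_cutoff is not None and r <= neg_cutoff):
            kept[(a, b)] = r
    return kept


# Toy profiles, invented for illustration only.
toy_expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],   # perfectly correlated with A
    "C": [4, 3, 2, 1],   # perfectly anticorrelated with A
    "D": [1, 3, 2, 4],   # PCC with A is 0.8
}
toy_edges = [("A", "B"), ("A", "C"), ("A", "D")]

positive_only = coexpressed_dpin(toy_edges, toy_expression, 0.65)      # cutoff of [55]
both_signs = coexpressed_dpin(toy_edges, toy_expression, 0.4, -0.4)    # cutoffs of [57]
```

With the cutoff of Komurov and White [55] only the positively coexpressed pairs A–B and A–D are retained, whereas the two-sided cutoffs of Xia et al. [57] also retain the anticorrelated pair A–C.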

Zhang et al. [59] construct two coexpressed PINs as a state-specific DPIN to describe different prognosis states of glioma samples. Two groups (representing two prognosis states) of glioma gene expression data are selected as follows: “poor” survival samples and “very long” survival samples. The PCC of each PPI is calculated from gene expression profiles of each group. Thus, for each group, there is a coexpressed PIN, with each interaction assigned a value by the PCC of its linked two genes. Then, by mapping a list of 481 glioma candidate genes derived from the NCBI database to the two coexpressed PINs in the DPIN, a smaller glioma-related DPIN is extracted. By comparing the glioma-related DPIN in the “poor” and “very long” survival subtypes, dynamic subnets in DPIN with differently coexpressed protein interactions (DCPIs) and significantly differently expressed genes (SDEGs) are identified. The difference between the PCCs of each interaction in two groups is calculated by the following equation:

$$\mathrm{DifPCC}(x, y) = \mathrm{PCC}_A(x, y) - \mathrm{PCC}_B(x, y), \tag{3}$$

where $\mathrm{PCC}_A(x, y)$ is the PCC of gene x and its interaction partner y in the expression profiles of group A, while $\mathrm{PCC}_B(x, y)$ is the PCC of gene x and its interaction partner y in the expression profiles of group B. In [59], DCPIs are the PPIs with DifPCC ≥ 0.5 between the two prognosis states, and SDEGs are identified as genes with differences in gene expression >1.5-fold (for upregulated proteins) or 5) might contribute to robustness and other cellular properties for PPIs dynamically regulated in time and space, Han et al. [71] calculate the average of PCC between each hub and its partners


Figure 4. General process of DPIN construction based on coexpression.

Table 2. DPIN construction methods based on coexpression alteration

| Method | Gene expression data | Static PIN | DPIN | Dynamic subnet in DPIN |
|---|---|---|---|---|
| Komurov and White [55] | One group of gene expression profiles | A high-confidence yeast PIN [63] | Consists of PPIs with PCC ≥ 0.65 | |
| Xia et al. [57] | One group of gene expression profiles | Integrated human PIN from [64, 65], and [66] | Consists of PPIs with PCC ≥ 0.4 or PCC ≤ −0.4 | |
| Zhang et al. [59] | Two groups of gene expression profiles: “poor” survival samples and “very long” survival samples | A human PIN [67] | Consists of two coexpressed PINs with PCC calculated from gene expression profiles of each group | Consists of DCPIs and SDEGs |
| Sun et al. [61] | Two groups of time course gene expression profiles from normal and diseased rats | Integrated rat PIN | Consists of two coexpressed PINs at each time point with SCC calculated from gene expression profiles of each group | For each time point, the dynamic subnet consists of DCPIs and the interactions between SDEGs |


in a yeast PIN with an expression-profiling compendium of 315 data points for most yeast genes across five different experimental conditions. The average of PCC between a protein x and its partners is computed by:

$$\mathrm{AvgPCC}(x) = \frac{1}{\lvert n_x \rvert - 1} \sum_{i=1}^{n_x} \mathrm{PCC}(x, i), \tag{5}$$

where $n_x$ is the set of neighboring nodes of protein x in the PIN, and $\lvert n_x \rvert$ is the number of neighboring nodes of protein x in the PIN. Based on the bimodal distribution of their AvgPCCs, hubs are divided into two categories: party hubs with relatively high AvgPCCs and date hubs with relatively low AvgPCCs. Han et al. also estimate the localization diversity of the partners of hubs by using a proteome-wide cellular localization data set [72]. Partners of date hubs have a more diverse spatial distribution than partners of party hubs. Therefore, it is believed that date hubs bind different proteins at different time points or localizations, while party hubs bind to their partners simultaneously. Recently, Yu et al. [73] investigate protein coexpression and find that bottleneck proteins, which are defined as highly connected proteins with a high betweenness centrality, are significantly less well coexpressed with their neighbors than nonbottleneck proteins, implying that expression dynamics is wired into the network topology. These bottleneck proteins are more likely to be essential proteins, and correspond to the dynamic components of the PIN. Lu et al. [74] propose a simple hierarchical clustering algorithm for analyzing the dynamic organization of biological networks by integrating yeast PPI data, global subcellular localization data, and expression profile data.
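The hub classification based on AvgPCC can be sketched as follows. This is a hypothetical illustration, not the procedure of Han et al. [71]: the function names, the minimum hub degree, the cutoff separating party hubs from date hubs, and the toy PCC values are all assumptions, since the text above only states that the AvgPCC distribution is bimodal. The average follows Eq. (5) as printed, including its divisor of |n_x| − 1.

```python
def build_adjacency(edges):
    """Map every protein to the set of its neighbours in the PIN."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def avg_pcc(protein, adjacency, edge_pcc):
    """Eq. (5) as printed: sum of PCC(x, i) over neighbours i, divided by |n_x| - 1.

    edge_pcc maps frozenset({a, b}) to the PCC of that interaction.
    Requires at least two neighbours, which holds for hubs.
    """
    neighbours = adjacency[protein]
    total = sum(edge_pcc[frozenset((protein, i))] for i in neighbours)
    return total / (len(neighbours) - 1)


def classify_hubs(adjacency, edge_pcc, min_degree, cutoff):
    """Label proteins with more than min_degree partners as party or date hubs.

    min_degree and cutoff are hypothetical parameters of this sketch.
    """
    labels = {}
    for protein, neighbours in adjacency.items():
        if len(neighbours) > min_degree:
            score = avg_pcc(protein, adjacency, edge_pcc)
            labels[protein] = "party" if score >= cutoff else "date"
    return labels


# Toy network with invented PCC values: H1 and H2 each have three partners.
toy_edges = [("H1", "a"), ("H1", "b"), ("H1", "c"),
             ("H2", "d"), ("H2", "e"), ("H2", "f")]
toy_pcc = {
    frozenset(("H1", "a")): 0.6, frozenset(("H1", "b")): 0.2, frozenset(("H1", "c")): 0.4,
    frozenset(("H2", "d")): 0.1, frozenset(("H2", "e")): 0.0, frozenset(("H2", "f")): 0.1,
}
toy_adjacency = build_adjacency(toy_edges)
toy_labels = classify_hubs(toy_adjacency, toy_pcc, min_degree=2, cutoff=0.5)
```

Here H1 has an AvgPCC of 0.6 and is labeled a party hub, while H2 has an AvgPCC of 0.1 and is labeled a date hub; with real data the cutoff would have to be read off the observed bimodal distribution.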

4.2 Biomarkers

In the pursuit of personalized medicine, there is growing demand for monitoring molecular entities that can give insight into the specific state of a diseased patient or tumor, that is, for biomarker discovery [75]. Biomarkers can help in understanding the multifactorial basis of the pathogenesis of diseases and provide prognostic or diagnostic value. Many traditional statistical methods, based on a single protein or on microarray gene expression data alone, or on the discriminatory power of individual genes, often fail to identify biologically meaningful biomarkers, resulting in poor prediction performance and limited clinical applications. Dynamic network biomarkers can be monitored and evaluated at different stages and time points during the development of diseases [76]. Dynamic network biomarkers show not only the expression levels of genes or proteins, but also the time-dependent strength of interactions between genes or proteins. They have been considered one of the powerful ways to detect the bifurcation of gene or PPIs, indicating the early change of biomarkers and predicting the occurrence of diseases [76].

To determine whether changes in the organization of the PIN can be used to predict patient outcome, Taylor et al. [77] examine the dynamic structure of the human PIN by combining with genome-wide expression data measured in 79 human tissues [78]. The human PIN consists of the complete PIN from STRING [79], OPHID [67], subsets of interactions mapped from yeast to man [80], and literature-curated interactions [81]. They use an algorithm similar to the one in [71] to classify two types of hubs based on the AvgPCC. To determine the hubs that significantly discriminate between patients who are alive without a disease versus those who died of the disease, the difference between AvgPCCs of each hub in two groups of breast cancer patients is calculated by the following equations:

$$\mathrm{DifPCC}(x, i) = \mathrm{PCC}_A(x, i) - \mathrm{PCC}_B(x, i), \tag{6}$$

$$\mathrm{AveragePCCDif}(x) = \frac{\sum_{i=1}^{|n_x|} \mathrm{DifPCC}(x, i)}{|n_x| - 1}, \tag{7}$$

where $\mathrm{PCC}_A(x, i)$ is the PCC between hub x and its interactor i in the expression profiles of patient group A, and $\mathrm{PCC}_B(x, i)$ is the corresponding PCC in the expression profiles of patient group B. They find that hubs whose PCC of expression is altered between outcome groups reflect changes in dynamic network modularity and can be used to predict patient outcome. To identify dynamic genes and interactions related to glioma prognosis, Zhang et al. [59] extract a dynamic subnet consisting of the DCPIs and SDEGs from a glioma-related DPIN. Biomarkers related to glioma prognosis are identified from the dynamic subnet. They find that as patient survival time increases, the expression of the MYC gene is enhanced, and interactions between E2F1 and RB1 and between EGFR and p38 become more prevalent. In the study of T2DM [61], the dynamic subnets consisting of DCPIs and SDEGs extracted from each tissue can also be directly applied to the identification of network biomarkers for disease prognosis. By comparing the dynamic subnets of three tissues, the authors reveal that adipose tissue becomes dysfunctional at an early stage, while liver and muscle are dysfunctional in all periods. Because their dynamic subnets cover the disease-related interactions comprehensively, they are able to identify disease genes and disease interactions accurately.
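As an illustration of Eqs. (6) and (7), the following Python sketch computes the average PCC difference of one hub between two patient groups. It follows the equations as printed, including the denominator of the number of neighbors minus one; the function names and the toy expression profiles are hypothetical and serve only to make the arithmetic concrete.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient (PCC) of two equal-length profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                     sum((y - mean_b) ** 2 for y in b))
    return cov / norm

def average_pcc_dif(hub, neighbors, expr_a, expr_b):
    """Eq. (7): sum of DifPCC(x, i) over the neighbors, divided by |n_x| - 1."""
    if len(neighbors) < 2:
        raise ValueError("at least two neighbors are required")
    # Eq. (6): DifPCC(x, i) = PCC_A(x, i) - PCC_B(x, i)
    difs = [pearson(expr_a[hub], expr_a[i]) - pearson(expr_b[hub], expr_b[i])
            for i in neighbors]
    return sum(difs) / (len(neighbors) - 1)

# Hypothetical expression profiles of hub X and its interactors P and Q,
# measured in patient group A and patient group B.
expr_a = {"X": [1, 2, 3, 4], "P": [2, 4, 6, 8], "Q": [4, 3, 2, 1]}
expr_b = {"X": [1, 2, 3, 4], "P": [4, 3, 2, 1], "Q": [4, 3, 2, 1]}

score = average_pcc_dif("X", ["P", "Q"], expr_a, expr_b)
```

Here the correlation between X and P flips from +1 in group A to -1 in group B, giving DifPCC(X, P) = 2, while the correlation between X and Q is unchanged, giving DifPCC(X, Q) = 0. The resulting score is 2 / (2 - 1) = 2. How the significance of such a score is assessed is not covered by this sketch.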

4.3 Network medicine

Current drug development, which focuses on hunting for a single highly specific compound that targets a single molecule, will fail for complex diseases [82]. The structure and dynamics of PINs govern cell decision processes and the formation of tissue boundaries. Cell behaviors and phenotypes are directly controlled by the dynamics of PINs in response to external or internal cues/stimuli. Complex diseases, such as cancer and diabetes, stem from the malfunctions of such networks. Therefore, for the drug design of complex diseases, Pawson and Linding [82] first propose the concept of “Network Medicine” and suggest that the molecular networks associated with a disease, such as PINs, and in particular their dynamics, are powerful drug targets for interventions. They suggest two different ways to target the network itself when developing a network medicine, once insights into the abnormal networks of a disease have been gained. One strategy is to exploit the modular nature of signaling proteins to change the topology and wiring of the network by adding or depleting interactions. The other aims at extracting the control architectures and hierarchies of kinases in order to suggest combinations of inhibitors that change the topology and information flow within the network. Intensive study of the HIV retrovirus life cycle has led to the identification of critical protein networks and enabled the development of drugs that target these networks [75]. The success of highly active antiretroviral therapy in controlling HIV is likewise achieved by targeting the network: several antiviral drugs (typically three or four) are taken in combination. However, “Network Medicine” should be applied in a continuous way. Targeting the network at different stages of disease progression will be more effective than targeting it only once. This can be called “Network Medicine” on a dynamic network, which targets the network whose topology and information flow have changed since the previous intervention. By monitoring the dynamic network response of breast cancer cells to a set of clinically relevant therapeutic agents, including DNA-damaging reagents and kinase inhibitors, Lee et al. [83] find that the most effective strategy for killing aggressive triple-negative breast cancer cells in vitro and in vivo is a time- and order-dependent combination of drugs.
Although “Network Medicine” on DPINs is still at a preliminary stage, it is likely to become a trend in future medicine.

5

Challenges and discussion

A DPIN that involves temporal and spatial information provides a more comprehensive framework for biological information discovery. Increasing efforts focus on DPIN construction, but many problems remain to be handled, including obtaining a comprehensive PIN with low false-positives and extracting dynamic information from gene expression data generated by different techniques. Protein subcellular localizations that respond to external stimuli or different conditions have massive untapped potential for constructing DPINs with spatial dynamics. PTMs, which change protein function and regulate PPIs, are not yet considered in DPIN construction. Quantitative proteomics provides direct dynamic information on protein amounts, but is likewise absent from DPINs. With the development of different DPIN construction methods, an evaluation system for measuring the quality of DPINs is badly needed.

5.1 Comprehensive PIN with low false-positives

A comprehensive PIN with low false-positives provides a good scaffold for constructing a high-quality DPIN. Many PPIs have been identified by different methods, and most of the currently available data are derived from studies of unmodified proteins. Thus, PPIs that require specific PTMs (PTM-dependent PPIs) have likely been missed. The available approaches are difficult to apply to these interactions because PTMs can be dynamic and often mediate relatively weak PPIs. For example, 14-3-3 proteins and proteins containing SH2 or PTB domains interact with their target proteins only when the latter are phosphorylated [6, 7]. Methods proposed for the detection of PTM-dependent PPIs, such as a modified Y2H screen [28, 29], will increase the diversity and availability of PPIs. Different methods yield complementary rather than overlapping data. Combining multiple PPI data sources therefore reduces the chance of missing TPPIs and PTM-dependent PPIs and is a good way to draw a more comprehensive picture. To extract high-quality PPIs from the integrated PPI set, a scoring strategy is commonly used to measure the confidence and quality of PPIs [10]. Subcellular localization can also help to identify false-positive PPIs between proteins with incompatible subcellular localizations.

5.2 Threshold

Once the corresponding mRNAs are transcribed, the proteins can be considered as translated. However, gene expression arrays inevitably contain background noise, which makes the measured mRNA levels unreliable to some extent. Hence, for the methods based on protein presence dynamics, the main challenge is how to effectively determine which proteins are expressed based on noisy expression data. Different threshold approaches result in different DPINs with different protein presence dynamics. In the simplest case, a global threshold or a local threshold is used to filter systematic noise in gene expression arrays, as in the work of Tang et al. [47] and Hegde et al. [42]. However, some genes with low mRNA levels are filtered out by a relatively high threshold, so their proteins do not appear in DPINs although they may actually be translated. Thus, another strategy is to design an individual threshold for each gene based on its own expression characteristics, such as the three-sigma principle used by Wang et al. [50]. The individual threshold of a gene can efficiently differentiate noise from genuine expression of that gene. However, an individual threshold identifies presence/active time points for every protein, regardless of whether the gene itself is noise. Thus, combining individual thresholds with a local or a global threshold can more effectively eliminate both noise expression and noise genes.
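The combination of an individual threshold and a global threshold can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the individual threshold is taken as the mean plus k standard deviations of the gene's own profile, which is not the exact three-sigma formula of Wang et al. [50], and the parameter names `k` and `global_floor`, the helper `snapshot`, and the toy profiles are hypothetical.

```python
from statistics import mean, pstdev

def active_time_points(profile, k=1.0, global_floor=0.0):
    """Return the time points at which a gene is considered expressed.

    A gene whose expression never exceeds global_floor is treated as a
    noise gene and has no active time points. For all other genes, the
    individual threshold mean + k * std of the gene's own profile is used.
    Both k and global_floor are illustrative assumptions.
    """
    if max(profile) <= global_floor:
        return []
    threshold = mean(profile) + k * pstdev(profile)
    return [t for t, value in enumerate(profile) if value > threshold]

def snapshot(edges, active, t):
    """Interactions of the static PIN whose two proteins are both active at t."""
    return [(u, v) for u, v in edges
            if t in active.get(u, set()) and t in active.get(v, set())]

# Hypothetical expression profiles over six time points; P3 is a noise gene.
profiles = {"P1": [1, 1, 1, 9, 1, 1],
            "P2": [2, 2, 2, 8, 2, 2],
            "P3": [0.1, 0.2, 0.1, 0.3, 0.1, 0.1]}
edges = [("P1", "P2"), ("P1", "P3")]

active = {p: set(active_time_points(v, k=1.0, global_floor=0.5))
          for p, v in profiles.items()}
network_at_t3 = snapshot(edges, active, 3)
```

With the individual threshold alone (`global_floor=0.0`), the noise gene P3 would still be assigned an active time point at t = 3, because its small peak exceeds its own threshold. Adding the global floor removes P3 entirely, so only the interaction between P1 and P2 appears in the snapshot at t = 3.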

5.3 Challenges in RNA-seq

With the rapid development of next-generation sequencing technology, RNA-seq [84] is an increasingly popular method to study gene expression. Many time series RNA-seq data sets have been collected to study the dynamic regulation of transcripts [85]. Identifying the transcribed mRNA isoforms from the alignment of RNA-seq reads and estimating their expression levels are the basic problems in extracting dynamic information. In eukaryotic cells, a gene has introns and exons and can be transcribed into different mRNAs (mRNA isoforms), from which different proteins can be translated. Only when the transcribed mRNA isoforms are determined can we know which proteins will be expressed. Two kinds of methods can be used to reveal the transcribed mRNA isoforms: de novo assembly [86] and read mapping [87]. Different algorithms differ in their ability to identify mRNA isoforms, which affects both the dynamic information of protein presence and the estimation of mRNA isoform expression levels. The mRNA expression levels are important for measuring coexpression [59] and for determining differentially expressed genes and protein activity information. However, mRNA isoform expression levels can neither be obtained directly from RNA-seq data nor be measured directly by the number of reads [88]. The usual way to measure the expression level of an mRNA isoform is to count the reads mapped to that isoform and normalize the count by the isoform length and the sequencing depth. However, this is difficult to apply because most reads mapped to a gene are shared by more than one isoform, and incomplete isoform annotations also affect the estimation.
Generally speaking, to exploit the dynamic information in RNA-seq data, a number of experimental and computational challenges need to be addressed, including the alignment of RNA-seq reads, the identification of mRNA isoforms, and the estimation of mRNA isoform expression levels.
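The normalization of read counts by isoform length and sequencing depth mentioned above corresponds to the widely used reads per kilobase per million mapped reads (RPKM) measure. The short Python sketch below shows the arithmetic only; the isoform names, counts, and lengths are hypothetical, and the sketch assumes that every read has already been assigned to exactly one isoform, which is precisely the step that is difficult in practice.

```python
def rpkm(read_count, isoform_length_bp, total_mapped_reads):
    """Reads per kilobase of isoform per million mapped reads."""
    return read_count * 1e9 / (isoform_length_bp * total_mapped_reads)

# Hypothetical read counts and isoform lengths (in base pairs).
counts = {"iso1": 500, "iso2": 100}
lengths = {"iso1": 2000, "iso2": 1000}
total_mapped_reads = 10_000_000

levels = {iso: rpkm(counts[iso], lengths[iso], total_mapped_reads)
          for iso in counts}
```

For these values, iso1 has an RPKM of 25 and iso2 an RPKM of 10. Reads shared by several isoforms are not handled here; tools that estimate isoform abundance typically distribute such reads with a statistical model rather than by direct counting.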

5.4 Limited usage of spatial dynamic information

Protein localization in cells changes under different cell growth conditions and in response to external stimuli; the study of these changes can be referred to as spatial proteomics. A major challenge in cell biology is to identify the subcellular distribution of proteins within cells and to characterize how protein localization changes with time. Although the dynamic spatial information of proteins is not directly available, protein subcellular localizations in response to external stimuli or different conditions can be obtained, and these reflect the spatial dynamics of proteins [89]. Boisvert et al. [90] study dynamic changes in protein localization elicited during the cellular response to DNA damage based on MS technology. At a steady state, most proteins are found in one specific subcellular localization (compartment), while a minor subset of proteins is distributed equally between two or more compartments. The most common use of subcellular localization is to filter out false-positive interactions, that is, interactions whose two proteins do not share a subcellular localization [10, 91]. However, spatial dynamic information has not been fully used in DPINs, because protein localization data and gene expression data under the same condition are not easily obtained and thus cannot be used simultaneously. How to make full use of protein spatial dynamic information and bridge the gap between different experimental data is a problem that cannot be ignored in high-quality DPIN construction.
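The localization-based filtering of false-positive interactions can be sketched in Python as follows. The function name, the compartment labels, and the decision to keep interactions involving proteins without localization annotation are assumptions made for this illustration, not choices documented in [10, 91].

```python
def filter_by_localization(edges, localization):
    """Keep interactions whose two proteins share at least one compartment.

    Interactions involving a protein with unknown localization are kept,
    which is an assumption of this sketch rather than a published rule.
    """
    kept = []
    for u, v in edges:
        loc_u, loc_v = localization.get(u), localization.get(v)
        if loc_u is None or loc_v is None or loc_u & loc_v:
            kept.append((u, v))
    return kept

# Hypothetical interactions and compartment annotations; D is unannotated.
edges = [("A", "B"), ("A", "C"), ("C", "D")]
localization = {"A": {"nucleus"},
                "B": {"nucleus", "cytoplasm"},
                "C": {"mitochondrion"}}

filtered = filter_by_localization(edges, localization)
```

In this example the interaction between A and C is removed because the two proteins have no compartment in common, while the interaction between C and D is retained because D has no annotation. A stricter variant would discard interactions with unannotated proteins, at the cost of losing true interactions.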

5.5 Quantitative dynamics and PTMs

To illustrate a DPIN more clearly and accurately, it is necessary to know not only which proteins are present and which PPIs occur, but also how much of each protein is present and how that amount changes. The existing DPINs do not provide the amounts of proteins, which vary with the stage of development, the condition, and the experimental sample. Changes in protein amounts cannot simply be inferred from gene expression data, as mRNA abundance correlates poorly with protein abundance [92]. Gerber et al. present an innovative strategy, protein–AQUA, to measure the absolute levels of proteins and PTM proteins after proteolysis by using stable isotope-labeled peptides and tandem MS [93]. Quantitative analysis reveals that the number of 14-3-3 family proteins binding to FoxO3A increases as phosphorylation of the FoxO3A target site S253 by protein kinase B increases under the conditions studied [2]. Zhang and Neubert employ SILAC to identify and quantify phosphotyrosine proteins [94]. Despite the limited proteome coverage in most existing studies, MS technology is the only currently available method to systematically detect changes in protein abundance, PTMs, and PPIs, including the identification and quantitation of these changes [95–97]. Comparisons are performed between different experiments to calculate how peptides, proteins, and PTM proteins change between conditions [98, 99]. Although it is a challenge to integrate protein quantitative dynamics into DPINs, DPINs with quantitative dynamic information hold great promise for improving the understanding of psychiatric disorders and identifying relevant biomarkers [100]. Since PTMs are dynamic, the different forms of a protein generated by PTMs change with time and are not represented in DPINs.
It will become possible to integrate both PTMs and time-varying protein amounts into DPINs when this information becomes easier to obtain. At that point, the graph theory used to represent PINs, DPINs, and other biological networks will very likely become obsolete for describing these dynamic properties of proteins.

5.6 Evaluation system of DPIN

Evaluating the quality of DPINs constructed by different methods is an important issue, since conclusions are more convincing when drawn from high-quality DPINs. Comparing predictions made on DPINs with known biological knowledge has limited evaluation power. On the one hand, the topological properties of DPINs should be taken into account: the scales of the subnetworks of DPINs should be in accordance with existing studies on absolute gene expression. On the other hand, the biological meaning of DPINs carries great weight when evaluating their quality. At each time point or under each condition, the proteins and interactions of a subnetwork are not assembled at random; they are involved in certain biological processes, so the whole subnetwork might be enriched in certain functions. Intuitively, each subnetwork can be taken as a whole to analyze its biological functions, which can be termed network ontology analysis. Wang et al. [101] propose a network ontology analysis method that can capture the change of functions not only in dynamic transcription regulatory networks but also in rewiring PINs. Selecting representative biological knowledge to check the compatibility and consistency of different DPINs will be a very convincing way to measure their quality. A high-quality DPIN can help to identify high-confidence differential proteins and interactions, which has promising application prospects. Hence, before seeing how different DPINs work out in practice, a DPIN evaluation system ought to be designed to measure the quality of DPINs.

DPINs can provide temporal, spatial, and quantitative information, and are therefore more helpful for identifying protein functional modules/complexes and biomarkers through the alteration of these dynamic properties. In the pursuit of personalized medicine, clinical diagnosis and network medicine based on DPINs will be active areas of future research.
This work is supported in part by the National Natural Science Foundation of China under grant nos. 61232001, 61379108, 61370024, and 61370172. The authors have declared no conflict of interests.

6

References

[1] Cohen, A. A., Geva-Zatorsky, N., Eden, E., Frenkel-Morgenstern, M. et al., Dynamic proteomics of individual cancer cells in response to a drug. Science 2008, 322, 1511–1516.
[2] Rinner, O., Mueller, L. N., Hubálek, M., Müller, M. et al., An integrated mass spectrometric and computational framework for the analysis of protein interaction networks. Nat. Biotechnol. 2007, 25, 345–352.
[3] Hegele, A., Kamburov, A., Grossmann, A., Sourlis, C. et al., Dynamic protein-protein interaction wiring of the human spliceosome. Mol. Cell 2012, 45, 567–580.
[4] Barabási, A.-L., Oltvai, Z. N., Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 2004, 5, 101–113.
[5] Nooren, I. M., Thornton, J. M., Diversity of protein-protein interactions. EMBO J. 2003, 22, 3486–3492.
[6] Yaffe, M. B., How do 14-3-3 proteins work?-Gatekeeper phosphorylation and the molecular anvil hypothesis. FEBS Lett. 2002, 513, 53–57.
[7] Cattaneo, E., Pelicci, P. G., Emerging roles for SH2/PTB-containing Shc adaptor proteins in the developing mammalian brain. Trends Neurosci. 1998, 21, 476–481.
[8] Li, X., Foley, E. A., Kawashima, S. A., Molloy, K. R. et al., Examining post-translational modification-mediated protein-protein interactions using a chemical proteomics approach. Protein Sci. 2013, 22, 287–295.
[9] Brooks, C., Li, M., Hu, M., Shi, Y., Gu, W., The p53-Mdm2-HAUSP complex is involved in p53 stabilization by HAUSP. Oncogene 2007, 26, 7262–7266.
[10] de Lichtenberg, U., Jensen, L. J., Brunak, S., Bork, P., Dynamic complex formation during the yeast cell cycle. Science 2005, 307, 724–727.
[11] Jensen, L. J., Jensen, T. S., de Lichtenberg, U., Brunak, S., Bork, P., Co-evolution of transcriptional and posttranslational cell-cycle regulation. Nature 2006, 443, 594–597.
[12] Jensen, O. N., Interpreting the protein language using proteomics. Nat. Rev. Mol. Cell Biol. 2006, 7, 391–403.
[13] Berggård, T., Linse, S., James, P., Methods for the detection and analysis of protein-protein interactions. Proteomics 2007, 7, 2833–2842.
[14] Perkins, J. R., Diboun, I., Dessailly, B. H., Lees, J. G., Orengo, C., Transient protein-protein interactions: structural, functional, and network properties. Structure 2010, 18, 1233–1243.
[15] Wetie, N., Armand, G., Sokolowska, I., Woods, A. G. et al., Investigation of stable and transient protein-protein interactions: Past, present, and future. Proteomics 2013, 13, 538–557.
[16] Darie, C. C., Deinhardt, K., Zhang, G., Cardasis, H. S. et al., Identifying transient protein-protein interactions in EphB2 signaling by blue native PAGE and mass spectrometry. Proteomics 2011, 11, 4514–4528.
[17] Ozbabacan, S. E. A., Engin, H. B., Gursoy, A., Keskin, O., Transient protein-protein interactions. Protein Eng. Des. Sel. 2011, 24, 635–648.
[18] Vaynberg, J., Qin, J., Weak protein-protein interactions as probed by NMR spectroscopy. Trends Biotechnol. 2006, 24, 22–27.
[19] Collins, M. O., Choudhary, J. S., Mapping multiprotein complexes by affinity purification and mass spectrometry. Curr. Opin. Biotechnol. 2008, 19, 324–330.
[20] Phizicky, E., Bastiaens, P. I., Zhu, H., Snyder, M., Fields, S., Protein analysis on a proteomic scale. Nature 2003, 422, 208–215.
[21] Nyfeler, B., Michnick, S. W., Hauri, H.-P., Capturing protein interactions in the secretory pathway of living cells. Proc. Natl. Acad. Sci. USA 2005, 102, 6350–6355.
[22] Ohad, N., Shichrur, K., Yalovsky, S., The analysis of protein-protein interactions in plants by bimolecular fluorescence complementation. Plant Physiol. 2007, 145, 1090–1099.
[23] Rich, R. L., Myszka, D. G., Higher-throughput, label-free, real-time molecular interaction analysis. Anal. Biochem. 2007, 361, 1–6.
[24] Hu, C.-D., Chinenov, Y., Kerppola, T. K., Visualization of interactions among bZIP and Rel family proteins in living cells using bimolecular fluorescence complementation. Mol. Cell 2002, 9, 789–798.
[25] Huang, H., Bader, J. S., Precision and recall estimates for two-hybrid screens. Bioinformatics 2009, 25, 372–378.
[26] Vinayagam, A., Stelzl, U., Wanker, E. E., Repeated two-hybrid screening detects transient protein-protein interactions. Theor. Chem. Acc. 2010, 125, 613–619.
[27] Venkatesan, K., Rual, J.-F., Vazquez, A., Stelzl, U. et al., An empirical framework for binary interactome mapping. Nat. Methods 2008, 6, 83–90.
[28] Guo, D., Hazbun, T. R., Xu, X.-J., Ng, S.-L. et al., A tethered catalysis, two-hybrid system to identify protein-protein interactions requiring post-translational modifications. Nat. Biotechnol. 2004, 22, 888–892.
[29] Li, X., Foley, E. A., Molloy, K. R., Li, Y. et al., Quantitative chemical proteomics approach to identify posttranslational modification-mediated protein-protein interactions. J. Am. Chem. Soc. 2012, 134, 1982–1985.
[30] Ideker, T., Sharan, R., Protein networks in disease. Genome Res. 2008, 18, 644–652.
[31] Lamond, A. I., Uhlen, M., Horning, S., Makarov, A. et al., Advancing cell biology through proteomics in space and time (PROSPECTS). Mol. Cell. Proteomics 2012, 11, 0112.017731.
[32] Przytycka, T. M., Singh, M., Slonim, D. K., Toward the dynamic interactome: it’s about time. Brief. Bioinform. 2010, 11, 15–29.
[33] Erler, J. T., Linding, R., Network medicine strikes a blow against breast cancer. Cell 2012, 149, 731–733.
[34] Cho, R. J., Campbell, M. J., Winzeler, E. A., Steinmetz, L. et al., A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell 1998, 2, 65–73.
[35] Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R. et al., Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 1998, 9, 3273–3297.
[36] Ito, T., Tashiro, K., Muta, S., Ozawa, R. et al., Toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc. Natl. Acad. Sci. USA 2000, 97, 1143–1147.
[37] Uetz, P., Giot, L., Cagney, G., Mansfield, T. A. et al., A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403, 623–627.
[38] Gavin, A.-C., Bösche, M., Krause, R., Grandi, P. et al., Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415, 141–147.
[39] Ho, Y., Gruhler, A., Heilbut, A., Bader, G. D. et al., Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 2002, 415, 180–183.
[40] Mewes, H.-W., Amid, C., Arnold, R., Frishman, D. et al., MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res. 2004, 32, D41–D44.
[41] de Lichtenberg, U., Jensen, L. J., Fausbøll, A., Jensen, T. S. et al., Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics 2005, 21, 1164–1171.
[42] Hegde, S. R., Manimaran, P., Mande, S. C., Dynamic changes in protein functional linkage networks revealed by integration with gene expression data. PLoS Comput. Biol. 2008, 4, e1000237.
[43] Sherlock, G., Hernandez-Boussard, T., Kasarskis, A., Binkley, G. et al., The Stanford microarray database. Nucleic Acids Res. 2001, 29, 152–155.
[44] Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M. et al., Noise in protein expression scales with natural protein abundance. Nat. Genet. 2006, 38, 636–643.
[45] Fraser, H. B., Hirsh, A. E., Giaever, G., Kumm, J., Eisen, M. B., Noise minimization in eukaryotic gene expression. PLoS Biol. 2004, 2, e137.
[46] Yellaboina, S., Goyal, K., Mande, S. C., Inferring genome-wide functional linkages in E. coli by combining improved genome context methods: comparison with high-throughput experimental data. Genome Res. 2007, 17, 527–535.
[47] Tang, X., Wang, J., Liu, B., Li, M. et al., A comparison of the functional modules identified from time course and static PPI network data. BMC Bioinform. 2011, 12, 339–353.
[48] Tu, B. P., Kudlicki, A., Rowicka, M., McKnight, S. L., Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science 2005, 310, 1152–1158.
[49] Greenbaum, D., Colangelo, C., Williams, K., Gerstein, M., Comparing protein abundance and mRNA expression levels on a genomic scale. Genome Biol. 2003, 4, 117–124.
[50] Wang, J., Peng, X., Li, M., Pan, Y., Construction and application of dynamic protein interaction network based on time course gene expression data. Proteomics 2013, 13, 301–312.
[51] Pramila, T., Wu, W., Miles, S., Noble, W. S., Breeden, L. L., The Forkhead transcription factor Hcm1 regulates chromosome segregation genes and fills the S-phase gap in the transcriptional circuitry of the cell cycle. Genes Dev. 2006, 20, 2266–2278.
[52] Courcelle, J., Khodursky, A., Peter, B., Brown, P. O., Hanawalt, P. C., Comparative gene expression profiles following UV exposure in wild-type and SOS-deficient Escherichia coli. Genetics 2001, 158, 41–64.
[53] Salwinski, L., Miller, C. S., Smith, A. J., Pettit, F. K. et al., The database of interacting proteins: 2004 update. Nucleic Acids Res. 2004, 32, D449–D451.
[54] Jansen, R., Greenbaum, D., Gerstein, M., Relating whole-genome expression data with protein-protein interactions. Genome Res. 2002, 12, 37–46.
[55] Komurov, K., White, M., Revealing static and dynamic modular architecture of the eukaryotic protein interaction network. Mol. Syst. Biol. 2007, 3, 110–120.
[56] Issel-Tarver, L., Christie, K. R., Dolinski, K., Andrada, R. et al., Saccharomyces Genome Database. Methods Enzymol. 2002, 350, 329–346.
[57] Xia, K., Xue, H., Dong, D., Zhu, S. et al., Identification of the proliferation/differentiation switch in the cellular network of multicellular organisms. PLoS Comput. Biol. 2006, 2, e145.
[58] Xue, H., Xian, B., Dong, D., Xia, K. et al., A modular network model of aging. Mol. Syst. Biol. 2007, 3, 147–157.
[59] Zhang, X., Yang, H., Gong, B., Jiang, C., Yang, L., Combined gene expression and protein interaction analysis of dynamic modularity in glioma prognosis. J. Neurooncol. 2012, 107, 281–288.
[60] Tusher, V. G., Tibshirani, R., Chu, G., Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 2001, 98, 5116–5121.
[61] Sun, S.-Y., Liu, Z.-P., Zeng, T., Wang, Y., Chen, L., Spatiotemporal analysis of type 2 diabetes mellitus based on differential expression networks. Sci. Rep. 2013, 3, 2268–2280.
[62] Breitkreutz, B.-J., Stark, C., Reguly, T., Boucher, L. et al., The BioGRID interaction database: 2008 update. Nucleic Acids Res. 2008, 36, D637–D640.
[63] Bader, J. S., Chaudhuri, A., Rothberg, J. M., Chant, J., Gaining confidence in high-throughput protein interaction networks. Nat. Biotechnol. 2003, 22, 78–85.
[64] Peri, S., Navarro, J. D., Amanchy, R., Kristiansen, T. Z. et al., Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res. 2003, 13, 2363–2371.
[65] Rual, J.-F., Venkatesan, K., Hao, T., Hirozane-Kishikawa, T. et al., Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437, 1173–1178.
[66] Stelzl, U., Worm, U., Lalowski, M., Haenig, C. et al., A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122, 957–968.
[67] Brown, K. R., Jurisica, I., Online predicted human interaction database. Bioinformatics 2005, 21, 2076–2082.
[68] Stelzl, U., Wanker, E. E., The value of high quality protein-protein interaction networks for systems biology. Curr. Opin. Chem. Biol. 2006, 10, 551–558.
[69] Calvano, S. E., Xiao, W., Richards, D. R., Felciano, R. M. et al., A network-based analysis of systemic inflammation in humans. Nature 2005, 437, 1032–1037.
[70] Luo, F., Liu, J., Li, J., Discovering conditional co-regulated protein complexes by integrating diverse data sources. BMC Syst. Biol. 2010, 4, S4.
[71] Han, J.-D. J., Bertin, N., Hao, T., Goldberg, D. S. et al., Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 2004, 430, 88–93.
[72] Huh, W.-K., Falvo, J. V., Gerke, L. C., Carroll, A. S. et al., Global analysis of protein localization in budding yeast. Nature 2003, 425, 686–691.
[73] Yu, H., Kim, P. M., Sprecher, E., Trifonov, V., Gerstein, M., The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput. Biol. 2007, 3, e59.
[74] Lu, H., Shi, B., Wu, G., Zhang, Y. et al., Integrated analysis of multiple data sources reveals modular structure of biological networks. Biochem. Biophys. Res. Commun. 2006, 345, 302–309.
[75] Erler, J. T., Linding, R., Network-based drugs and biomarkers. J. Pathol. 2010, 220, 290–296.
[76] Wang, X., Role of clinical bioinformatics in the development of network-based biomarkers. J. Clin. Bioinform. 2011, 1, 28–30.
[77] Taylor, I. W., Linding, R., Warde-Farley, D., Liu, Y. et al., Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nat. Biotechnol. 2009, 27, 199–204.
[78] Su, A. I., Wiltshire, T., Batalov, S., Lapp, H. et al., A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. USA 2004, 101, 6062–6067.
[79] Von Mering, C., Jensen, L. J., Kuhn, M., Chaffron, S. et al., STRING 7-recent developments in the integration and prediction of protein interactions. Nucleic Acids Res. 2007, 35, D358–D362.
[80] Von Mering, C., Krause, R., Snel, B., Cornell, M. et al., Comparative assessment of large-scale data sets of protein-protein interactions. Nature 2002, 417, 399–403.
[81] Zanzoni, A., Montecchi-Palazzi, L., Quondam, M., Ausiello, G. et al., MINT: a Molecular INTeraction database. FEBS Lett. 2002, 513, 135–140.
[82] Pawson, T., Linding, R., Network medicine. FEBS Lett. 2008, 582, 1266–1270.
[83] Lee, M. J., Ye, A. S., Gardino, A. K., Heijink, A. M. et al., Sequential application of anticancer drugs enhances cell death by rewiring apoptotic signaling networks. Cell 2012, 149, 780–794.
[84] Wang, Z., Gerstein, M., Snyder, M., RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 2009, 10, 57–63.
[85] Noonan, J. P., Time Series Expression Analyses Using RNA-seq: A Statistical Approach. BioMed Res. Int. 2013, 2013. doi: 10.1074/mcp.0112.017731.
[86] Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z. et al., Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 2011, 29, 644–652.
[87] Wang, K., Singh, D., Zeng, Z., Coleman, S. J. et al., MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res. 2010, 38, e178–e178.
[88] Trapnell, C., Williams, B. A., Pertea, G., Mortazavi, A. et al., Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 2010, 28, 511–515.
[89] Altelaar, A. M., Munoz, J., Heck, A. J., Next-generation proteomics: towards an integrative view of proteome dynamics. Nat. Rev. Genet. 2012, 14, 35–48.

[90] Boisvert, F.-M., Lam, Y. W., Lamont, D., Lamond, A. I., A quantitative proteomics analysis of subcellular proteome localization and changes induced by DNA damage. Mol. Cell. Proteomics 2010, 9, 457–470.

[91] Kumar, A., Agarwal, S., Heyman, J. A., Matson, S. et al., Subcellular localization of the yeast proteome. Genes Dev. 2002, 16, 707–719.

[92] de Godoy, L. M., Olsen, J. V., Cox, J., Nielsen, M. L. et al., Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast. Nature 2008, 455, 1251–1254.

[93] Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W., Gygi, S. P., Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. 2003, 100, 6940–6945.

[94] Zhang, G., Neubert, T. A., Phospho-Proteomics, Springer 2009, 79–92.

[95] Boja, E. S., Rodriguez, H., Mass spectrometry-based targeted quantitative proteomics: Achieving sensitive and reproducible detection of proteins. Proteomics 2012, 12, 1093–1110.

[96] Neilson, K. A., Ali, N. A., Muralidharan, S., Mirzaei, M. et al., Less label, more free: Approaches in label-free quantitative mass spectrometry. Proteomics 2011, 11, 535–553.

[97] Gstaiger, M., Aebersold, R., Applying mass spectrometry-based proteomics to genetics, genomics and network biology. Nat. Rev. Genet. 2009, 10, 617–627.

[98] Neubert, T. A., Tempst, P., Super-SILAC for tumors and tissues. Nat. Methods 2010, 7, 361–362.

[99] Ong, S.-E., Mann, M., Mass spectrometry-based proteomics turns quantitative. Nat. Chem. Biol. 2005, 1, 252–262.

[100] Filiou, M. D., Turck, C. W., Martins-de-Souza, D., Quantitative proteomics for investigating psychiatric disorders. Proteomics Clin. Appl. 2011, 5, 38–49.

[101] Wang, J., Huang, Q., Liu, Z.-P., Wang, Y. et al., NOA: a novel Network Ontology Analysis method. Nucleic Acids Res. 2011, 39, e87-e87.
