

NEED HELP?

Table of Contents

           Quick start on fly-DPI

           1. General search

           2. Protein network map

           3. Ping-pong search

      Data source and Statistical model implemented

Statistical Model Implemented

Data Source

The D. melanogaster protein-protein interaction data used in Fly-DPI were obtained from three recently published high-throughput two-hybrid experiments (Giot et al., 2003; Stanyon et al., 2004; Formstecher et al., 2005) and from a collection of other experiments in the FlyBase Gene Annotation reports. The starting datasets were the total set, with 23,802 non-redundant interactions derived from 15,444 proteins, and the high-confidence set, with 2,776 non-redundant interactions derived from 1,850 proteins.

Small-world protein interaction networks give us more confidence than large-scale networks. Based on the confidence scoring of the two-hybrid experiments, we have two pools of network partners. The first pool comprises all three datasets mentioned above. The high-confidence pool was gathered from the high-confidence score maps of Giot et al. (2,101 unique interactions involving 1,339 proteins) and Formstecher et al. (675 unique interactions involving 511 proteins).
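As an illustration of how several interaction lists can be pooled into one non-redundant set, here is a minimal Python sketch. The protein identifiers and the two toy lists are hypothetical, and treating A-B and B-A as the same interaction is an assumption of this sketch, not a documented Fly-DPI rule.

```python
def normalize(pair):
    """Order a protein pair so that (A, B) and (B, A) compare equal."""
    a, b = pair
    return (a, b) if a <= b else (b, a)


def merge_non_redundant(*datasets):
    """Union of several interaction lists with duplicate pairs removed."""
    merged = set()
    for dataset in datasets:
        merged.update(normalize(pair) for pair in dataset)
    return merged


def proteins_in(interactions):
    """Set of all proteins that take part in at least one interaction."""
    return {protein for pair in interactions for protein in pair}


# Hypothetical toy data standing in for two high-confidence maps.
map_one = [("CG0001", "CG0002"), ("CG0002", "CG0001"), ("CG0003", "CG0004")]
map_two = [("CG0004", "CG0003"), ("CG0005", "CG0001")]

pool = merge_non_redundant(map_one, map_two)
pool_proteins = proteins_in(pool)
```

Here `pool` holds each unordered pair once, and `pool_proteins` gives the number of proteins behind those interactions, which is how counts such as "unique interactions involving N proteins" can be derived.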

Annotation data were constructed from the following sources:

UniProt Knowledgebase Release 6.4 (08-Nov-2005)

FlyBase Release 4.2 KEGG (10-JUL-2005)

GO Revision: 1.303 (19-Dec-2005)

Integr8 Release 21 (11-JUL-05)

Prediction of Protein Interaction

To assess the statistical model of the protein network, we extract the probability of co-occurrence of two domains in an interaction from the protein interaction dataset. In the Hybrid Model, the association measure supplies the initial values for the EM iterations. Using association measures as informative priors also addresses the local MLE problem, which is a major issue when non-informative priors are chosen for the Expectation-Maximization (EM) algorithm. Since the observed frequency of a domain pair is counted by the association measure, the Maximum Likelihood Estimation (MLE) method with the EM algorithm is used to obtain the parameter estimates. The maximum likelihood method provides estimated interaction probabilities for a protein network, given the estimate for each pair of interacting domains. By selecting an appropriate threshold, each probability is dichotomized to serve as a predictor of protein interaction.

In short, our Hybrid Model, a chimera of MLE and Association, outperforms Association in prediction power and MLE in computational cost.
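The procedure described above can be sketched in Python under stated assumptions. The association measure is taken here to be the fraction of protein pairs containing a domain pair that are observed to interact, and the EM step assumes a noise-free model in which two proteins interact if at least one of their domain pairs interacts. These definitions, the toy proteins and domains, and the threshold of 0.3 are all assumptions made for illustration, not the actual Fly-DPI implementation.

```python
from itertools import combinations, product


def domain_pairs(doms_a, doms_b):
    """All unordered domain pairs formed between two proteins."""
    return {tuple(sorted(p)) for p in product(doms_a, doms_b)}


def association(protein_pairs, domains, interacting):
    """Assumed association measure: interacting / total protein pairs per domain pair."""
    seen, hits = {}, {}
    for a, b in protein_pairs:
        for dp in domain_pairs(domains[a], domains[b]):
            seen[dp] = seen.get(dp, 0) + 1
            if frozenset((a, b)) in interacting:
                hits[dp] = hits.get(dp, 0) + 1
    return {dp: hits.get(dp, 0) / n for dp, n in seen.items()}


def interaction_prob(a, b, domains, lam):
    """P(a and b interact) = 1 - product over domain pairs of (1 - lambda)."""
    q = 1.0
    for dp in domain_pairs(domains[a], domains[b]):
        q *= 1.0 - lam[dp]
    return 1.0 - q


def em(protein_pairs, domains, interacting, lam, n_iter=50):
    """EM iterations started from the given initial values (here: association)."""
    lam = dict(lam)
    for _ in range(n_iter):
        expected = {dp: 0.0 for dp in lam}
        seen = {dp: 0 for dp in lam}
        for a, b in protein_pairs:
            p = interaction_prob(a, b, domains, lam)
            for dp in domain_pairs(domains[a], domains[b]):
                seen[dp] += 1
                # E-step: a domain pair can only be "on" in an observed interaction.
                if frozenset((a, b)) in interacting and p > 0:
                    expected[dp] += lam[dp] / p
        # M-step: average expected indicator over all protein pairs with this domain pair.
        lam = {dp: expected[dp] / seen[dp] for dp in lam}
    return lam


# Hypothetical toy data: protein -> domains, plus observed interactions.
domains = {"P1": ["A"], "P2": ["B"], "P3": ["A", "C"], "P4": ["B"]}
interacting = {frozenset(("P1", "P2")), frozenset(("P3", "P4"))}
protein_pairs = list(combinations(sorted(domains), 2))

lam0 = association(protein_pairs, domains, interacting)   # initial values
lam_em = em(protein_pairs, domains, interacting, lam0)    # EM estimates

threshold = 0.3  # illustrative cut-off, not a value from the source
predicted = {
    (a, b): interaction_prob(a, b, domains, lam_em) >= threshold
    for a, b in protein_pairs
}
```

Starting EM from `lam0` rather than from uniform values mirrors the idea of using association measures as informative starting points. Domain pairs never seen in an interaction start at zero and stay at zero in this sketch.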

Result and Evaluation

From the limited experimental data available for the D. melanogaster protein interactome (23,802 and 2,776 interactions for the total and high-confidence sets, respectively), our prediction system takes an integrated approach, the Hybrid Model, which uses the association measures as priors to build an integrated graph of the protein network. The Hybrid Model successfully computed the MLE on the Drosophila melanogaster interaction data and gives a better prediction (0.45 and 0.25 for the total and high-confidence sets, respectively) than using Association alone (0.2).

In developing our prediction system, we tested both single-domain and multiple-domain methods with the Association and Hybrid Models.
