The content in this collection is available only to Washington University in St. Louis users per the request of the Office of Undergraduate Research.

Document Type

Feature Article

Publication Date

Spring 9-1-2014

Publication Title

Washington University Undergraduate Research Digest: WUURD 9(2)


Faculty Mentor: Hesham Ali, University of Nebraska, Omaha

Network theory has been used to model biological data as well as social networks, transportation logistics, business transcripts, and many other types of data sets. Identifying the important features of these networks is becoming increasingly significant across many applications as the need for big data analysis techniques grows. When analyzing a network of protein-protein interactions (PPIs), identifying nodes of significant importance can direct the user toward biologically relevant network features. In this work, we propose that a node of structural importance in a network model can correspond to a biologically vital or significant property. This relationship between topological and biological importance can be seen in and between structurally defined nodes, such as hub nodes and driver nodes, both within a network and within clusters. We propose data mining approaches for identifying and examining relationships between hub and driver nodes within human, yeast, rat, and mouse PPI networks. Relationships with other types of significant nodes, with direct neighbors, and with the rest of the network were analyzed to determine whether the model can be characterized biologically by its structural makeup. We performed numerous structural tests with a data-driven approach, looking for properties that were potentially significant at the network level and then comparing those properties to biological significance. Our results showed that identifying and cross-referencing different types of topologically significant nodes can reveal properties such as transcription factor enrichment, lethality, clustering, and Gene Ontology (GO) enrichment. Mining the biological networks, we discovered a key relationship between network properties and how sparse or dense a network is, a property we described as “sparseness”. Overall, structurally important nodes were found to have significant biological relevance.
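As a rough illustration of two of the structural quantities mentioned in the abstract, the sketch below ranks nodes by degree to pick out candidate hub nodes and computes edge density as one possible measure of how sparse or dense a network is. The abstract does not state how hubs or "sparseness" were actually defined, so the top-fraction cutoff, the density formula, the function names, and the toy edge list are all assumptions made for illustration. Driver-node identification is not covered here.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adj[u].add(v)
        adj[v].add(u)
    return adj


def density(adj):
    """Edge density of an undirected simple graph: 2m / (n * (n - 1))."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(neighbors) for neighbors in adj.values()) // 2
    return 2 * m / (n * (n - 1))


def hub_nodes(adj, top_fraction=0.1):
    """Return the highest-degree nodes (assumed hub definition: top fraction by degree)."""
    ranked = sorted(adj, key=lambda node: (-len(adj[node]), node))
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[:k]


# Hypothetical toy interaction list; the labels are placeholders, not real proteins.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C"), ("E", "F")]
adj = build_adjacency(edges)
hubs = hub_nodes(adj, top_fraction=0.2)
net_density = density(adj)
print("hubs:", hubs)
print("density:", net_density)
```

A fixed top fraction is only one way to choose hubs; an absolute degree threshold or a statistical cutoff would be equally plausible readings of the abstract.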

From the Washington University Undergraduate Research Digest: WUURD, Volume 9, Issue 2, Spring 2014. Published by the Office of Undergraduate Research; Joy Zalis Kiefer, Director of Undergraduate Research and Assistant Dean in the College of Arts & Sciences; Kristin Sobotka, Editor.

Another version of this work has been published in ICDM Workshops 2013: 343-348.


Copyright: All work is copyrighted by the authors and permission to use this work must be granted. The Office of Undergraduate Research can assist in contacting an author.
