Document Type

Conference Proceeding

Publication Date



Correlation networks have been used in biology to analyze and model high-throughput biological data, such as gene expression from microarray or RNA-seq assays. In biological network modeling, structures mined from these networks typically represent biological functions; for example, a cluster of proteins in an interactome can represent a protein complex. In correlation networks built from high-throughput gene expression data, it has often been speculated, or even assumed, that clusters represent sets of coregulated genes. This research aims to validate that assumption using network systems biology and data mining: correlation network clusters are identified with multiple clustering approaches, and the regulatory elements in these clusters are cross-validated with motif-finding software. The results show that the majority (81-100%) of genes in any given cluster share at least one predicted transcription factor binding site. With this in mind, new regulatory relationships can be proposed from known transcription factors and their binding sites by integrating regulatory information with the network model itself.


2013 IEEE 13th International Conference on Data Mining Workshops

© 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.