Procedia Computer Science
Efficient analysis of complex networks is often challenging because of their large size and the noise inherent in the system. One popular way to overcome this problem is graph sampling, that is, extracting a representative subgraph from the larger network. The accuracy of the sample is validated by comparing the combinatorial properties of the subgraph with those of the original network. However, there has been little study of comparing networks based on the applications that they represent. Furthermore, sampling methods are generally applied agnostically, without mapping them to the requirements of the underlying analysis. In this paper, we introduce a parallel graph sampling algorithm focusing on gene correlation networks. Densely connected subgraphs indicate important functional units of gene products. Our sampling algorithm therefore emphasizes maintaining the highly connected regions of the network, through parallel sampling based on extracting the maximal chordal subgraph of the network. We validate our methods by comparing both the combinatorial properties and the functional units of the subgraphs with those of the larger networks. Our results show that even with significant reduction of the network (on average 20% to 40%), we obtain reliable samplings, and many of the relevant combinatorial and functional properties are retained in the subgraphs.
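To make the idea of chordal-subgraph sampling concrete, the following is a minimal serial sketch in Python of one common greedy construction. It is not the parallel algorithm of the paper, whose details are not given in this abstract. The function name `chordal_subgraph` and the adjacency-dictionary input format are assumptions made for illustration. Each vertex is attached only to a set of already-visited neighbours that form a clique in the sample, which guarantees that the result is chordal; whether the result is also maximal is not checked here.

```python
def chordal_subgraph(adj):
    """Greedy chordal-subgraph sketch (hypothetical helper, serial only).

    adj: dict mapping each vertex to the set of its neighbours in an
         undirected graph. Vertices must be sortable (used for tie-breaking).
    Returns a set of frozenset({u, v}) edges forming a chordal subgraph.
    """
    # clique[u] holds visited neighbours of u that are pairwise adjacent
    # in the subgraph built so far.
    clique = {v: set() for v in adj}
    unvisited = set(adj)
    edges = set()

    while unvisited:
        # Pick the unvisited vertex with the largest candidate clique;
        # ties go to the smallest vertex label.
        v = max(sorted(unvisited), key=lambda u: len(clique[u]))
        unvisited.remove(v)

        # Attach v to every member of its candidate clique.
        for c in clique[v]:
            edges.add(frozenset((v, c)))

        # v may join a neighbour's candidate clique only if v is now
        # adjacent (in the subgraph) to everything already in it.
        for u in adj[v]:
            if u in unvisited and clique[u] <= clique[v]:
                clique[u] = clique[u] | {v}

    return edges


# A 4-cycle is not chordal, so one edge must be dropped;
# a complete graph is already chordal, so all edges are kept.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
cycle_edges = chordal_subgraph(cycle)
k4_edges = chordal_subgraph(k4)
```

On these toy inputs the dense graph is preserved in full while the sparse cycle is thinned, which is the behaviour the abstract relies on when it says that highly connected regions are maintained.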
Cooper, Kathryn Dempsey; Duraisamy, Kanimathi; Ali, Hesham; and Bhowmick, Sanjukta, "A Parallel Graph Sampling Algorithm for Analyzing Gene Correlation Networks" (2011). Interdisciplinary Informatics Faculty Publications. 25.