Date of Award

11-2013

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Dr. Sanjukta Bhowmick

Second Advisor

Dr. Robin A. Gandhi

Third Advisor

Dr. Qiuming Zhu

Abstract

A network is said to exhibit community structure if its nodes can be readily partitioned into groups such that each group is densely connected internally but sparsely connected to other groups. Most real-world networks exhibit community structure.

A popular technique for detecting communities is based on computing the modularity of the network. Modularity measures how much more densely the vertices in a group are connected to one another than would be expected if the edges were placed at random. We propose a parallel algorithm for modularity-based community detection in large networks.
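
As an illustration of the modularity measure described above, the following minimal sketch computes the standard Newman-Girvan modularity of a given partition of an undirected, unweighted graph. It is not the parallel algorithm proposed in the thesis; the function name `modularity` and the input layout (an edge list plus a vertex-to-community dictionary) are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition (illustrative sketch).

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community_of: dict mapping every vertex to its community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of the degrees of the community's vertices
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # For each community: observed fraction of internal edges minus the
    # fraction expected if edges were placed at random with the same degrees.
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {v: "A" for v in range(6)}
print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the two triangles into separate communities scores higher than placing all vertices in one community, which scores zero.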

However, all modularity-based algorithms for detecting community structure are affected by the order in which the vertices of the network are processed. This order dependence makes it difficult to detect communities reliably in real-world graphs. We introduce the concept of a stable community, that is, a group of vertices that are always assigned to the same community regardless of how the vertex order of the input is perturbed. We develop a preprocessing step that identifies stable communities and show empirically that the number of stable communities in a network affects the range of modularity values obtained. In particular, stable communities can also help determine strong communities in the network.
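
The definition of a stable community can be illustrated with a small sketch. Assuming the partitions produced by several runs on differently ordered inputs are already available, the code below groups the vertices that end up together in every run. This only illustrates the definition; it is not the preprocessing step developed in the thesis, and the function name `stable_groups` and the input layout are hypothetical.

```python
from collections import defaultdict


def stable_groups(partitions):
    """Return groups of vertices that share a community in every partition.

    partitions: non-empty list of dicts (one per run), each mapping
    vertex -> community label over the same vertex set. Labels need not
    match across runs; only co-membership within a run matters.
    """
    by_signature = defaultdict(list)
    for v in partitions[0]:
        # Two vertices are always together exactly when their label
        # sequences across all runs are identical.
        signature = tuple(p[v] for p in partitions)
        by_signature[signature].append(v)
    # Keep only groups with at least two vertices.
    return sorted(sorted(g) for g in by_signature.values() if len(g) > 1)


# Hypothetical results of two runs with different vertex orderings.
run_1 = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
run_2 = {0: "x", 1: "x", 2: "y", 3: "y", 4: "y"}
print(stable_groups([run_1, run_2]))
```

In this example vertex 2 changes sides between the runs, so it belongs to no stable group, while vertices 0 and 1 and vertices 3 and 4 stay together.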

Modularity is a widely accepted metric for measuring the quality of a partition identified by community detection algorithms. However, a growing number of researchers have started to explore the limitations of modularity maximization for detecting communities, such as the resolution limit, the degeneracy of solutions, and the asymptotic growth of the modularity value. To address these issues, we propose a novel vertex-level metric called permanence. We show that permanence, compared with other standard metrics such as modularity, conductance, and cut-ratio, is a better community scoring function for evaluating community structures detected in both synthetic and real-world networks. We demonstrate that maximizing permanence results in communities that match the ground-truth structure of networks more accurately than modularity-based and other approaches. Finally, we demonstrate how maximizing permanence overcomes limitations associated with modularity maximization.

Comments

A Thesis Presented to the Department of Computer Science and the Faculty of the Graduate College, University of Nebraska, in Partial Fulfillment of the Requirements for the Degree Master of Science, University of Nebraska at Omaha. Copyright 2013 Sriram Srinivasan.
