Date of Award


Document Type


Degree Name

Master of Science (MS)


Computer Science

First Advisor

Harvey Siy, Ph.D.

Second Advisor

Mahadevan Subramaniam, Ph.D.

Third Advisor

Myoungkyu Song, Ph.D.

Fourth Advisor

William Mahoney, Ph.D.


Software evolves constantly to adapt to changing user needs. As it evolves, it becomes progressively harder to understand due to accumulation of code changes, increasing code size, and the introduction of complex code dependencies. As a result, it becomes harder to maintain, exposing the software to potential bugs and degradation of code quality. High maintenance costs and diminished opportunities for software reusability and portability lead to reduced return on investment, increasing the likelihood of the software product being discarded or replaced. Nevertheless, we believe that there is value in legacy software due to the amount of intellectual efforts that have been invested in it. To extend its value, we utilize the common practice of identifying the pieces of code relevant to a given concern. Identifying relevant code is a manual process and relies on domain and code expertise. This makes it difficult to scale to large and complex code. In this thesis, we propose several automated approaches for capturing the essential code that represents a concern of interest. We utilize dynamic program analysis of execution traces to identify a relevant code subset. Information retrieval techniques are then utilized to improve the accuracy of the capture, refine the process, and verify the results.


A Thesis Presented to the Department of Computer Science and the Faculty of the Graduate College University of Nebraska In Partial Fulfillment of the Requirements for the Degree Masters of Science in Computer Science University of Nebraska at Omaha. Copyright 2017 Chuntao Fu.