Date of Award

10-2011

Document Type

Thesis

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Dr. Parvathi Chundi

Second Advisor

Dr. Zhenyuan Wang

Third Advisor

Dr. Mansour Zand

Abstract

Software vulnerabilities allow an attacker to reduce a system's Confidentiality, Availability, and Integrity by exposing information, executing malicious code, and undermining system functionalities that contribute to the overall system purpose and need. With new vulnerabilities discovered every day in a variety of applications and user environments, a systematic study of their characteristics is a subject of immediate need for the following reasons:

  • The high rate at which information about past and new vulnerabilities accumulates makes it difficult to absorb and comprehend.
  • Rather than learning from past mistakes, similar types of vulnerabilities are observed repeatedly.
  • As the scale and complexity of current software grow, developers will require better mental models to sense where vulnerabilities are likely to occur.

While the software development community has put significant effort into capturing the artifacts related to a discovered vulnerability in organized repositories, much of this information is not amenable to meaningful analysis and requires deep manual inspection. In the software assurance community, a body of knowledge that provides an enumeration of common weaknesses has been developed, but it is complicated and not readily usable for the study of vulnerabilities in specific projects and user environments. In addition, the vulnerabilities discovered in different projects are collected in various databases with general metadata such as dates, persons, and natural language descriptions. Without links to other relevant knowledge, these records are hard to use for the purpose of understanding vulnerabilities.

This research combines the information sources from these communities in a way that facilitates the study of vulnerabilities recorded in large software repositories. We introduce the notion of the Semantic Template to integrate the scattered information relevant to understanding and discovering vulnerabilities. We evaluate semantic templates by applying them to analyze and annotate vulnerabilities recorded in software repositories from the Apache Web Server project. We refer to software repositories in a general sense that includes source code, version control data, bug reports, developer mailing lists, and project development websites. We derive semantic templates from community standards such as the Common Weakness Enumeration (CWE) and Common Vulnerabilities and Exposures (CVE). We rely on standards in order to facilitate the adoption, sharing, and interoperability of semantic templates.

This research contributes a novel theory and corresponding mechanisms for the study of vulnerabilities in large software projects. To support these claims, we discuss our experiences and present our findings from the Apache Web Server project.

Comments

A DISSERTATION presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy. Copyright 2011 Yan Wu.
