A novel multilayer graph model for building smart assemblers and efficiently extracting information from next generation sequencing reads
Advisor Information
Hesham Ali
Location
UNO Criss Library, Room 231
Presentation Type
Oral Presentation
Start Date
7-3-2014 3:00 PM
End Date
7-3-2014 3:15 PM
Abstract
The rapid advancement of Next Generation Sequencing (NGS) technologies has inspired the development of numerous read assembly and analysis tools, many of which rely on an overlap graph as their foundational model. However, a single graph model can capture only one view of the read overlap relationships in an NGS dataset, and most current graph-based assembly and analysis tools model only localized overlap relationships between individual reads. This fine-grained approach may miss global relationships between subsets of reads, such as repeats or regions shared between multiple genomes in metagenomics applications. To address these issues, we have developed a graph-theoretic modeling approach that captures multiple snapshots of local and global read relationships across a spectrum of granularity. Unlike previous methods that rely on a single graph model, the proposed approach constructs a series of graphs that model the reads at levels ranging from localized relationships between individual reads to global relationships between subsets of reads within the dataset. Using the multilayer model, we developed data analysis algorithms that integrate various graphs in the spectrum to capture different types of relationships among the input reads and efficiently extract useful information that can be used for recognition and classification purposes. The implementation of this approach in High Performance Computing (HPC) environments will provide a robust graph-modeling platform for domain-specific, flexible assembly tactics, resulting in improved assembly and analytics tools that scale to the increasing demands of biomedical researchers.
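The idea of a series of graphs ranging from read-level overlaps to relationships between groups of reads can be illustrated with a small sketch. The abstract does not specify how the layers are constructed, so everything below is an assumption made for illustration: the function names (`overlap_len`, `build_overlap_graph`, `coarsen`, `build_layers`), the use of exact suffix-prefix overlaps, and the rule of merging reads whose overlap meets a threshold are all hypothetical and should not be read as the authors' method.

```python
from itertools import permutations


def overlap_len(a, b, min_len):
    """Longest suffix of `a` equal to a prefix of `b`; 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Finest layer (assumed): one node per read, edge i -> j weighted by overlap."""
    edges = {}
    for i, j in permutations(range(len(reads)), 2):
        k = overlap_len(reads[i], reads[j], min_len)
        if k:
            edges[(i, j)] = k
    return edges


def coarsen(nodes, edges, threshold):
    """Merge nodes joined by an edge of weight >= threshold (hypothetical rule).

    Returns (groups, coarse_edges): groups maps a representative node to its
    members; coarse_edges keeps the strongest overlap between distinct groups.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), w in edges.items():
        if w >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)

    coarse_edges = {}
    for (u, v), w in edges.items():
        ru, rv = find(u), find(v)
        if ru != rv:
            coarse_edges[(ru, rv)] = max(coarse_edges.get((ru, rv), 0), w)
    return groups, coarse_edges


def build_layers(reads, thresholds, min_len=3):
    """Build one coarsened graph per threshold from the read-level overlap graph."""
    nodes = list(range(len(reads)))
    edges = build_overlap_graph(reads, min_len)
    return [(t,) + coarsen(nodes, edges, t) for t in thresholds]


# Toy reads, invented for this example.
reads = ["ACGTAC", "GTACGG", "CGGTTA", "TTTTTT"]
for threshold, groups, coarse_edges in build_layers(reads, thresholds=[4, 3]):
    print(threshold, sorted(groups.values()), coarse_edges)
```

With these toy reads, the threshold of 4 merges only the first two reads, while the threshold of 3 merges the first three and leaves the last read on its own. Lowering the threshold therefore moves from a local view to a more global one, which is the kind of granularity spectrum the abstract describes. A real dataset would need an indexed overlap computation rather than the all-pairs comparison used here.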