A new approach for analyzing genomic sequences by integrating assembly-based and assembly-free algorithms

Presenter Information

Vi Dam

Advisor Information

Hesham Ali

Location

UNO Criss Library, Room 225

Presentation Type

Oral Presentation

Start Date

March 2, 2018, 3:00 PM

End Date

March 2, 2018, 3:15 PM

Abstract

As modern technologies continue to advance rapidly, particularly in producing tremendous volumes of biological data, computational biology and bioinformatics tools have a significant impact on many aspects of biomedical research. With Next Generation Sequencing (NGS) instruments producing large numbers of genomic sequences in the form of short reads, many sequence assemblers have been developed to assemble such reads into larger fragments that are more suitable for advanced analysis. Many recent studies have shown that current assemblers have major limitations in producing accurate genomic sequences, which in turn affects the quality of sequence analysis. In particular, the lack of robustness and accuracy limits the ability of bioinformatics tools to extract meaningful information from the available raw data. In this study, we compare popular assemblers and analyze their performance when the assembled output is used to compare genomic sequences. We also study several assembly-free approaches to assess their ability to compare sequences directly in their short-read format. Further, we propose a hybrid approach that uses partially assembled reads as the input to assembly-free algorithms in order to enhance the overall quality of the output genomic sequences. Preliminary results show that, under the hybrid approach, assembly-free methods are able to overcome the limitations of current assemblers, especially when partial assembly is used as a preprocessing step.
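
The abstract does not name the assembly-free algorithms that were evaluated. As an illustration only, the sketch below uses one common assembly-free measure, the Jaccard distance between k-mer sets, which is an assumption and not necessarily the authors' method. The function names (`kmers`, `kmer_set`, `jaccard_distance`) and the default `k` are hypothetical. The point it tries to show is that the same comparison accepts either raw short reads or partially assembled contigs, which is how a hybrid pipeline could feed partial assemblies into an assembly-free step.

```python
from itertools import chain


def kmers(seq, k):
    """Yield each length-k substring of seq made up only of A, C, G, T."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            yield kmer


def kmer_set(sequences, k):
    """Collect the distinct k-mers found in a collection of sequences.

    The sequences may be short reads or partially assembled contigs;
    no assembly is required for this step.
    """
    return set(chain.from_iterable(kmers(s, k) for s in sequences))


def jaccard_distance(seqs_a, seqs_b, k=4):
    """Return 1 - |A & B| / |A | B| for the k-mer sets of two samples."""
    a = kmer_set(seqs_a, k)
    b = kmer_set(seqs_b, k)
    union = a | b
    if not union:
        # Two samples with no usable k-mers are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy data: three overlapping reads and the contig they tile.
reads = ["ACGTT", "GTTGC", "TGCA"]
contig = ["ACGTTGCA"]
distance = jaccard_distance(reads, contig, k=3)
```

This sketch ignores strand (it does not merge a k-mer with its reverse complement) and k-mer abundance, both of which a practical tool would likely need to handle.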


