Designing a modular and customizable snakemake pipeline specialized for 16S data analysis

Presenter Information

Chris Schinzel

Presenter Type

UNO Undergraduate Student

Major/Field of Study

Biology

Other

Biology

Advisor Information

Jonathan Clayton

Location

CEC RM #201/205/209

Presentation Type

Poster

Poster Size

48"x36"

Start Date

22-3-2024 1:00 PM

End Date

22-3-2024 2:15 PM

Abstract

Designing a modular and customizable snakemake pipeline specialized for 16S data analysis

Chris H. Schinzel (a), Jordan B. Hernandez (a, b, c), Katherine M. Cooper (d), Paul A. Ayayee (a), Jonathan B. Clayton (a, b, c, e, f, g)

(a) Department of Biology, University of Nebraska at Omaha, Omaha, NE, USA

(b) Nebraska Food for Health Center, University of Nebraska-Lincoln, Lincoln, NE, USA

(c) Callitrichid Research Center, University of Nebraska at Omaha, Omaha, NE, USA

(d) School of Interdisciplinary Informatics, College of Information Science and Technology, University of Nebraska at Omaha, Omaha, NE, USA

(e) Department of Food Science and Technology, University of Nebraska-Lincoln, Lincoln, NE, USA

(f) Department of Pathology and Microbiology, University of Nebraska Medical Center, Omaha, NE, USA

(g) Primate Microbiome Project, University of Nebraska-Lincoln, Lincoln, NE, USA

16S rRNA gene sequencing has become the de facto method for studying gut microbial communities. As a result, a number of software packages, such as QIIME2, DADA2, and mothur, have each put their own spin on the analysis pipeline, and each offers features the others lack. This variety has made 16S analysis a complicated subject to approach, particularly for those who are new to the field. In addition, more experienced microbial researchers may have trouble customizing their data analyses. To address this, our lab group has developed an automated data pipeline called ampliConda, with the goal of giving the end user complete control over the 16S analysis process. The pipeline is built with Snakemake, a workflow manager aimed at automating data analysis with an emphasis on reproducibility and modularization. Snakemake also saves computing power and streamlines the analysis: it traces the end goal of the pipeline back to the beginning and skips unnecessary steps. Currently, both the QIIME2 and DADA2 (R-based) packages have been integrated into this pipeline, which allows the analysis framework (reference database, read trimming, etc.) to be customized. Our next step is to integrate further 16S packages into one space where the end user can customize and automate a pipeline entirely of their choosing. The long-term goal of ampliConda is to become a hub where scientists of all experience levels can perform customized analyses using 16S sequencing data.
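The idea of tracing the end goal back to the beginning and skipping unnecessary steps can be illustrated with a toy planner. This is a minimal sketch in plain Python, not Snakemake's implementation and not ampliConda's code: the rule table, the file names, and the `plan` function are all hypothetical, and "up to date" is simplified to "the file already exists" (a real workflow manager also compares timestamps).

```python
# Hypothetical rule table: output file -> (required input files, rule name).
# The file and rule names are illustrative only.
RULES = {
    "trimmed.fastq": (["raw.fastq"], "trim_reads"),
    "asv_table.tsv": (["trimmed.fastq"], "denoise"),
    "taxonomy.tsv": (["asv_table.tsv", "reference.db"], "assign_taxonomy"),
}


def plan(target, rules, existing, _planned=None):
    """Return the rule names needed to produce `target`, in run order.

    Works backward from the requested target: a file that already exists
    needs no work, otherwise its inputs are resolved first and then the
    rule that produces it is scheduled.
    """
    planned = [] if _planned is None else _planned
    if target in existing:
        return planned  # already present, so this branch is skipped
    if target not in rules:
        raise FileNotFoundError(f"no rule produces {target!r} and it does not exist")
    inputs, name = rules[target]
    for dep in inputs:
        plan(dep, rules, existing, planned)
    if name not in planned:
        planned.append(name)
    return planned


# Starting from raw data only, every step is needed.
print(plan("taxonomy.tsv", RULES, {"raw.fastq", "reference.db"}))
# ['trim_reads', 'denoise', 'assign_taxonomy']

# If the ASV table already exists, the upstream steps are skipped.
print(plan("taxonomy.tsv", RULES, {"raw.fastq", "reference.db", "asv_table.tsv"}))
# ['assign_taxonomy']
```

Swapping an entry in the rule table (for example, a different reference database file) changes only the steps downstream of it, which is the kind of modularity the abstract describes.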

