BioPerl: The fastest draw in the west?

Advisor Information

Dhundy Bastola

Location

Milo Bail Student Center Dodge Room B

Presentation Type

Oral Presentation

Start Date

March 8, 2013 9:00 AM

End Date

March 8, 2013 9:15 AM

Abstract

BioPerl is an open source programming language designed to aid in biological research. Biological data sets have grown exponentially over the past several years, and BioPerl’s tools allow scientists to manage and analyze these data sets effectively. Previous work here at UNO[1] has documented that BioPerl remains the most popular bioinformatics language. To expand on this, we set out to evaluate how responsive BioPerl is to the needs of its community. We identified major revisions in the source code and then mined the official mailing list, which serves as the principal interface for communication about BioPerl. If the discussion timeline for each revision could be isolated, a metric could be devised to compare the responsiveness of two languages. Our initial results showed that BioPerl’s mailing lists are too broad in scope to isolate these conversation threads effectively. We then tested our mining model on the next two most popular languages, BioJava and BioPython. The major difference between these two languages and BioPerl is that each has a separate developer mailing list. Using the developer mailing lists, we were able to identify the lines of communication leading to large source code changes. We therefore conclude that our method could be effective in modeling the responsiveness of developers, given a separate developer channel. A developer channel may be an excellent addition for BioPerl, serving as a guide to previous changes and highlighting what needs attention.

This document is currently not available here.
