A new data platform for the biological sciences promises to change the face of scientific analysis and publication.
Six years in the making, the U.S. Department of Energy’s (DOE) Knowledgebase (KBase) program offers the most up-to-date system for recording experimental methods, collaborating with colleagues and performing every step of biological analysis on one free, open-source platform.
Though KBase has been available to the public since February 2015, Nature Biotechnology published the program’s official paper in July. The paper captures the central pillars of the KBase system in a way that makes its complex workings highly accessible.
With the initial goal of developing a type of public research “Dropbox,” the KBase team, composed of researchers from DOE’s Argonne National Laboratory, Lawrence Berkeley National Laboratory, Oak Ridge National Laboratory and Brookhaven National Laboratory, as well as Cold Spring Harbor Laboratory, has produced a multifaceted system that will change how science is recorded and shared.
“We need a computational framework that shares data in the same way that people use the data,” said Christopher Henry, co-principal investigator on the KBase project and computational biologist in Argonne’s Data Science and Learning division.
Putting the “public” back in publication
Among the KBase software features that the publication highlights, transparency and reproducibility stand out. KBase tracks and records each step of an experiment to save time. Instead of depending upon one’s memory to record a consistently reproducible experiment after the fact, KBase acts as a detailed note-taker in real time.
“In science, a lot of work goes into trying to reproduce what someone somewhere has already done,” said Daniel Murphy-Olson, an IT leader in Argonne’s Computing, Environment and Life Sciences directorate. “One of KBase’s strengths is that it is very easy to see what others have done and very easy to swap out elements related to what you want to do.”
When a “narrative,” or experiment report, goes public, scientists can apply its methodological outline (workflow) to their own project. Combining results makes significant outcomes more rapidly attainable, as it offers researchers a resource for advice if their own work stalls. KBase also gathers all scientific data related to a particular study in one place (facilitated by the Globus data management and file transfer system), making it easy for future researchers to track this data down and build upon existing work. In a way, KBase acts as a free, detailed scientific “cookbook.”
Various analytical methods under one roof
With its simple model of single-purpose applications, KBase shares some similarities with the multinational technology company, Apple. Just as Apple’s technology transforms a phone into a calculator, a navigator and a bank with the touch of a button, KBase provides applications and a free “app store” for researchers to choose the right services.
Currently, KBase offers at least 160 free apps, each specializing in a different area of biological data analysis, from metagenome assembly and annotation to RNA-sequence processing.
“KBase is the only current program that allows you to perform all of these tasks and more within one platform,” said José Pedro Lopes Faria, postdoctoral appointee in Argonne’s Mathematics and Computer Science division.
As of September 2017, KBase had 3,000 members and 5,000 experimental narratives. These narratives house all of the research from one experiment. Within each narrative, computational analysis gets divided into chapters, or subsections, with a table of contents, making it easy to find specific information. KBase modernizes the classic scientific lab notebook, giving it a more organized structure and ensuring that all progress is managed and protected.
KBase is quickly gaining users from fields beyond science. Developers continue to create apps within the KBase framework, and these have drawn educators, for example, who find the tool effective for ensuring academic integrity. Startups, too, have taken advantage of the free analytics tools.
A teachable layout
Though KBase is complex, it is not difficult to learn. Once users have a firm grasp of the program’s underlying platform, they can direct the system to do the rest of the computational work for them.
“Once you get the ropes around KBase, you won’t have to learn all of the details of each app,” said Faria.
KBase product teams travel throughout the country teaching students, scientists and professors from a wide range of biology-related fields to apply KBase to their work most effectively. During a two-day training course, individuals can learn how KBase enhances publication usability and simplifies access to computational tools, and how they can create apps on their own.
A spirit of collaboration
KBase also allows scientists to make their work public and share their ideas. This may strengthen the lines of communication between scientists and could greatly speed up an individual’s experimental work by making the methodological wisdom of others readily accessible.
“It provides context. That alone is very valuable for starting these conversations,” explained Henry, describing the value that KBase brings in uniting scientific fields.
By connecting researchers who normally would be separated by specialty and given little reason to interact, KBase enables scientific conversation to extend beyond the walls of one’s traditional focus, unveiling unfamiliar but valid connections between scientific fields and discoveries.
“Our hope is that KBase will become the première system for DOE data analysis,” said Murphy-Olson. “If it continues to grow, it is hard to say right now where these advances might end.”
Over time, the KBase team plans to enable the platform to develop its own scientific hypotheses by relating data sets even before researchers make the connections. These advancements will likely appear through “data discovery,” an upcoming feature similar to social media feeds. Similarly, the team hopes to continue opening doors for greater scientific conversations. After all, it is conversations like these that serve as a foundation upon which scientific advancement depends.
The paper, “KBase: The United States Department of Energy Systems Biology Knowledgebase,” was published in the July issue of Nature Biotechnology.
The research was funded by DOE’s Knowledgebase Program within the Office of Biological and Environmental Research’s Genomic Science program.