We will be presenting at GTC's Bioinformatics & Pharmacogenomics: Managing and Analyzing Big Data Conference in San Diego, CA.
Vas Vasiliadis will be presenting at 4:25 p.m. on June 19 on "Scalable Data Management Infrastructure for Reproducible Life Science Research".
Genomics research teams in academia and industry are increasingly limited at all stages of their work by large and unwieldy datasets, poor integration between the computing facilities they use for analysis, and difficulty in sharing analysis results with their customers and collaborators. The seemingly mundane tasks of moving and sharing data are becoming a significant bottleneck to scaling research workflows. Further, research teams are increasingly distributed, presenting many technical barriers to collaboration. The presentation will review issues with current approaches and describe emerging best practices for managing genomics data through its lifecycle. Globus Genomics, an end-to-end NGS analysis service with highly scalable research data management capabilities, will be used as an exemplar. The system combines data management capabilities of Globus and the flexibility of the Galaxy framework with the scale of Amazon Web Services to provide researchers and core labs at various universities with an integrated solution to meet their rapidly growing genomics analysis needs. The emphasis is on providing the researcher with a high degree of flexibility to inspect, customize, and configure NGS analysis tools and workflows, and share findings with collaborators. Brief case studies illustrating the impact of this technology on genomics labs and NGS cores will also be presented. The objective is to provide an overview of the relevant technologies, to better inform decisions regarding data management infrastructure. Practitioners will gain an understanding of the technical and economic tradeoffs they must consider when developing and deploying such infrastructure.