Data volumes are exploding, and the need to store and share data quickly, reliably and securely, while keeping the data discoverable, is increasingly important. Deliberate planning and execution are necessary to properly collect, curate, manage, protect and disseminate the data that are the lifeblood of the modern research enterprise.

Computational scientific research involves massive datasets created by today’s cutting-edge instruments and experiments: telescopes, particle accelerators, sensor networks and molecular simulations. The scientific software that processes these datasets and extracts discoveries from them is typically made up of tens to thousands of smaller functions, blocks of code that each handle an individual job in the long pipeline of data analysis.