In 2021, the Argonne Leadership Computing Facility (ALCF) is planning to deploy Aurora A21, a new Intel-Cray system, slated to be the first exascale supercomputer in the United States. Aurora will be equipped with advanced capabilities for modeling and simulation, data science, and machine learning, which will allow scientists to tackle much larger and more complex problems than are possible today.
To prepare for the exascale era, Argonne researchers are exploring new services and frameworks that will improve collaboration across scientific communities, eliminate barriers to productively using next-generation systems like Aurora, and integrate with user workflows to produce seamless, user-friendly environments.
Ian Foster, director of Argonne’s Data Science and Learning Division, is leading a team of researchers from Argonne and Globus to develop a pilot of a lab-wide service that will make it easier to access, share, analyze, and reuse large-scale datasets. The service leverages Globus, which is a research data management platform from the University of Chicago, Argonne’s Petrel system, which is a storage resource that allows researchers to easily store and share large-scale datasets with collaborators, and Jupyter Notebooks, which is an open source web application that allows researchers to create and share documents that contain live code, equations, and visualizations.
Ultimately, the service is aimed at enabling more effective capture and organization of data; discovery and interrogation of relevant data; and association of machine learning models with data collections for improved reproducibility and simpler deployment at scale.
“Our motivation,” Foster explains, “is to create rich, scalable data services so people don’t just come to the ALCF for simulation but for simulation and data-centric activities.”