PEARC22
July 10, 2022 to July 14, 2022

Boston, MA

Globus is excited to exhibit once again at this year's Practice and Experience in Advanced Research Computing (PEARC) conference.

Visit the Globus exhibit

  • Learn what's new at Globus
  • Find out how a Globus subscription can benefit you and your organization
  • Talk to our experts about your research data management needs

Attend a Globus tutorial

Developing Robust and Scalable Next Generation Workflow Applications and Systems

  • Presenter: Kyle Chard, University of Chicago and Argonne National Laboratory
  • Date: Monday, July 11
  • Time: 8:30 a.m. - 11:30 a.m. ET
  • Location: Clarendon

Workflow applications are critical to scientific discovery. Technology trends and the convergence of traditional High Performance Computing (HPC) with new simulation, analysis, and machine learning (ML) approaches provide unprecedented opportunities. Traditional approaches to workflow applications and systems development have scalability and robustness limits. The ExaWorks project is building a workflows SDK that combines robust, high-performance technologies with well-defined, scalable component interfaces that new and existing workflow applications and systems can leverage. This tutorial will present the ExaWorks SDK and its constituent components: Flux, Parsl, RADICAL-Cybertools (RCT), and Swift/T. These components are widely used, readily available tools for developing workflow applications. The tutorial will outline modern workflow motifs on HPC platforms (e.g., ensemble campaigns, ML-in-the-loop), illustrate science examples of these motifs, and discuss solutions using the ExaWorks SDK. One third of the tutorial is dedicated to presentations from experts, and two thirds to hands-on exercises. Attendees will gain the practical knowledge needed to develop workflow best practices for managing large-scale campaigns on the largest supercomputers. By the end of the tutorial, they will be able to apply these tools and techniques to their own advanced workflows with minimal programming effort.

Scalable Automation of Data Management Tasks

  • Presenters: Rachana Ananthakrishnan, University of Chicago (Globus)
  • Date: Monday, July 11
  • Time: 8:30 a.m. - 5:00 p.m. ET
  • Location: Studio 1 Room

Globus is widely used among the PEARC community for reliable data transfer, but a growing number of computationally intensive research activities require commensurate large-scale data management. A common use case is that of high-resolution imaging instruments, e.g., cryoEM and synchrotron beamlines, that require automation of data flows to increase throughput and researcher productivity, as well as to ensure the instrument remains highly utilized. Globus platform services (including Globus Flows and Globus Auth), combined with data distribution platforms that use the Modern Research Data Portal design pattern[1], can greatly simplify the development and execution of automated data management tasks in this context.

We will describe how Globus platform services facilitate the construction of automated flows, using our work with multiple instrument facilities as exemplars. Attendees will have the opportunity to build their own flows to move data, run analysis tasks, and share outputs with collaborators. We will also illustrate how these flows can feed into downstream data portals, science gateways, and data commons, enabling search and discovery of data by the broader community.

___________________________________

[1] Chard K, Dart E, Foster I, Shifflett D, Tuecke S, Williams J. (2017) The Modern Research Data Portal: A design pattern for networked, data-intensive science. PeerJ Articles:cs-144 https://peerj.com/articles/cs-144/