PEARC22

July 10 – 14, 2022 (All Day)
  • Boston, MA

Globus is excited to exhibit once again at this year's Practice and Experience in Advanced Research Computing (PEARC) conference.

Visit the Globus exhibit

  • Learn what's new at Globus
  • Find out how a Globus subscription can benefit you and your organization
  • Talk to our experts about your research data management needs

Attend a Globus tutorial

Developing Robust and Scalable Next Generation Workflow Applications and Systems

  • Presenter: Kyle Chard, University of Chicago and Argonne National Laboratory
  • Date: Monday, July 11
  • Time: 8:30 a.m. - 11:30 a.m. ET
  • Location: Clarendon

Workflow applications are critical to scientific discovery. Technology trends and the convergence of traditional High Performance Computing (HPC) with new simulation, analysis, and machine learning (ML) approaches provide unprecedented opportunities. Traditional approaches to workflow applications and systems development have scalability and robustness limits. The ExaWorks project is building a workflows SDK that combines robust, high-performance technologies with well-defined, scalable component interfaces that new and existing workflow applications and systems can leverage. This tutorial will present the ExaWorks SDK and its constituent components: Flux, Parsl, RADICAL-Cybertools (RCT), and Swift/T. These components are widely used, readily available tools for developing workflow applications. The tutorial will outline modern workflow motifs on HPC platforms (e.g., ensemble campaigns, ML-in-the-loop), illustrate science examples of these motifs, and discuss solutions using the ExaWorks SDK. One third of the tutorial is dedicated to presentations from experts, and two thirds to hands-on exercises. Attendees will gain the practical knowledge needed to develop best practices for managing large-scale workflow campaigns on the largest supercomputers. By the end of the tutorial, they will be able to apply these tools and techniques to their own advanced workflows with minimal programming effort.

Scalable Automation of Data Management Tasks

  • Presenter: Rachana Ananthakrishnan, University of Chicago (Globus)
  • Date: Monday, July 11
  • Time: 8:30 a.m. - 5 p.m. ET
  • Location: Studio 1 Room

Globus is widely used among the PEARC community for reliable data transfer, but a growing number of computationally intensive research activities require commensurate large-scale data management. A common use case is that of high-resolution imaging instruments, e.g., cryoEM and synchrotron beamlines, that require automation of data flows to increase throughput and researcher productivity, as well as to ensure the instrument remains highly utilized. Globus platform services (including Globus Flows and Globus Auth), combined with data distribution platforms that use the Modern Research Data Portal design pattern[1], can greatly simplify the development and execution of automated data management tasks in this context. We will describe how Globus platform services facilitate the construction of automated flows using our work with multiple instrument facilities as exemplars. Attendees will have the opportunity to build their own flows to move data, run analysis tasks, and share outputs with collaborators. We will also illustrate how these flows can feed into downstream data portals, science gateways, and data commons, enabling search and discovery of data by the broader community.

___________________________________

[1] Chard K, Dart E, Foster I, Shifflett D, Tuecke S, Williams J. (2017) The Modern Research Data Portal: A design pattern for networked, data-intensive science. PeerJ Articles:cs-144 https://peerj.com/articles/cs-144/

 

Building Enduring Cyberinfrastructures - The Role of Professional Research Software Engineers

  • Date: Wednesday, July 13
  • Time: 3-4 p.m.
  • Location: White Hill

Panelists:
  • Rachana Ananthakrishnan (University of Chicago, Executive Director & Head of Products - Globus)
  • Blake Joyce (Manager – Data Science, University of Alabama at Birmingham)
  • Sandra Gesing (UIC, Scientific Outreach and DEI Lead, science gateways)
  • Karen Tomko (Ohio Supercomputer Center, Director of Research & Manager of Scientific Applications)
  • Julia Damerow (Arizona State University, Lead Scientific Software Engineer)
  • Julian Pistorius (The Exosphere Project, Co-Founder)

Moderators: Christina Maimone, Manager, Research Data Services, Northwestern University; Chris Hill (MIT).

Description

Research IT, also known as CyberInfrastructure or CI, is a strategic resource for research institutions and researchers. A significant part of the research CI portfolio is software: commercial applications; custom-developed applications, tools, and scripts; and community software shared by many researchers in a particular field of study. The volume of this software is continually increasing, with new tools and applications emerging and existing tools evolving. The emerging professionals who develop and maintain this software, increasingly described as Research Software Engineers (RSEs), are a vital human resource in research teams of all sizes. Creating an environment that fosters the development of RSEs and their careers is an increasingly important part of managing a research organization and of sustaining local, national, and global CI ecosystems. Other roles that sustain CI ecosystems include, for example, HPC facilitators and data scientists. Some tasks and responsibilities overlap between these roles, and fostering career paths for one group can also benefit other groups with non-traditional academic career paths.

This panel brings together representatives from some of the leading drivers of sustained Cyberinfrastructures from different research areas. The panel will discuss the goals and strategies for building, developing, managing, and sustaining RSE teams that can sustain and evolve enduring advanced Cyberinfrastructures.