Argonne researchers have developed a pipeline between ALCF supercomputers and Advanced Photon Source experiments to enable on-demand analysis of the crystal structure of COVID-19 proteins.
Key to understanding the coronavirus is unraveling its structure. To this end, Argonne researchers have leveraged the ALCF’s Theta supercomputer to analyze crystallographic images of a protein complex associated with SARS-CoV-2. Massively parallel systems like Theta are unique in their ability to meet the demands that serial synchrotron crystallography poses for rapid, on-the-fly processing. A data pipeline constructed around the supercomputer makes this on-the-fly processing possible. The pipeline automates data acquisition, analysis, curation, and visualization, transporting results to a repository from which metadata can be extracted for publication.
The pipeline begins with Globus transferring images from the APS to the Theta system. The images are then analyzed and processed using FuncX, a function-as-a-service computation system that organizes the dispatch of individual tasks to available computing nodes. FuncX is also used in the next step to extract metadata about hits, identify crystal diffractions, and generate visualizations depicting both the sample and hit locations. Finally, the raw data, metadata, and related visualizations are published to a portal hosted at the ALCF, where they are indexed and made searchable for reuse.
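The flow of these stages can be illustrated with a minimal Python sketch. It is not the Argonne implementation and does not call Globus or FuncX: a standard-library thread pool stands in for function-as-a-service dispatch, a plain dictionary stands in for the portal, and every name in it (`transfer_images`, `analyze_image`, `extract_metadata`, `publish`, the `peak_count` field, and the hit threshold) is a hypothetical placeholder chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Image:
    name: str
    peak_count: int  # hypothetical stand-in for real detector data


def transfer_images(names_and_peaks):
    """Stage 1: stand-in for moving images from the beamline to the compute system."""
    return [Image(name, peaks) for name, peaks in names_and_peaks]


def analyze_image(image, hit_threshold=10):
    """Stage 2: toy per-image analysis; flags a 'hit' when enough peaks are present."""
    return {
        "name": image.name,
        "peaks": image.peak_count,
        "hit": image.peak_count >= hit_threshold,  # assumed criterion, for illustration only
    }


def extract_metadata(results):
    """Stage 3: summarize which images were hits across the batch."""
    hits = [r["name"] for r in results if r["hit"]]
    hit_rate = len(hits) / len(results) if results else 0.0
    return {"total": len(results), "hits": hits, "hit_rate": hit_rate}


def publish(portal, results, metadata):
    """Stage 4: stand-in for indexing records in a searchable portal (here, a dict)."""
    portal["records"] = results
    portal["summary"] = metadata
    return portal


def run_pipeline(names_and_peaks, workers=4):
    images = transfer_images(names_and_peaks)
    # Each image is an independent task, so the work can be fanned out to
    # whatever workers are available; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(analyze_image, images))
    metadata = extract_metadata(results)
    return publish({}, results, metadata)


if __name__ == "__main__":
    portal = run_pipeline([("img_001", 3), ("img_002", 12), ("img_003", 10)])
    print(portal["summary"])
```

Because each image is analyzed independently, the analysis stage parallelizes naturally, which is the property that lets a massively parallel system keep pace with the rate at which images arrive.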
“This pipeline’s deployment between the APS and the ALCF for on-demand analysis has been a tremendous success. We achieved a processing rate of up to 95 images a second.” This high speed made it possible to deliver instantaneous feedback to experimentalists at the APS.
-Ryan Chard, Argonne