Simplifying Data Movement Between Supercomputing Facilities to Accelerate Heavy Ion Research

As one of the methods used by the STAR experiment for moving large data sets, Globus is assisting scientists in their effort to understand the origins of our universe by studying the properties of strongly interacting matter.

STAR, an acronym for Solenoidal Tracker At RHIC (the Relativistic Heavy Ion Collider at Brookhaven National Laboratory), tracks the thousands of particles produced in heavy ion collisions, searching for signs of a rare form of matter that is believed to have last existed just after the Big Bang. The primary physics goal of STAR is to study this phase of matter at extreme energy densities, known as a quark-gluon plasma (QGP). Doing so should bring about a better understanding of the universe in its earliest stages, which may lead to discoveries about the nature of the universe as it exists today.

STAR is a collaboration of over 500 scientists and engineers representing 54 institutions in 12 countries. The scope of the experiment is massive and growing: hundreds of terabytes of data per year are generated and moved between its computing facilities in a production environment.

However, in certain ‘edge cases’ the production method isn’t ideal or even available. These cases include production datasets that do not conform to the normal pipeline system, common-use data managed not by the experiment but by a smaller working group, and individual scientists’ datasets. The common feature in these cases is a terabyte-scale data set that needs to be accessed at both sites (RHIC and NERSC) and archived on one or both HPSS systems.

Globus can meet this challenge for STAR by simplifying the file transfer process, allowing individual users to easily move significant amounts of data between STAR facilities at RHIC and NERSC with minimal setup.

Several STAR scientists have begun using Globus to move 500 GB or more of files. With Globus, such a transfer is as easy as a few mouse clicks or a couple of commands issued from a personal laptop. The system is easy to set up, provides simple one-stop proxy management features, and requires little intervention or monitoring.

By addressing these edge-case data movement issues, Globus is helping improve collaboration and data sharing among STAR scientists and facilities, so that trivial but time-consuming tasks like file movement don’t get in the way of discovery. At present, STAR scientists have determined that Globus is an attractive tool for managing certain data transfer needs and are recommending it to additional users.

Quotes:

  • “I moved 400 GB of files and didn’t even have to think about it.”
  • “I routinely have to move hundreds of gigabytes of data – Globus makes it easy, so I can execute these transfers with very little effort.”
  • “I’ve been most impressed with both Globus’s ease of use and its throughput. With Globus it is very trivial to move data – there’s almost nothing to it other than specifying your file source and destination. Also, at 10s of MB per second, the transfer speed is certainly sufficient for this part of our work.”