MINOS Project Uses Globus for Nuclear Nonproliferation Research and Development
Globus enables broad and efficient collaboration among scientists at nine Department of Energy (DOE) laboratories working on the Multi-Informatics for Nuclear Operations Scenarios (MINOS) project. MINOS is a National Nuclear Security Administration (NNSA) Defense Nuclear Nonproliferation Research and Development project to characterize data collected from a research reactor at Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tennessee. More than 100 researchers from DOE laboratories in Tennessee, New Mexico, California, Washington, Idaho, South Carolina, Illinois, and elsewhere collaborate on this project to generate streaming monitoring data, characterize and curate the data, and analyze them for patterns of interest.
Data collection research teams for MINOS operate sensors at the ORNL test bed facility that continuously generate data in a wide variety of formats, sizes, and frequencies. The teams and their data span multiple research domains, including electromagnetic, radiation, seismic, acoustic, thermal, and biota. Most data are streamed over a wireless network from field sensors onto interim storage servers at ORNL, where the curation process begins. First, MINOS collection teams validate, preprocess, and analyze each dataset and generate a standardized metadata file that populates a comprehensive data catalog for the MINOS project. After storing a backup copy in a project archive, a MINOS data management team at ORNL moves the data from the interim storage servers to an internet-enabled transfer node.
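The metadata step described above can be illustrated with a minimal sketch. The MINOS metadata standard is not described in this article, so the field names, the `.meta.json` suffix, and both function names below are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def build_metadata_record(data_file: Path, domain: str, sensor_id: str) -> dict:
    """Describe one dataset file in a small, uniform record.

    The schema is an assumption for illustration, not the MINOS standard.
    """
    payload = data_file.read_bytes()
    return {
        "file_name": data_file.name,
        "domain": domain,            # e.g. "acoustic" or "seismic"
        "sensor_id": sensor_id,      # hypothetical sensor identifier
        "size_bytes": len(payload),
        # A checksum lets the receiving site confirm the file arrived intact.
        "sha256": hashlib.sha256(payload).hexdigest(),
        "cataloged_at": datetime.now(timezone.utc).isoformat(),
    }


def write_metadata_file(data_file: Path, record: dict) -> Path:
    """Write the record next to the data file as <name>.meta.json."""
    meta_path = data_file.with_name(data_file.name + ".meta.json")
    meta_path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return meta_path
```

Keeping the record alongside the data file means both can be packaged together, and a catalog can be populated by reading the small JSON files instead of the much larger datasets.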
The data are then packaged and transferred to Lawrence Berkeley National Laboratory (LBNL) over the Energy Sciences Network (ESnet) using the Globus web interface. ESnet is a high-bandwidth network that connects more than 40 DOE laboratories and research sites. Together, ESnet and Globus enable the teams to transmit large data files securely and efficiently between the laboratories. Globus encryption is applied during transfer to protect the data for official use. After the Globus transfer completes, the data management team at LBNL unpacks the data and registers them in a data portal on the Berkeley Data Cloud, a website developed for data dissemination and curation within the MINOS research community. At that point, MINOS researchers from multiple DOE laboratories are able to download and analyze the data.
MINOS research teams have used this process reliably since the project began three years ago. Last year a more automated process was piloted to test scalability and to reduce the latency that manual steps introduce between data collection and dissemination. For this pilot, one MINOS acoustic data stream is routed from the test bed at ORNL to temporary storage hosted on the Amazon Web Services (AWS) cloud platform. Data are processed, validated, reformatted, and packaged with the necessary metadata in the cloud using an Apache NiFi pipeline developed by Lawrence Livermore National Laboratory. A custom NiFi processor at the end of the pipeline calls the Globus API to initiate the data transfer over ESnet to an endpoint on LBNL’s Berkeley Data Cloud portal servers. This Globus-enabled pipeline is fully automated and moves thousands of files a day. It is capable of providing near real-time data access with inline quality control and analysis, and plans are in place to extend the pipeline for additional data types and processing steps.
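The final pipeline step, in which a processor asks Globus to start a transfer, can be sketched as follows. This is not the MINOS processor, which is a custom NiFi component whose code is not shown in this article. The sketch only assembles a request body whose field names are modeled on the Globus Transfer task document and should be checked against the current API reference before use. The endpoint IDs, paths, label, and the function name `build_transfer_document` are placeholders, and the submission ID is assumed to have been obtained from the service beforehand.

```python
import json


def build_transfer_document(source_endpoint, destination_endpoint,
                            file_pairs, label, submission_id):
    """Assemble a transfer request body for a batch of packaged files.

    file_pairs is a list of (source_path, destination_path) tuples.
    Field names are assumptions modeled on the Globus Transfer API.
    """
    if not file_pairs:
        raise ValueError("at least one file is required")
    items = [
        {
            "DATA_TYPE": "transfer_item",
            "source_path": src,
            "destination_path": dst,
            "recursive": False,
        }
        for src, dst in file_pairs
    ]
    return {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,   # assumed to be issued by the service
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "label": label,
        "encrypt_data": True,             # request encryption in transit
        "verify_checksum": True,          # request an integrity check after copy
        "DATA": items,
    }


if __name__ == "__main__":
    doc = build_transfer_document(
        "SOURCE-ENDPOINT-ID",             # placeholder
        "DESTINATION-ENDPOINT-ID",        # placeholder
        [("/staging/acoustic_001.tar", "/incoming/acoustic_001.tar")],
        label="acoustic batch",
        submission_id="SUBMISSION-ID",    # placeholder
    )
    print(json.dumps(doc, indent=2))
```

Batching many files into one request, as the `DATA` list allows, is one way a pipeline moving thousands of files a day could avoid submitting a separate task per file.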
Globus is critical to MINOS data movement and enables the data sharing and scientific collaboration that are essential for MINOS to reach its scientific goals. So far, more than 7.8 million files of MINOS data have been moved over Globus, providing researchers at multiple national laboratories with nearly 300 TB of valuable scientific information for analysis and new discoveries.