Working together to accelerate discovery across the globe
Researchers now more than ever need a highly scalable, friction-free data management solution to effectively address the complexities of conducting modern research. Careful planning is necessary to avoid research data management obstacles. For example, researchers must be able to handle extremely large data sets, often terabytes and, more recently, petabytes of data, created by powerful instruments and other new tools. Data must be collected and shared rapidly and securely across campus or across the globe. Mundane research data management tasks should be fully automated and essentially “invisible” to researchers, allowing them to focus on the core mission of scientific research. To this end, several countries have set up national cyberinfrastructures that democratize access to advanced data management capabilities for researchers at diverse institutions and facilities.
CANADA: Compute Canada and FRDR - DFDR
Compute Canada, the national organization responsible for advanced research computing in Canada, provides a state-of-the-art computing platform and software solutions to researchers across Canada. Recognizing the need for Canadian researchers to find and share data in an easy, reliable and secure way, Compute Canada, in partnership with the Portage Network, developed the Federated Research Data Repository (FRDR) to ensure that Canadian research data is curated, preserved and accessible.
FRDR, together with the Globus platform, solved many of the typical challenges facing researchers, including usability, security and scalability. Now with FRDR in place, researchers can easily access and share data, resulting in faster time to discovery.
The FRDR platform scales to large datasets seamlessly by leveraging Globus transfer and sharing services with a design pattern that decouples the data access mechanism from the publication portal. FRDR can also be used to automatically convert data into preservation-friendly formats, bundle these data with associated metadata, and create 'Archival Information Packages' (AIPs) suitable for long-term preservation.
Globus Auth secures the FRDR platform and enables a federated search service that helps researchers across Canada, and around the world, to discover and access data collections hosted in FRDR and other existing data repositories. FRDR uses a metadata harvester to enable the discovery of Canadian research data stored across many repositories from a single portal. FRDR also provides support for faceted search and other advanced search options.
The Portage Network was initiated in 2015 by the Canadian Association of Research Libraries and is now part of the New Digital Research Infrastructure Organization (NDRIO). Funding in support of FRDR is administered through NDRIO.
SOUTH AFRICA: National Integrated Cyberinfrastructure System (NICIS)
The National Integrated Cyberinfrastructure System (NICIS) is a national initiative of the Department of Science and Innovation, implemented by the Council for Scientific and Industrial Research (CSIR) in South Africa. NICIS consists of three pillars: the Centre for High Performance Computing (CHPC), which provides massive parallel processing capabilities and services to researchers in industry and academia; the South African National Research Network (SANReN), which provides high-speed connectivity and advanced networking services; and the Data Intensive Research Initiative of South Africa (DIRISA), which implements services that enable sound data management practices and support efficient data-driven scientific and engineering discoveries.
A key objective of NICIS is to enable South Africa to effectively participate in, and lead, large-scale global research and science projects.
NICIS has been running a national Data Transfer (DT) Pilot project to increase the transfer speed and reliability of moving large datasets to and from South African researchers and scientists nationally and internationally. The pilot service has set up Globus endpoints and Data Transfer Nodes (DTNs) at the CHPC and at various other SANReN beneficiary locations within the country, thereby enabling users to quickly and efficiently transfer very large datasets through an easy-to-use interface, and to monitor transfers.
The current focus is on maturing the DT service and growing its uptake within the South African research and education community, with the aim of making it a production-level service in the future. The possibility of using the new Globus iRODS connector to integrate with DIRISA’s existing iRODS implementation is also being investigated.
AUSTRALIA: AARNet and the Australian Research & Education Sector
AARNet is Australia’s national research and education network, a not-for-profit provider of network, cyber security, data and collaboration services, owned by the universities in Australia and the Commonwealth Scientific & Industrial Research Organisation (CSIRO). Many publicly funded research institutes are also AARNet customers, and AARNet has a long tradition of collaborating with the research & education community to develop tools and services that meet the unique needs of researchers.
Various representatives from the Australian R&E sector approached AARNet, highlighting a need for tooling to support researchers managing data generated by instruments, as well as moving data between storage systems and computing facilities. Through the AARNet network, researchers have access to exceptional high-bandwidth connections across Australia and out to the globe. Tooling that helps researchers make use of this bandwidth has proven necessary in various research use cases. AARNet has established an agreement with Globus to facilitate uptake of Globus in the Australian R&E sector, in support of improving research data management in Australia and out to the world.
AARNet is collaborating extensively with the research community to identify data management issues and where Globus may be of use. One example of deployment to date is within the Characterisation community in Australia, where large volumes of analytical sample data must be processed at computing facilities. Globus endpoints have been established at major Characterisation and computing facilities for researcher use.
NEW ZEALAND: New Zealand eScience Infrastructure (NeSI)
As researchers increasingly generate and transfer data at expanding rates, New Zealand eScience Infrastructure (NeSI) is responding to growing requirements for a range of data services, including data transfer capabilities.
Since 2014, NeSI has partnered with Globus to offer a high-speed option for transferring large and distributed data nationally and internationally. In 2019, NeSI reviewed its data transfer offering and launched a new and improved National Data Transfer Platform in partnership with Globus and the national research and education network REANNZ. Designed for use with NeSI’s national High Performance Computing (HPC) platforms as well as data storage and research facilities around New Zealand, it was a major step towards a national framework for data sharing and access and supporting international collaboration.
“Bringing the platform online was truly a collaborative effort, involving coordination and cooperation between our international partner Globus, national partner REANNZ, and on a regional scale with several innovative research institutions,” says Brian Flaherty, Data Services Product Manager at NeSI. "Our goal was to lower barriers and to normalise expectations of moving demanding volumes of data to enable data intensive science."
NeSI is invested in building and sustaining the capability to operate Globus as a national service provider, and in enabling wide adoption across the New Zealand research system. Using the National Data Transfer Platform, researchers can move data over a network 1,000 times faster than through a broadband connection. Over the REANNZ network, data transfers can run at 10 Gbps.
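To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: an ideal link running at its full nominal rate with no protocol or storage overhead, a decimal terabyte (10^12 bytes) as the example dataset, and a 10 Mbps broadband rate inferred from the 1,000-times comparison rather than stated by NeSI. The function and variable names are illustrative.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealised transfer time: total bits divided by the nominal link rate."""
    return size_bytes * 8 / link_bits_per_second


ONE_TERABYTE = 10**12          # assumption: decimal terabyte, in bytes
RESEARCH_NETWORK_BPS = 10e9    # 10 Gbps, the rate quoted for the platform
BROADBAND_BPS = 10e6           # assumption: 10 Mbps, i.e. 1,000 times slower

fast = transfer_seconds(ONE_TERABYTE, RESEARCH_NETWORK_BPS)
slow = transfer_seconds(ONE_TERABYTE, BROADBAND_BPS)

print(f"10 Gbps link:   {fast / 60:.1f} minutes")
print(f"10 Mbps link:   {slow / 86400:.1f} days")
print(f"Speed-up ratio: {slow / fast:.0f}x")
```

Under these assumptions, a one-terabyte dataset that would occupy a broadband link for more than a week moves in well under an hour. Real transfers will be slower than the ideal figure, since disk throughput, network contention and protocol overhead all reduce the achievable rate.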
Now more than ever, collaboration is key to maintaining a progressive and sustainable research ecosystem. As the New Zealand research sector looks to answer national science imperatives across institutional boundaries, NeSI seeks to build national capability in running and optimising use of HPC and eResearch infrastructure.
This is where partnerships like NeSI's with Globus, REANNZ, and research institutions across New Zealand come into play.
"Working together, we can better respond to research community needs — be it related to computational power, data management, or other advanced digital research capabilities," says Nick Jones, Director of NeSI. "Building shared understanding helps us connect with a research community’s aspirations and goals and better equip their researchers to deliver new and valuable insights in their fields at both local and global scales."
NeSI has also been collaborating with AARNet (described above) to investigate endpoints at Australian research facilities as part of a phased “whole of Australia research sector” Globus initiative.
Early-stage tests of transfers between the Australian endpoints are underway, with an eye to enabling higher-performing trans-Tasman research collaborations.
To learn more:
South Africa: National Integrated Cyberinfrastructure System (NICIS)
New Zealand: New Zealand eScience Infrastructure (NeSI)