Tips and Advice from our Experts
Your QuickStart Guide provides the info and links you need to get basic subscription services (and users) up and running quickly. But there are some pointers you simply won’t find in existing materials that are invaluable to having a great Globus experience. We’ve listed these below – for details, email your Globus contact or send a note to support.
- Add Globus information to your documentation and web pages
- Plan your Data Transfer Node (DTN) deployment and scale-out from the get-go
- Add descriptive metadata to your endpoint(s) / collection(s) definitions in Globus
- Integrate your institution’s identity provider (IdP) with Globus
- Configure your endpoints to use your institution’s identity provider (IdP) for authentication
- Run test transfers to ensure adequate performance, and tweak network use parameters if necessary to improve it
- Enable and configure sharing on your endpoint(s)
- Understand and use Roles for endpoints and groups
- Be aware of the Management Console and make sure the right people have access to your endpoints
- Make sure you’re aware of Globus updates and news
Add Globus information to your documentation and web pages
If your users can't find Globus information on your site, how will they get the help they need? Making Globus resources easy to find reduces the administrative burden and helps speed and broaden adoption.
Refer to the Welcome Kit Communication Guide for details and tips, including examples of how other organizations have handled this.
Plan your Data Transfer Node (DTN) deployment and scale-out from the get-go
- Put your Data Transfer Node(s) in a Science DMZ: The goal of the Science DMZ is to optimize the network for high-performance scientific applications and data traffic rather than for general-purpose business systems. This provides better performance, control and security than a general enterprise network architecture would. More information on Science DMZ concepts and implementation can be found here.
- Select your DTN hardware and operating system: Hardware requirements for your DTNs depend on your specific scenario, but guidelines and information on the reference hardware for DTN implementation recommended by ESnet can be found here. Once your DTN is set up, information on how to tune it for optimal performance can be found here. For Globus Connect Server version 4.x (GCSv4), supported operating systems can be found here. For Globus Connect Server version 5.x (GCSv5), supported operating systems can be found here (https://docs.globus.org/globus-connect-server-v5-installation-guide/#supported_linux_distributions). The pages linked above also list general prerequisites for a successful GCS installation, including the firewall configuration that we elaborate on below.
- Make sure you have the right firewall configuration before starting to deploy an endpoint: This may seem obvious, but most of our support tickets result from improper or incomplete firewall configurations. For GCSv4, the TCP ports that must be opened in the firewall are listed here. For GCSv5, the TCP ports that must be opened in the firewall are listed here.
- Plan for usage growth: GCSv4 is built on a distributable architecture that allows globus-connect-server-io (the package that configures a GridFTP server), globus-connect-server-id (the package that configures a MyProxy identity service), and globus-connect-server-web (the package that configures a MyProxy OAuth identity service) to run on different host machines for performance, security and redundancy. In addition, an endpoint can be configured with multiple I/O nodes for performance and redundancy in high-data-volume scenarios. Information on these packages can be found here. Information on distributing I/O nodes across multiple host machines can be found here. Upcoming versions of GCSv5 (5.4 and above) will have a similar architecture.
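Because firewall problems account for so many support tickets, it can help to confirm reachability from outside your network before deploying an endpoint. The following is a minimal sketch, not a Globus tool: the function names are hypothetical, and the port list must come from the GCSv4 or GCSv5 documentation referenced above rather than from this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Hypothetical usage; replace the host and ports with your own values:
#   results = check_ports("dtn.example.edu", [443])
#   for port, ok in results.items():
#       print(port, "reachable" if ok else "NOT reachable")
```

Note that a port with no service listening on it also reports as unreachable, so a negative result shows that something is wrong but does not by itself prove the firewall is the cause.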
Add descriptive metadata to your endpoint(s) / collection(s) definitions in Globus
To ensure researchers can easily and confidently identify your endpoint, we strongly recommend providing values for all of the following fields, which are accessible and editable via the “Edit Attributes” button on the “Overview” tab of the endpoint / collection:
- Display Name ...make this something that your users will easily recognize
- Description ...a few key details about your system and who can access it
- Keywords ...that researchers may use when searching for your endpoint
- Endpoint Info Link ...link to your web site containing additional info
- Contact E-mail ...address that users contact for help with this endpoint
- Organization ...tells users who owns/operates the endpoint
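If you manage several endpoints, a small script can flag the ones with incomplete metadata. The sketch below checks a plain dictionary against the six fields listed above; the snake_case keys and the sample values are illustrative assumptions, not a confirmed Globus schema, so adjust them to match however you export your endpoint definitions.

```python
# Assumed key names mapped to the labels shown in the "Edit Attributes" form.
RECOMMENDED_FIELDS = {
    "display_name": "Display Name",
    "description": "Description",
    "keywords": "Keywords",
    "info_link": "Endpoint Info Link",
    "contact_email": "Contact E-mail",
    "organization": "Organization",
}


def missing_metadata(endpoint: dict) -> list:
    """Return the labels of recommended fields that are absent or blank."""
    missing = []
    for key, label in RECOMMENDED_FIELDS.items():
        value = endpoint.get(key)
        if value is None or not str(value).strip():
            missing.append(label)
    return missing


# Hypothetical endpoint definition with some fields left out.
example = {
    "display_name": "Example University Research Storage",
    "description": "",
    "keywords": "example,research,storage",
    "contact_email": "research-support@example.edu",
}

print(missing_metadata(example))
# prints ['Description', 'Endpoint Info Link', 'Organization']
```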
Integrate your institution’s identity provider (IdP) with Globus
Integrating your IdP allows your users to access the Globus App using the institutional identities they already have in place.
If your institution is an InCommon or eduGAIN member or is looking to become one (note: most of what's said below is detailed in this FAQ):
The easiest way for your identity provider to appear in the Globus login pulldown is to join the InCommon or eduGAIN federation and release the attributes that allow our authentication system to use your IdP.
- Join InCommon by going here
- Release the Research & Scholarship attributes (see note 1 below)
- Once your organization’s system is configured to release the required attributes, it will appear in the list of institutions on the Globus login page within two business days. This integration relies on the CILogon service, and the login flow passes through CILogon.
1 We leverage CILogon for this integration, and the Research and Scholarship attributes release can be validated by checking if your organization is listed at this link with <RandS>1</RandS>.
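The check described in note 1 can be scripted once you have saved the listing locally. This is a rough sketch under stated assumptions: only the `<RandS>1</RandS>` marker comes from the note above, while the surrounding element names (`idps`, `idp`, `Organization_Name`) and the sample data are hypothetical and should be adapted to the actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real listing may be structured differently.
SAMPLE_XML = """
<idps>
  <idp entityID="https://idp.example.edu/shibboleth">
    <Organization_Name>Example University</Organization_Name>
    <RandS>1</RandS>
  </idp>
  <idp entityID="https://idp.example.org/shibboleth">
    <Organization_Name>Example Institute</Organization_Name>
  </idp>
</idps>
"""


def releases_rands(xml_text: str, organization: str) -> bool:
    """Return True if the named organization is listed with <RandS>1</RandS>."""
    root = ET.fromstring(xml_text.strip())
    for idp in root.iter("idp"):
        name = (idp.findtext("Organization_Name") or "").strip()
        if name == organization:
            return (idp.findtext("RandS") or "").strip() == "1"
    return False  # organization not found in the listing


print(releases_rands(SAMPLE_XML, "Example University"))  # prints True
print(releases_rands(SAMPLE_XML, "Example Institute"))   # prints False
```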
If your institution is NOT an InCommon or eduGAIN member, nor looking to become one:
If your organization is not part of the InCommon Federation, you can request to add your organizational login as an alternate identity provider in Globus. This process is only available to Globus Subscribers and there is an additional annual charge for us to maintain this alternate identity provider integration.
Your system must support the OpenID Connect protocol, and be registered with Globus as a trusted identity provider. Please submit this form so we can register your system. Once the request is vetted and approved, your identity provider will be available as an option for login.
Configure your endpoints to use your institution’s identity provider (IdP) for authentication
GCSv4 supports multiple ways to configure authentication. To ensure that no user credentials are visible to the Globus service, you should use your institution's identity provider or a local MyProxy OAuth server for endpoint authentication. This document explains the nuances of the authentication methods for GCSv4. Instructions on how to configure the identity methods for GCSv4 can be found here.
With GCSv5, only federated IdPs are supported for authentication, and user credentials never flow through Globus.
Run test transfers to ensure adequate performance, and tweak network use parameters if necessary to improve it
In order to ensure you are getting the most from your fast network connection / Science DMZ you will want to do some performance testing and perhaps tweak your endpoint’s Network Use Options.
- We suggest starting by testing basic endpoint functionality. Instructions to do so can be found here for GCSv4 and here for GCSv5.
- Test data sets (collections of files of various sizes) are available at the following endpoints (graciously made available by the ESnet team; see note 2 below). These endpoints lie behind Science DMZs and are connected to fast network backbones. They are excellent resources for performance testing:
- To learn more about what to expect when transferring data over the WAN, see this page.
- Network Use Options can be configured on managed endpoints only and are set in the “Server” tab of the attribute page for a particular endpoint. The meaning and application of the parameters are detailed here for GCSv4 and here for GCSv5. Important! Before changing network use parameters on your Globus endpoint:
- Run some test transfers with the default values
- Ensure the wide area network path is clean by using tools like perfSONAR
- Check the network use parameters on the other (destination) endpoint(s) to ensure they’re not limiting throughput. Setting your endpoint’s parameters to very high values may have no effect if the values on the destination endpoint are much lower.
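When comparing a baseline run made with the default values against a run made after adjusting parameters, it helps to reduce each test transfer to a single throughput figure. The sketch below is plain arithmetic with hypothetical function names; the byte counts and durations would come from your own transfer results, and the link capacity is whatever your network path nominally provides.

```python
def throughput_gbps(bytes_transferred: int, seconds: float) -> float:
    """Average throughput in gigabits per second (1 Gb = 1e9 bits)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred * 8 / seconds / 1e9


def link_utilization(bytes_transferred: int, seconds: float, link_gbps: float) -> float:
    """Fraction of the nominal link capacity achieved by the transfer."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    return throughput_gbps(bytes_transferred, seconds) / link_gbps


# Hypothetical example: 50 GB moved in 100 seconds over a 10 Gbps path.
rate = throughput_gbps(50_000_000_000, 100)
share = link_utilization(50_000_000_000, 100, 10)
print(f"{rate:.1f} Gbps, {share:.0%} of link capacity")
# prints "4.0 Gbps, 40% of link capacity"
```

Recording this figure for each run makes it easier to tell whether a parameter change actually helped or whether the limit lies elsewhere, such as the network path or the other endpoint.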
2 Note that only endpoints located on networks ESnet has whitelisted will be able to do transfers with ESnet endpoints. ESnet states that they allow endpoints located on what they consider to be R&E networks to conduct transfers with their endpoints in their doc here: https://fasterdata.es.net/performance-testing/DTNs/. However, we have seen issues where endpoints located on non-whitelisted networks were not able to conduct transfers with ESnet endpoints. If you run into issues, contact firstname.lastname@example.org.
Enable and configure sharing on your endpoint(s)
- Sharing is one of the hallmarks of the Globus subscription service. It allows your researchers to share data and collaborate with peers at other institutions. Unlike other methods, such as setting up FTP / SFTP sites, emailing large files, or sending disks via courier, it takes the administrative burden off already busy research computing center staff and provides a safe, secure, fast and auditable mechanism for file sharing.
- GCSv4 instructions for enabling and configuring sharing are detailed here.
- GCSv5 instructions for setting up a storage gateway can be found here and instructions for configuring a guest collection can be found here.
Understand and use Roles for endpoints and groups
- For redundancy, security and flexibility, Globus allows you to assign roles on your managed endpoint to other Globus users. The Administrator, Access Manager, Activity Manager and Activity Monitor roles may be delegated to others. This is especially handy for endpoints that serve a community of researchers. Additional information on endpoint roles can be found here.
- Similarly, Globus Groups allow you to designate roles for the management of groups. This feature is particularly helpful where a group is associated with a community of researchers with multiple people in administrative or managerial positions. Information on configuring Globus Groups can be found here -- see section 6 for info on group member roles.
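Be aware of the Management Console and make sure the right people have access to your endpoints
- The Management Console is a powerful tool for monitoring endpoint performance and managing the tasks running on endpoints.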
- Endpoint Roles, detailed in the previous section, enable other Globus users to use the Management Console to monitor and control various aspects of your endpoints. This gives you the flexibility and redundancy to support your Globus users better, especially when someone on your team is unavailable.
Make sure you’re aware of Globus updates and news
Stay informed about what is happening with the product:
- Join user, admin and/or developer mailing lists -- most posts come from other users/admins or from Globus support, and it can be very valuable to review the Q&A.
- Sign up for email alerts by subscribing in the form on this page.
- Follow us on social media:
- Attend the GlobusWorld annual user conference, and/or look for a GlobusWorld Tour workshop in your area.