The GridFTP Advantage
The Globus GridFTP reference implementation is widely used for securely moving massive amounts of data in the scientific and many commercial communities. There are multiple reasons that give GridFTP this advantage:
Performance:
GridFTP typically provides an order-of-magnitude performance improvement over standard FTP. It achieves this by using parallel streams to minimize bottlenecks inherent in TCP/IP, and it can also run over non-TCP protocols such as UDT. In addition, it allows users to set the optimal TCP buffer size for a transfer. See Performance Tuning of GridFTP.
Cluster-to-cluster data movement:
GridFTP can do coordinated data transfer by using multiple computer nodes at the source and destination. This approach can increase performance by another order of magnitude. See also GridFTP Striping Architecture and Performance.
Security:
The Globus GridFTP framework supports various security options, including Grid Security Infrastructure (GSI), anonymous access, username- and password-based security like that of regular FTP servers, and SSH-based security. The GSI security mechanism provides the capability of delegated authority via X.509 certificates. Delegated authority is critical for large collaboration efforts and enables single sign-on in virtual organizations, thereby eliminating the need for the user to enter passwords at what can be hundreds of different sites. For additional information on Globus security see the official documentation.
Partial File Transfer:
In the scientific community it is often expedient to download only portions of a large file instead of the entire file. GridFTP supports this by letting the client specify the byte position in the file at which the transfer should begin.
Third Party Control and Reliable/Restartable Data Transfer:
In many cases, reliability is more important than speed. In fact, the desire for speed stems from the user having to babysit the transfer, not from some intrinsic application need. To enable reliability, the GridFTP server automatically sends restart markers (checkpoints) to the client. If the transfer has a fault, the client may restart the transfer and provide the markers it received. The server will then restart the transfer, picking up where it left off based on the markers. The Reliable File Transfer (RFT) service goes one step further by providing a service interface (similar to a job submission interface) and writing the restart markers to a database so that it can survive a local fault. Additionally, clients are able to act as a third party to initiate transfers between remote sites.
Modular and Extensible:
Interaction with Storage Systems:
The Data Storage Interface (DSI) completely abstracts away the underlying storage. If the user can implement the DSI, then a GridFTP-compliant server can be put in front of the source of data. We currently have DSIs for POSIX filesystems, HPSS and the Storage Resource Broker (SRB).
Interaction with the Network:
The Globus eXtensible IO (XIO) system uses a read, write, open, close abstraction that Globus GridFTP leverages in order to be transport-protocol-agnostic. Hence, in environments where it makes sense, protocols much more aggressive than TCP can be used. Currently there is an XIO driver for UDT, and GridFTP can use UDT as an alternative transport mechanism to TCP. To meet more specific extensibility needs, we also provide easy-to-use development libraries.
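The read, write, open, close abstraction described above can be illustrated with a minimal Python sketch. This is not the Globus XIO API, which is a C library; the `Driver` class, the two stand-in drivers (`MemoryDriver`, `FileDriver`), and the `transfer` function are hypothetical names made up for this illustration. The sketch only tries to show why code written against four generic operations does not need to know which driver sits underneath.

```python
import io
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical four-operation interface (NOT the real Globus XIO API)."""

    @abstractmethod
    def open(self, target: str, mode: str) -> None: ...

    @abstractmethod
    def read(self, nbytes: int) -> bytes: ...

    @abstractmethod
    def write(self, data: bytes) -> int: ...

    @abstractmethod
    def close(self) -> None: ...


class MemoryDriver(Driver):
    """Stand-in driver that keeps named byte strings in a shared dict."""

    store: dict = {}  # shared by all instances; maps target name -> bytes

    def open(self, target, mode):
        self._target = target
        self._mode = mode
        initial = b"" if mode == "w" else self.store.get(target, b"")
        self._buf = io.BytesIO(initial)

    def read(self, nbytes):
        return self._buf.read(nbytes)

    def write(self, data):
        return self._buf.write(data)

    def close(self):
        # Publish the written bytes only when the target was opened for writing.
        if self._mode == "w":
            self.store[self._target] = self._buf.getvalue()


class FileDriver(Driver):
    """Stand-in driver backed by a local file opened in binary mode."""

    def open(self, target, mode):
        self._fh = open(target, mode + "b")

    def read(self, nbytes):
        return self._fh.read(nbytes)

    def write(self, data):
        return self._fh.write(data)

    def close(self):
        self._fh.close()


def transfer(src, src_target, dst, dst_target, block_size=4):
    """Copy bytes block by block using only open/read/write/close."""
    total = 0
    src.open(src_target, "r")
    try:
        dst.open(dst_target, "w")
        try:
            while True:
                block = src.read(block_size)
                if not block:  # empty read signals end of data
                    break
                total += dst.write(block)
        finally:
            dst.close()
    finally:
        src.close()
    return total
```

Under these assumptions, `transfer(MemoryDriver(), "a", FileDriver(), "/tmp/out.bin")` and `transfer(MemoryDriver(), "a", MemoryDriver(), "b")` run the same copy loop unchanged. A driver for a different transport would, in this sketch, only need to supply the same four methods.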
Multicasting and Overlay Routing:
The Globus GridFTP framework supports transfers from one source to many destinations. It also supports forming an overlay network of GridFTP servers.