GT 3.9.4 Component Guide to Public Interfaces: RFT

Semantics and syntax of APIs

Programming Model Overview

The Reliable Transfer Service (RFT) is a WSRF-based service that provides interfaces for controlling and monitoring third-party file transfers using GridFTP servers. The client controlling the transfers (in this case, RFT) is hosted inside a Grid service, so it can be managed using the soft state model. It is essentially a reliable and recoverable version of the GT2 globus-url-copy tool, and more. In 3.9.4, RFT can also perform file deletion and recursive directory deletion. GRAM also uses it to perform all staging and cleanup operations.

Component API

Some relevant APIs:

Semantics and syntax of the WSDL

Protocol overview

The RFT service implementation in 3.9.4 uses standard SOAP messages over HTTP to submit and manage a set of third-party GridFTP transfers and to delete files using GridFTP. The user creates an RFT resource by submitting a list of URL pairs for the files to be transferred or deleted to the RFT Factory service, which runs in a GT 3.9.4 container in which RFT is deployed and configured. The user also specifies the time to live for the resource being created. The resource is created after the user is properly authenticated and authorized. The RFT service implementation exposes operations to control and manage the transfers (the resource). The operations exposed by both the RFT factory and the RFT service are briefly described below. The resource also exposes the state of the transfer as a resource property; the user can either subscribe to changes in that property or poll it periodically using standard command-line clients.

Operations

Reliable File Transfer Factory Service: Used to create a Reliable File Transfer resource. The operations exposed by the factory are as follows:

  • createReliableFileTransfer: Creates a Reliable File Transfer Resource.
    • Input Parameters: Initial Termination time, Transfer Request or Delete Request
    • Output Parameters: Termination time, Current time, and the Endpoint reference of the Resource created. (The user should store the Endpoint reference, as it is needed to query the status of the resource and to perform any further operations on the resource.)
    • Fault: createReliableFileTransferFault

Reliable File Transfer Service: Used to manage the Resource created using the RFT Factory Service. The operations exposed by the service are as follows:

  • start: Starts executing the transfers/deletes.
    • Input Parameters: None
    • Output Parameters: None
    • Fault: RepeatedlyStartedFault
  • getStatus: Gets the status of a particular file.
    • Input Parameters: A source URL of the file that is part of the request.
    • Output Parameters: Transfer Status Type
    • Fault: RFTDatabaseFault
  • getStatusSet: Gets the status of a set of files in a request.
    • Input Parameters: int from (the relative position of the transfer in the request) and int offset (the number of files queried)
    • Output Parameters: An array of TransferStatusType
    • Fault: RFTDatabaseFault
  • cancel: Cancels a transfer that is part of a resource.
    • Input Parameters: int from (the relative position of the transfer in the request) and int to
    • Output Parameters: None
    • Fault: RFTDatabaseFault
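
The call semantics listed above can be illustrated with a small in-memory model. This is only a sketch, not the RFT client API: the class ToyRftResource, its method names, and the status strings are hypothetical, and it assumes that "from" is a zero-based position and that "offset" is a count of files, which the text above does not state.

```python
from dataclasses import dataclass, field
from typing import List


class RepeatedlyStartedFault(Exception):
    """Stand-in for the fault raised when start is called more than once."""


@dataclass
class ToyRftResource:
    """Hypothetical in-memory stand-in for an RFT resource (not the real service)."""
    source_urls: List[str]
    started: bool = False
    statuses: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every transfer begins as Pending in this toy model.
        self.statuses = ["Pending"] * len(self.source_urls)

    def start(self) -> None:
        # A second call to start is rejected, mirroring RepeatedlyStartedFault.
        if self.started:
            raise RepeatedlyStartedFault("start was already called on this resource")
        self.started = True
        self.statuses = ["Active"] * len(self.source_urls)

    def get_status_set(self, from_index: int, offset: int) -> List[str]:
        # Assumption: from_index is zero-based and offset is the number of files.
        return self.statuses[from_index:from_index + offset]
```

In this model, get_status_set(1, 2) returns the statuses of the second and third transfers in the request.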

Resource properties

RFT Factory Resource Properties:

  • ActiveResourceInstances: A dynamic resource property giving the total number of active RFT resources in the container at a given point in time.
  • TotalNumberOfTransfers: A dynamic resource property giving the total number of transfers/deletes performed since the RFT service was deployed in this container.
  • TotalNumberOfActiveTransfers: A dynamic resource property giving the number of active transfers across all RFT resources in a container at a given point in time.
  • TotalNumberOfBytesTransferred: A dynamic resource property giving the total number of bytes transferred by all RFT resources created since the deployment of the service.
  • RFTFactoryStartTime: Time when the service was deployed in the container. Used to calculate uptime.
  • DelegationServiceEPR: The endpoint reference of the Delegation resource that holds the delegated credential used in executing the resource.

RFT Resource Properties:

  • RequestStatusProperty: Represents the current state of the resource (Active, Pending, Failed, Finished). It also includes the last fault message encountered while executing the request.
  • OverallStatusProperty: Provides the current state of the transfer by giving the number of transfers done, pending, active, failed, cancelled and retrying. It also contains the fault message, if any, raised during a transfer.
  • TotalBytes: Provides the total number of bytes transferred by the resource.
  • TotalTime: Provides the total time taken to transfer the above-mentioned total bytes.
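
A client that polls RequestStatusProperty rather than subscribing to it could follow a loop like the one below. This is a minimal sketch: fetch_state is a hypothetical callable that returns the current state string (how it queries the resource property is left out), and treating Failed and Finished as the only terminal states is an assumption based on the state names listed above.

```python
import time
from typing import Callable

# Assumption: these two states mean the request will not change any further.
TERMINAL_STATES = {"Failed", "Finished"}


def poll_until_terminal(
    fetch_state: Callable[[], str],
    interval_seconds: float = 5.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_state until it reports a terminal state, then return that state."""
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        # Wait before asking again so the container is not flooded with queries.
        sleep(interval_seconds)
    raise TimeoutError("resource did not reach a terminal state")
```

The sleep parameter is injectable so that the loop can be exercised without real delays.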

Faults

Faults thrown by RFT Factory:

  • createReliableFileTransferFault: All the errors encountered during the creation of the RFT resource are mapped to this fault. Security-related errors are caught before the request reaches the factory and are thrown to the user/client.

Faults thrown by RFT Service:

  • RepeatedlyStartedFault: Raised if a client calls start more than once on a resource.
  • RFTDatabaseFault: Thrown when the service is unable to find the resource the user/client is querying for.

Schema Definition

You can find links to all the RFT schemas here

Command-line tools

Command-line tool: rft for RFT transfers

Tool description

Submits a transfer to the Reliable File Transfer Service and prints out the status of the transfer on the console.

Command syntax and options

rft [-h <host-ip of the container, defaults to localhost>
-port <port, defaults to 8080>
-l <lifetime for the resource, default 60mins>
-m <security mechanism. 'msg' for secure message or 'conv' for
 secure conversation and 'trans' for transport. Defaults to
 secure transport.>
-p <protection type, 'sig' signature and 'enc' encryption,
 defaults to signature>
-z <authorization mechanism, can be self or host. default self>
-file <file to write EPR of created Reliable File Transfer Resource>]
-f <path to the file that contains list of transfers>

This is a sample transfer file that the command-line client will be able to parse. It can also be found in $GLOBUS_LOCATION/share/globus_wsrf_rft_client/ along with other samples for directory transfers and deletes (lines starting with # are comments):

#true=binary false=ascii
true
#Block size in bytes
16000
#TCP Buffer size in bytes
16000
#Notpt (No thirdPartyTransfer)
false
#Number of parallel streams
1
#Data Channel Authentication (DCAU)
true
# Concurrency of the request
1
#Grid Subject name of the source gridftp server
/DC=org/DC=doegrids/OU=People/CN=Ravi Madduri 134710
#Grid Subject name of the destination gridftp server
/DC=org/DC=doegrids/OU=People/CN=Ravi Madduri 134710
#Transfer all or none of the transfers
false
#Maximum number of retries
10
#Source/Dest URL Pairs
gsiftp://localhost:5678/tmp/rftTest.tmp
gsiftp://localhost:5678/tmp/rftTest_Done.tmp
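
A transfer file with this layout can also be generated programmatically. The sketch below is only an illustration: the function render_transfer_file and its parameter names are hypothetical, and it assumes the client reads the values in the order shown in the sample, with the comment lines kept for readability.

```python
from typing import List, Tuple


def render_transfer_file(
    url_pairs: List[Tuple[str, str]],
    source_subject: str,
    destination_subject: str,
    binary: bool = True,
    block_size: int = 16000,
    tcp_buffer_size: int = 16000,
    no_third_party: bool = False,
    parallel_streams: int = 1,
    dcau: bool = True,
    concurrency: int = 1,
    all_or_none: bool = False,
    max_retries: int = 10,
) -> str:
    """Return transfer-file text laid out in the same order as the sample."""

    def flag(value: bool) -> str:
        # The sample file spells booleans in lowercase.
        return "true" if value else "false"

    lines = [
        "#true=binary false=ascii", flag(binary),
        "#Block size in bytes", str(block_size),
        "#TCP Buffer size in bytes", str(tcp_buffer_size),
        "#Notpt (No thirdPartyTransfer)", flag(no_third_party),
        "#Number of parallel streams", str(parallel_streams),
        "#Data Channel Authentication (DCAU)", flag(dcau),
        "# Concurrency of the request", str(concurrency),
        "#Grid Subject name of the source gridftp server", source_subject,
        "#Grid Subject name of the destination gridftp server", destination_subject,
        "#Transfer all or none of the transfers", flag(all_or_none),
        "#Maximum number of retries", str(max_retries),
        "#Source/Dest URL Pairs",
    ]
    # Each pair contributes a source line followed by a destination line.
    for source_url, destination_url in url_pairs:
        lines.extend([source_url, destination_url])
    return "\n".join(lines) + "\n"
```

Called with the subject name and the URL pair from the sample and all other arguments left at their defaults, the function reproduces the sample's values line for line.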

Limitations

This command-line client is deliberately simple: it does not do any intelligent parsing or validation of the command-line options or of the options in the transfer file. It works fine if used in the way documented here.
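
Because the client does little checking of its own, a wrapper script can validate the arguments before invoking it. The following is a minimal sketch: the helper build_rft_command is hypothetical, and it uses only the -h, -port, -file and -f options documented above.

```python
from typing import List, Optional


def build_rft_command(
    transfer_file: str,
    host: str = "localhost",
    port: int = 8080,
    epr_file: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for the rft client after basic checks."""
    if not transfer_file:
        raise ValueError("a transfer file path is required")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    command = ["rft", "-h", host, "-port", str(port)]
    if epr_file is not None:
        # -file names the file that receives the EPR of the created resource.
        command += ["-file", epr_file]
    command += ["-f", transfer_file]
    return command
```

The returned list can be passed to subprocess.run on a machine where the rft client is on the PATH.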

Command-line tool: rft-delete for RFT deletes

Tool description

This command-line tool is used to submit a list of files to be deleted.

Command and options

rft-delete [-h <host-ip of the container, default localhost>
-port <port, defaults to 8080>
-l <lifetime for the resource, default 60mins>
-m <security mechanism. 'msg' for secure message or 'conv' for
 secure conversation and 'trans' for transport. Defaults to
 secure transport.>
-p <protection type, 'sig' signature and 'enc' encryption,
 defaults to signature>
-z <authorization mechanism, can be self or host. default self>
-file <file to write EPR of created Reliable File Transfer Resource>]
-f <path to the file that contains list of transfers>

This is a sample file that the command-line client will be able to parse. It can also be found in $GLOBUS_LOCATION/share/globus_wsrf_rft_client/ along with other samples for directory transfers and deletes (lines starting with # are comments):

# Subject name (defaults to host subject)
  /DC=org/DC=doegrids/OU=People/CN=Ravi Madduri 134710
  gsiftp://localhost:5678/tmp/rftTest_Done.tmp
  gsiftp://localhost:5678/tmp/rftTest_Done1.tmp
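
The layout of the delete file can be made concrete with a small reader. This is a sketch only, not the parser used by rft-delete: the function parse_delete_file is hypothetical, and it assumes that a first entry not starting with gsiftp:// is the subject name and that the subject may be omitted, as the sample's comment suggests.

```python
from typing import List, Optional, Tuple


def parse_delete_file(text: str) -> Tuple[Optional[str], List[str]]:
    """Split delete-file text into an optional subject name and a list of URLs."""
    entries = [line.strip() for line in text.splitlines()]
    # Blank lines and lines starting with # carry no data.
    entries = [entry for entry in entries if entry and not entry.startswith("#")]
    subject = None
    # Assumption: anything before the first gsiftp:// URL is the subject name.
    if entries and not entries[0].startswith("gsiftp://"):
        subject = entries.pop(0)
    return subject, entries
```

When no subject line is present, the function returns None for the subject and leaves the choice of default to the caller.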

Limitations

There are no known limitations with this command-line tool.

Overview of Graphical User Interface

There is no GUI for the RFT service in this release.

Semantics and syntax of domain-specific interface

Please look here for information on what the RFT schemas look like.

Configuration interface

Configuration overview

PostgreSQL (version 7.1 or greater) needs to be installed and configured for RFT to work. Instructions on how to install and configure PostgreSQL can be found here.
  1. Install PostgreSQL. Configure the postmaster daemon so that it accepts TCP connections. This can be done by adding the -o "-i" switch to the postmaster script.
  2. To create the database that is used for RFT, run:
    createdb rftDatabase
  3. To populate the RFT database with appropriate schemas, run:
    psql -d rftDatabase -f $GLOBUS_LOCATION/share/globus_wsrf_rft/rft_schema.sql

    Now that you have created a database to store RFT's state, the following steps configure RFT to find the database:
  4. Open $GLOBUS_LOCATION/etc/globus_wsrf_rft/jndi-config.xml
  5. Find the dbConfiguration section under the ReliableFileTransferService <service> section.
  6. Change the connectionString to point to the machine on which you installed PostgreSQL and to the name of the database you used in step 2.
    If you installed PostgreSQL on the same machine as your Globus installation, the default should work fine for you.
  7. Change the userName to the name of the user who owns or created the database, and set the password to that user's password. (The right values depend on how you configured your database.)
  8. The other parameters in that section can be left alone. The defaults should work fine for now.
  9. Edit the configuration section under ReliableFileTransferService. There are two values that can be edited in this section:
    • backOff: Time in seconds that RFT waits before retrying a failed transfer. The default should work fine for now.
    • maxActiveAllowed: The number of transfers the container can perform at a given point in time. The default should be fine for now.

Frequent configuration problems

Problem: If RFT is not configured properly to talk to a PostgreSQL database, you will see this message displayed on the console when you start the container:

"Error creating RFT Home: Failed to connect to database ... 
Until this is corrected all RFT request will fail and all GRAM jobs that require staging will fail". 

Solution: The usual mistake is that the postmaster is not accepting TCP connections, which means that you have to restart the postmaster with the -i option (see step 1).
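
Before restarting the container, it can help to confirm that the postmaster actually accepts TCP connections. The check below is a minimal sketch: the function name postmaster_accepts_tcp is hypothetical, and it assumes PostgreSQL is listening on its default port, 5432. It only tests that a TCP connection can be opened; it does not verify the database name, user name or password.

```python
import socket


def postmaster_accepts_tcp(
    host: str = "localhost",
    port: int = 5432,
    timeout: float = 3.0,
) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # The connection is closed again as soon as it has been established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

A result of False on the database host usually points back to step 1 of the configuration overview.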

Syntax of the interface

The security configuration of the service can be changed by modifying the security descriptor. The descriptor allows you to configure the credentials that will be used by the service and the type of authentication and authorization that needs to be enforced. By default, the following security configuration is installed:

  • The credentials set for use by the container are used. If none are specified, the default credentials are used.
  • GSI Secure conversation authentication is enforced for all methods.

Note: Changing the required authentication and authorization methods will require suitable changes to the clients that contact this service.

To alter the security descriptor configuration, refer to Security Descriptors. The file to be altered is $GLOBUS_LOCATION/etc/globus_wsrf_rft/security-config.xml

Environment variable interface

  • The only environment variable that needs to be set for RFT is GLOBUS_LOCATION, which should point to the Globus installation. It is required in order to run the command-line clients.
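
A script that wraps the command-line clients can check this variable up front. The sketch below is illustrative only: the function rft_client_sample_dir is hypothetical, and the directory it builds is the sample location mentioned in the command-line tools section.

```python
import os
from typing import Mapping


def rft_client_sample_dir(env: Mapping[str, str] = os.environ) -> str:
    """Return the RFT client sample directory, or fail if GLOBUS_LOCATION is unset."""
    location = env.get("GLOBUS_LOCATION")
    if not location:
        raise RuntimeError("GLOBUS_LOCATION is not set; point it at the Globus installation")
    # Corresponds to $GLOBUS_LOCATION/share/globus_wsrf_rft_client/
    return os.path.join(location, "share", "globus_wsrf_rft_client")
```

Passing the environment as a parameter keeps the check easy to exercise without changing the real process environment.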