Submitting jobs to a job scheduler.

1. Delegating credentials

There are three different uses of delegated credentials:

  1. for use by the MEJS to create a remote user proxy
  2. for use by the MEJS to contact RFT
  3. for use by RFT to contact the GridFTP servers.

The EPRs to each of these are specified in three job description elements: jobCredentialEndpoint, stagingCredentialEndpoint, and transferCredentialEndpoint, respectively. Please see the Job Description Schema Reference and the RFT transfer request schema documentation for more details about these elements.

The globusrun-ws client can either delegate these credentials automatically for a particular job, or it can reuse pre-delegated credentials (see next paragraph) through the use of command-line arguments for specifying the credentials' EPR files. Please see the GRAM4 Commands for details on these command-line arguments.

It is possible to use delegation command-line tools to obtain and refresh delegated credentials in order to use them when submitting jobs to GRAM4. This, for instance, enables the submission of many jobs using a shared set of delegated credentials. This can significantly decrease the number of remote calls for a set of jobs, thus improving performance.

The following example shows how to delegate credentials. globus-credential-delegate delegates to the specified delegation factory on lucky0.mcs.anl.gov, prints some information, and stores the endpoint reference of the delegated credentials in the file delegCred.epr:

[martin@osg-test1 ~]$ globus-credential-delegate \
> -s https://lucky0.mcs.anl.gov:8443/wsrf/services/DelegationFactoryService \
> delegCred.epr
Delegated credential EPR:
Address: https://lucky0.mcs.anl.gov:8443/wsrf/services/DelegationService
Reference property[0]:
<ns1:DelegationKey xmlns:ns1="http://www.globus.org/08/2004/delegationService">
  55e2a450-58be-11dd-b83c-e4ec640dfe13
</ns1:DelegationKey>

To destroy the delegated credential use wsrf-destroy:

[martin@osg-test1 jobs]$  wsrf-destroy -e delegCred.epr 
Destroy operation was successful

For more information about the delegation command-line tools, see Command-line tools.

2. Local resource managers interfaced by a GRAM4 installation

A GRAM4 instance can interface to more than one local resource manager (LRM), as shown in the previous section. A user can explicitly specify which LRM should be used for a job, but in a larger Grid it can be hard for users to remember which LRMs are available on which machines.

For this reason, GRAM4 configures a default local resource manager, which is used for job submission if the client does not explicitly specify one.

2.1. Finding available local resource managers

To find out which local resource managers are available, check the resource property availableLocalResourceManagers of a GRAM4 factory service. Replace the host and port in the example below to query other containers:

[martin@osg-test1 ~]$ globus-wsrf-get-property \
  -s https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
  "{http://www.globus.org/namespaces/2008/03/gram/job}availableLocalResourceManagers"

The result on that machine (formatted for better readability) shows that the local resource managers Fork, Multi, Condor and PBS are available:

<ns1:availableLocalResourceManagers 
      xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
  <ns1:localResourceManager>Fork</ns1:localResourceManager>
  <ns1:localResourceManager>Multi</ns1:localResourceManager>
  <ns1:localResourceManager>Condor</ns1:localResourceManager>
  <ns1:localResourceManager>PBS</ns1:localResourceManager>
</ns1:availableLocalResourceManagers>

A more typical result in a production environment is probably Fork, Multi and just one additional LRM like Condor, PBS or LSF.
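For use in scripts, the XML returned by the query can be reduced to a plain list of names. The following is a minimal sketch, assuming the output has the element structure shown above and uses a namespace prefix; the function name list_lrms is hypothetical and not part of the toolkit.

```shell
#!/bin/sh
# list_lrms: read availableLocalResourceManagers XML on stdin and print
# one LRM name per line. Hypothetical helper, not part of the toolkit.
# Splitting on '<' first makes it work whether or not the XML is
# formatted across several lines.
list_lrms() {
    tr '<' '\n' | sed -n 's|^[^/>]*:localResourceManager>\(..*\)$|\1|p'
}

# Sample input copied from the query result shown above.
sample='<ns1:availableLocalResourceManagers
      xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
  <ns1:localResourceManager>Fork</ns1:localResourceManager>
  <ns1:localResourceManager>Multi</ns1:localResourceManager>
  <ns1:localResourceManager>Condor</ns1:localResourceManager>
  <ns1:localResourceManager>PBS</ns1:localResourceManager>
</ns1:availableLocalResourceManagers>'

printf '%s\n' "$sample" | list_lrms
```

Against a real container, the output of the globus-wsrf-get-property command shown above could be piped into list_lrms instead of the sample text.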

2.2. Finding the default local resource manager

To find out which local resource manager is the default, check the resource property localResourceManager of a GRAM4 factory service, which is the property queried in the example below. Replace the host and port in the example to query other containers:

[martin@osg-test1 ~]$ globus-wsrf-get-property \
  -s https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
  "{http://www.globus.org/namespaces/2008/03/gram/job}localResourceManager"

The result on that machine shows that PBS is the default local resource manager:

<ns1:localResourceManager xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
    PBS
</ns1:localResourceManager>

3. Submitting Jobs Specified in JDD

3.1. Simple interactive job

Use the globusrun-ws program to submit a simple job without writing a job description document. With the -c argument, a job description is generated on the assumption that the first argument is the executable and the remaining ones are its arguments. For example:

% globusrun-ws -submit -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:4a92c06c-b371-11d9-9601-0002a5ad41e5
Termination time: 04/23/2005 20:58 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.

Confirm on the server-side that the job worked by verifying the file was touched:

% ls -l ~/touched_it 
-rw-r--r--  1 smartin globdev 0 Apr 22 15:59 /home/smartin/touched_it

% date
Fri Apr 22 15:59:20 CDT 2005

Note: You did not tell globusrun-ws where to run your job, so the default of localhost was used.

Also note that globusrun-ws destroyed the job after it was fully processed.

We call this kind of job interactive, because globusrun-ws does not return after submission. It subscribes to status update notifications for the job and informs the user of each state change as soon as it happens. Once it learns that the job has been fully processed, it destroys the job, which means that internal state belonging to the job is cleaned up on the server side.
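Because an interactive run reports every state change, a wrapper script can read the final state from the client output. The following is a minimal sketch that assumes the "Current job state:" line format shown in the sample output above; the function name last_job_state is hypothetical.

```shell
#!/bin/sh
# last_job_state: read globusrun-ws output on stdin and print the last
# state reported in a "Current job state:" line. Hypothetical helper.
last_job_state() {
    sed -n 's/^ *Current job state: *//p' | tail -n 1
}

# Sample output copied from the interactive run shown above.
sample='Submitting job...Done.
Job ID: uuid:4a92c06c-b371-11d9-9601-0002a5ad41e5
Termination time: 04/23/2005 20:58 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.'

state=$(printf '%s\n' "$sample" | last_job_state)
echo "final state: $state"
```

With a real submission, the client output would be piped into last_job_state instead of the sample text. Whether the status lines arrive on standard output or standard error is not stated here, so redirecting with 2>&1 before the pipe may be needed.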

3.2. Streaming output

A user can request that the output of the program is sent back directly to the client as soon as it's available. This is useful if a user does not want to do additional file staging for a quick job. To enable this, specify the -s option.

[martin@osg-test1 ~]$ globusrun-ws -submit \
    -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
    -s -c /bin/echo hello world!
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:1731f602-22fe-11dd-879c-0013d4c3b957
Termination time: 05/16/3008 04:10 GMT
Current job state: Active
Current job state: CleanUp-Hold
hello world!
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

Note that a GridFTP server must be running on the remote machine (lucky0) to enable streaming.

Note that streaming output adds some overhead to the submission and will probably be significantly slower compared to a job without streaming. An alternative to streaming is to use staging to transport the output of the executable back to the client. This however requires that a GridFTP server is running on the client machine.

3.3. Using a contact string

Use globusrun-ws to submit the same touch job, but this time tell globusrun-ws to run the job on another machine (lucky0.mcs.anl.gov:8443). A GT4 server with GRAM4 installed must run on that machine and listen on port 8443.

% globusrun-ws -submit \
   -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
   -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:3050ad64-b375-11d9-be11-0002a5ad41e5
Termination time: 04/23/2005 21:26 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.

Type globusrun-ws -help to learn the details about the contact string.

3.4. Using a job description

For anything beyond a simple command line, the user describes the job in a job description XML file.

Here is an example of a simple job description:

<job>
    <executable>/bin/echo</executable>
    <argument>this is an example_string </argument>
    <argument>Globus was here</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>

Tell globusrun-ws to read the job description from a file, using the -f argument:

% bin/globusrun-ws -submit -f simple.xml
    Submitting job...Done.
    Job ID: uuid:c51fe35a-4fa3-11d9-9cfc-000874404099
    Termination time: 12/17/2004 20:47 GMT
    Current job state: Active
    Current job state: CleanUp
    Current job state: Done
    Destroying job...Done.
    

Note the usage of the substitution variable ${GLOBUS_USER_HOME} which resolves to the user home directory.
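When the job description is written from a shell script, the substitution variable has to reach GRAM4 unexpanded. The following is a minimal sketch that writes the simple description above to simple.xml with a quoted here-document delimiter; the guard around the submission is an addition for machines without the client installed.

```shell
#!/bin/sh
# Write the simple job description to simple.xml.
# The quoted delimiter ('EOF') stops the local shell from expanding
# ${GLOBUS_USER_HOME}; the variable must be resolved on the server side.
cat > simple.xml <<'EOF'
<job>
    <executable>/bin/echo</executable>
    <argument>this is an example_string </argument>
    <argument>Globus was here</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
EOF

# Submit only if the client is installed on this machine.
if command -v globusrun-ws >/dev/null 2>&1; then
    globusrun-ws -submit -f simple.xml
else
    echo "globusrun-ws not found; wrote simple.xml only"
fi
```

With an unquoted delimiter, the local shell would replace ${GLOBUS_USER_HOME} with the value of a local variable of that name, which is usually empty.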

Here is an example with more job description parameters:

<?xml version="1.0" encoding="UTF-8"?>
<job>
    <executable>/bin/echo</executable>
    <directory>/tmp</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>34</argument>
    <argument>this is an example_string </argument>
    <argument>Globus was here</argument>
    <environment>
        <name>PI</name>
        <value>3.141</value>
    </environment>
    <stdin>/dev/null</stdin>
    <stdout>stdout</stdout>
    <stderr>stderr</stderr>
    <count>2</count>
</job>

Note that in this example, a <directory> element specifies the current directory for the execution of the command on the execution machine to be /tmp, and the standard output is specified as the relative path stdout. The output is therefore written to /tmp/stdout:

% cat /tmp/stdout
    12 abc 34 this is an example_string  Globus was here
    

3.5. Using a contact string in the job description

Instead of specifying the contact string on the command-line, you can also put it in the job description:

<job xmlns:wsa="http://www.w3.org/2005/08/addressing">
    <factoryEndpoint>
      <wsa:Address>
          https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
    </factoryEndpoint>
    <executable>/bin/date</executable>
</job>

Submit the job with the following command (assuming the above description has been stored in the file job.xml):

% bin/globusrun-ws -submit -f job.xml

Note: This time you don't have to specify the -F option.

3.6. Specifying a local resource manager

None of the examples so far specified any information about the local resource manager. If a user does not specify one, the job is run by the default local resource manager, which is defined on the server side. For example, if an administrator configured Condor as the default local resource manager, the jobs submitted so far are managed by Condor on the server side.

Check the section Local resource managers interfaced by a GRAM4 installation to find out which local resource managers are available in a GRAM4 installation and which one is configured as the default.

3.6.1. Submitting to the default local resource manager

To submit a job to the default local resource manager, simply do not specify any local resource manager in your submission, neither in the job description nor on the command line. The examples above show how to do this.

3.6.2. Submitting to a non-default local resource manager

If you want to submit a job to a non-default local resource manager, or if you just want to be explicit in what you specify, you'll have to specify the local resource manager in your submission. Using globusrun-ws, there are two ways to specify a local resource manager:

  • as command-line argument of globusrun-ws (-Ft <lrm>)
  • in the factoryEndpoint element in the job description

Example: the following job will be submitted to Condor:

globusrun-ws -submit \
  -F osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
  -Ft Condor \
  -c /bin/date

Or with a job description that contains a factoryEndpoint:

<job xmlns:wsa="http://www.w3.org/2005/08/addressing"
    xmlns:gram="http://www.globus.org/namespaces/2008/03/gram/job">
    <factoryEndpoint>
      <wsa:Address>
          https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
      <wsa:ReferenceParameters>
        <gram:ResourceID>Condor</gram:ResourceID>
      </wsa:ReferenceParameters>
    </factoryEndpoint>
    <executable>/bin/date</executable>
</job>

Submit that job (assuming the description is stored in the file myJob.xml):

globusrun-ws -submit -f myJob.xml
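A job description like the one above can also be generated for an arbitrary factory and local resource manager. The following is a minimal sketch; the function name make_lrm_job and its argument order are hypothetical, and the values are inserted verbatim, so they must not contain XML special characters.

```shell
#!/bin/sh
# make_lrm_job: print a job description that targets a given factory
# and local resource manager. Hypothetical helper for illustration.
# Arguments: factory URL, LRM name, executable path.
make_lrm_job() {
    cat <<EOF
<job xmlns:wsa="http://www.w3.org/2005/08/addressing"
    xmlns:gram="http://www.globus.org/namespaces/2008/03/gram/job">
    <factoryEndpoint>
      <wsa:Address>$1</wsa:Address>
      <wsa:ReferenceParameters>
        <gram:ResourceID>$2</gram:ResourceID>
      </wsa:ReferenceParameters>
    </factoryEndpoint>
    <executable>$3</executable>
</job>
EOF
}

# Recreate the Condor example from above in myJob.xml.
make_lrm_job \
    https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
    Condor /bin/date > myJob.xml
```

The resulting file can be submitted with the globusrun-ws -submit -f myJob.xml command shown above.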

3.7. Job with staging

In order to do file staging, one must add specific elements to the job description and delegate credentials appropriately (see Section 1, “Delegating credentials”). The file transfer directives follow the RFT syntax, which allows only for third-party transfers. Each file transfer must therefore specify a source URL and a destination URL. URLs are specified as GridFTP URLs (for remote files) or as file URLs (for files local to the service; these are converted internally to full GridFTP URLs by the service).

For instance, when staging a file in, the source URL would be a GridFTP URL (for instance gsiftp://job.submitting.host:2811/tmp/mySourceFile) resolving to a source document accessible on the file system of the job submission machine (for instance /tmp/mySourceFile). At run time the Reliable File Transfer service used by the MEJS on the remote machine would reliably fetch the remote file using the GridFTP protocol and write it to the specified local file (for instance file:///${GLOBUS_USER_HOME}/my_transfered_file, which resolves to ~/my_transfered_file). Here is what the stage-in directive would look like:

<fileStageIn>
    <transfer>
        <sourceUrl>gsiftp://job.submitting.host:2811/tmp/mySourceFile</sourceUrl>
        <destinationUrl>file:///${GLOBUS_USER_HOME}/my_transfered_file</destinationUrl>
    </transfer>
</fileStageIn>

Note: additional RFT-defined quality of service requirements can be specified for each transfer. See the RFT documentation for more information.

Here is an example job description with file stage-in and stage-out:

<job>
    <executable>my_echo</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>Hello</argument>
    <argument>World!</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
    <fileStageIn>
        <transfer>
            <sourceUrl>gsiftp://job.submitting.host:2811/bin/echo</sourceUrl>
            <destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
        </transfer>
    </fileStageIn>
    <fileStageOut>
        <transfer>
            <sourceUrl>file:///${GLOBUS_USER_HOME}/stdout</sourceUrl>
            <destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl>
        </transfer>
    </fileStageOut>
    <fileCleanUp>
        <deletion>
            <file>file:///${GLOBUS_USER_HOME}/my_echo</file>
        </deletion>
    </fileCleanUp>
</job>

Note that the job description XML does not need to include a reference to the schema that describes its syntax. As a matter of fact it is possible to omit the namespace in the GRAM job description XML elements as well. The submission of this job to the GRAM services causes the following sequence of actions:

  1. The /bin/echo executable is transferred from the submission machine to the GRAM host file system. The destination location is the HOME directory of the user on behalf of whom the job is executed by the GRAM services (see <fileStageIn>).
  2. The transferred executable is used to print a test string (see <executable>, <directory> and the <argument> elements) on the standard output, which is redirected to a local file (see <stdout>).
  3. The standard output file is transferred to the submission machine (see <fileStageOut>).
  4. The file that was initially transferred during the stage-in phase is removed from the file system of the GRAM installation (see <fileCleanUp>).

Submit that job (assuming the description is stored in the file myJob.xml):

globusrun-ws -submit -S -f myJob.xml

The flag -S tells globusrun-ws to delegate credentials so that GRAM4 can call the file transfer service RFT on behalf of the submitting user, and so that RFT can interact with the GridFTP servers on behalf of the submitting user.

If you have already delegated credentials (see Delegating credentials for how to delegate a credential), have the endpoint reference of those delegated credentials stored in the file delegCred.epr, and want them to be used for the transfers instead of having globusrun-ws delegate new credentials, you can tell globusrun-ws to use your credentials:

globusrun-ws -submit -Sf delegCred.epr -Tf delegCred.epr -f myJob.xml

The -Sf flag specifies the credential that GRAM4 uses to call RFT on behalf of the user, and the -Tf flag specifies the credential that RFT uses to interact with the GridFTP servers.
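Reusing one delegated credential pays off when several staging jobs are submitted in a row. The following is a minimal sketch of that workflow using only the commands and flags shown in this document. The job file names, the helper names run and submit_all, and the DRY_RUN switch are hypothetical, and the sketch assumes that the credential is delegated to the same container that runs the jobs.

```shell
#!/bin/sh
# Submit several staging jobs with one shared delegated credential.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0
# on a machine with the GT4 clients installed to execute them.
DRY_RUN=${DRY_RUN:-1}
CONTAINER=https://lucky0.mcs.anl.gov:8443/wsrf/services

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# submit_all: delegate once, submit each job description given as an
# argument with the shared credential, then destroy the credential.
submit_all() {
    run globus-credential-delegate \
        -s "$CONTAINER/DelegationFactoryService" delegCred.epr || return 1
    for job in "$@"; do
        run globusrun-ws -submit \
            -F "$CONTAINER/ManagedJobFactoryService" \
            -Sf delegCred.epr -Tf delegCred.epr -f "$job"
    done
    run wsrf-destroy -e delegCred.epr
}

# job1.xml and job2.xml are placeholder file names.
submit_all job1.xml job2.xml
```

Each submission in this sketch is interactive, so the jobs run one after another; the credential is destroyed only after the last job has returned.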

3.8. Specifying a local user id in the job description

If a user has more than one user account on a server and the distinguished name (DN) of the user's certificate is mapped to all these user accounts, a user can specify which local account should be used by GRAM4 for the job submission. By default the first local user account that is defined is used for job submission. If this is not the one that should be used the user must explicitly specify the account to be used. The following dummy job description shows how to do this:

<job>
    <localUserId>stu</localUserId>
    <executable>/bin/date</executable>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>

3.9. Using substitution variables

To allow values such as paths to be customized on a per-job basis, a job description substitution variable named "GLOBUS_JOB_ID" can be used.

For example:

<job>
    <executable>/bin/date</executable>
    <stdout>/tmp/stdout.${GLOBUS_JOB_ID}</stdout>
    <stderr>/tmp/stderr.${GLOBUS_JOB_ID}</stderr>
    <fileStageOut>
        <transfer>
            <sourceUrl>file:///tmp/stdout.${GLOBUS_JOB_ID}</sourceUrl>
            <destinationUrl>gsiftp://mymachine.mydomain.com/out.${GLOBUS_JOB_ID}</destinationUrl>
        </transfer>
    </fileStageOut>
</job>

More information about substitution variables can be found here.

3.10. Using custom job description extensions

Basic support is provided for specifying custom extensions to the job description. There are plans to improve the usability of this feature, but at this time it involves a bit of work.

Specifying the actual custom elements in the job description is trivial. Simply add any elements that you need between the beginning and ending extensions tags at the bottom of the job description as in the following basic example:

    <job>
        <executable>/home/user1/myapp</executable>
        <extensions>
            <myData>
                <flag1>on</flag1>
                <flag2>off</flag2>
            </myData>
        </extensions>
    </job>
    

To handle this data, you will have to alter the appropriate Perl scheduler script (e.g. $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/fork.pm for the Fork scheduler) to parse the data returned from the $description->extensions() sub.

For more information about extensions see the Extensions section.

3.11. Multi-Job

The job description XML schema allows for specification of a multijob i.e. a job that is itself composed of several executable jobs, which we will refer to as subjobs (note: subjobs cannot be multijobs, so the structure is not recursive). This is useful for instance in order to bundle a group of jobs together and submit them as a whole to a remote GRAM installation.

Note that no relationship can be specified between the subjobs of a multijob. The subjobs are submitted to job factory services in their order of appearance in the multijob description.

Within a multijob description, each subjob description must come along with an endpoint for the factory to submit the subjob to. This enables the at-once submission of several jobs to different hosts. The factory to which the multijob is submitted acts as an intermediary tier between the client and the eventual executable job factories.

Here is an example of a multijob description:

<?xml version="1.0" encoding="UTF-8"?>
<multiJob xmlns:wsa="http://www.w3.org/2005/08/addressing">
    <factoryEndpoint>
       <wsa:Address>
          https://localhost:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
    </factoryEndpoint>
    <directory>${GLOBUS_LOCATION}</directory>
    <count>1</count>

    <job>
       <factoryEndpoint>
         <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
       </factoryEndpoint>
       <executable>/bin/date</executable>
       <stdout>${GLOBUS_USER_HOME}/stdout.p1</stdout>
       <stderr>${GLOBUS_USER_HOME}/stderr.p1</stderr>
       <count>2</count>
    </job>

    <job>
       <factoryEndpoint>
         <wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
       </factoryEndpoint>
       <executable>/bin/echo</executable>
       <argument>Hello World!</argument>        
       <stdout>${GLOBUS_USER_HOME}/stdout.p2</stdout>
       <stderr>${GLOBUS_USER_HOME}/stderr.p2</stderr>
       <count>1</count>
    </job>
</multiJob>

Submit the multi-job with the following command:

% bin/globusrun-ws -submit -f test_multi.xml
    Delegating user credentials...Done.
    Submitting job...Done.
    Job ID: uuid:bd9cd634-4fc0-11d9-9ee1-000874404099
    Termination time: 12/18/2004 00:15 GMT
    Current job state: Active
    Current job state: CleanUp
    Current job state: Done
    Destroying job...Done.
    Cleaning up any delegated credentials...Done.

Note: When you submit a multi-job, you don't have to specify the local resource manager, although you can do so. The fact that it's a multi-job is detected on the server side, and the right "local resource manager", Multi, is used automatically.

Note: In this multi-job description the sub-jobs are submitted to the default local resource manager. If you want them to be submitted to a non-default local resource manager, you'll have to specify that in an additional ReferenceParameters element in the factoryEndpoint element of each sub-job. See Section 3.6.2, “Submitting to a non-default local resource manager”, for more information about this.

A multijob resource is created by the factory and exposes a set of WSRF resource properties different than the resource properties of an executable job. The state machine of a multijob is also different since the multijob represents the overall execution of all the executable jobs it is composed of.
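Since every subjob carries its own factory endpoint, a multijob description for several hosts can be generated from a list of factory URLs. The following is a minimal sketch that mirrors the structure of the example above with one /bin/date subjob per factory; the function name make_multijob and the choice of subjob are hypothetical.

```shell
#!/bin/sh
# make_multijob: print a multijob description with one /bin/date subjob
# per factory URL. Hypothetical helper for illustration.
# The first argument is the factory that receives the multijob itself;
# the remaining arguments are the factories for the subjobs.
make_multijob() {
    multi_factory=$1
    shift
    echo '<?xml version="1.0" encoding="UTF-8"?>'
    echo '<multiJob xmlns:wsa="http://www.w3.org/2005/08/addressing">'
    echo '    <factoryEndpoint>'
    echo "      <wsa:Address>$multi_factory</wsa:Address>"
    echo '    </factoryEndpoint>'
    echo '    <directory>${GLOBUS_LOCATION}</directory>'
    echo '    <count>1</count>'
    n=0
    for factory in "$@"; do
        n=$((n + 1))
        # \$ keeps the substitution variable literal in the output.
        cat <<EOF
    <job>
       <factoryEndpoint>
         <wsa:Address>$factory</wsa:Address>
       </factoryEndpoint>
       <executable>/bin/date</executable>
       <stdout>\${GLOBUS_USER_HOME}/stdout.p$n</stdout>
       <stderr>\${GLOBUS_USER_HOME}/stderr.p$n</stderr>
    </job>
EOF
    done
    echo '</multiJob>'
}

make_multijob \
    https://localhost:8443/wsrf/services/ManagedJobFactoryService \
    https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
    https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
    > test_multi.xml
```

The generated test_multi.xml can be submitted in the same way as the multijob example above.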

4. Submitting jobs with metascheduling functionality

Check GT 4.2.0 GridWay: User's Guide if you are looking for information on metascheduling functionality.