Table of Contents
- 1. Delegating credentials
- 2. Local resource managers interfaced by a GRAM4 installation
- 3. Submitting Jobs Specified in JDD
- 3.1. Simple interactive job
- 3.2. Streaming output
- 3.3. Using a contact string
- 3.4. Using a job description
- 3.5. Using a contact string in the job description
- 3.6. Specifying a local resource manager
- 3.7. Job with staging
- 3.8. Specifying a local user id in the job description
- 3.9. Using substitution variables
- 3.10. Using custom job description extensions
- 3.11. Multi-Job
- 4. Submitting jobs with metascheduling functionality
There are three different uses of delegated credentials:
- for use by the MEJS to create a remote user proxy
- for use by the MEJS to contact RFT
- for use by RFT to contact the GridFTP servers
The EPRs to each of these are specified in three job description elements: jobCredentialEndpoint, stagingCredentialEndpoint, and transferCredentialEndpoint, respectively. Please see the Job Description Schema Reference and the RFT transfer request schema documentation for more details about these elements.
The globusrun-ws client can either delegate
these credentials automatically for a particular job, or it can reuse
pre-delegated credentials (see next paragraph) through the use of command-line
arguments for specifying the credentials' EPR files. Please see the
GRAM4 Commands for details on these command-line arguments.
It is possible to use delegation command-line tools to obtain and refresh delegated credentials in order to use them when submitting jobs to GRAM4. This, for instance, enables the submission of many jobs using a shared set of delegated credentials. This can significantly decrease the number of remote calls for a set of jobs, thus improving performance.
The following example shows how to delegate credentials.
globus-credential-delegate delegates to the specified
delegation factory on lucky0.mcs.anl.gov, prints some information, and stores the
endpoint reference of the delegated credentials in the file delegCred.epr:
[martin@osg-test1 ~]$ globus-credential-delegate \
> -s https://lucky0.mcs.anl.gov:8443/wsrf/services/DelegationFactoryService \
> delegCred.epr
Delegated credential EPR:
Address: https://lucky0.mcs.anl.gov:8443/wsrf/services/DelegationService
Reference property[0]:
<ns1:DelegationKey xmlns:ns1="http://www.globus.org/08/2004/delegationService">
55e2a450-58be-11dd-b83c-e4ec640dfe13
</ns1:DelegationKey>
To destroy the delegated credential use wsrf-destroy:
[martin@osg-test1 jobs]$ wsrf-destroy -e delegCred.epr
Destroy operation was successful
For more information about the delegation command-line tools, see Command-line tools.
A GRAM4 instance can interface to more than one local resource manager (LRM), as shown in the previous section. A user can explicitly specify which LRM should be used for a job. In a larger Grid, however, it can be hard for users to remember which LRMs are available on which machines.
That's why GRAM4 configures a default local resource manager, which is used for job submission if the client does not explicitly specify one.
To find out which local resource managers are available, check the resource property
availableLocalResourceManagers
of a GRAM4 factory service. Replace host and port in
the example below to query other containers:
[martin@osg-test1 ~]$ globus-wsrf-get-property \
-s https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
"{http://www.globus.org/namespaces/2008/03/gram/job}availableLocalResourceManagers"
The result on that machine (formatted for better readability) shows that the local resource managers Fork, Multi, Condor and PBS are available:
<ns1:availableLocalResourceManagers
xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
<ns1:localResourceManager>Fork</ns1:localResourceManager>
<ns1:localResourceManager>Multi</ns1:localResourceManager>
<ns1:localResourceManager>Condor</ns1:localResourceManager>
<ns1:localResourceManager>PBS</ns1:localResourceManager>
</ns1:availableLocalResourceManagers>
A more typical result in a production environment is probably Fork, Multi and just one additional LRM such as Condor, PBS or LSF.
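If you capture the output of the query above in a script, the list of available LRMs can be extracted with a few lines of Python. This is only a sketch: the function name `available_lrms` is hypothetical, and it assumes the property is returned exactly in the form shown above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the query result shown above.
GRAM_NS = "http://www.globus.org/namespaces/2008/03/gram/job"

def available_lrms(xml_text):
    """Return the LRM names found in an availableLocalResourceManagers document."""
    root = ET.fromstring(xml_text)
    tag = "{%s}localResourceManager" % GRAM_NS
    return [(element.text or "").strip() for element in root.iter(tag)]

# Sample input: the result printed by globus-wsrf-get-property above.
SAMPLE = """<ns1:availableLocalResourceManagers
    xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
  <ns1:localResourceManager>Fork</ns1:localResourceManager>
  <ns1:localResourceManager>Multi</ns1:localResourceManager>
  <ns1:localResourceManager>Condor</ns1:localResourceManager>
  <ns1:localResourceManager>PBS</ns1:localResourceManager>
</ns1:availableLocalResourceManagers>"""

print(available_lrms(SAMPLE))  # ['Fork', 'Multi', 'Condor', 'PBS']
```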
To find out which local resource manager is the default, check the resource property
localResourceManager
of a GRAM4 factory service. Replace host and port in
the example below to query other containers:
[martin@osg-test1 ~]$ globus-wsrf-get-property \
-s https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
"{http://www.globus.org/namespaces/2008/03/gram/job}localResourceManager"
The result on that machine shows that PBS is the default local resource manager:
<ns1:localResourceManager xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
PBS
</ns1:localResourceManager>
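A client script can use the default LRM to decide whether it has to name an LRM explicitly. The following sketch is an illustration only: `default_lrm` and `lrm_arguments` are hypothetical helper names, and the `-Ft` option they emit is the globusrun-ws option described in the section on specifying a local resource manager.

```python
import xml.etree.ElementTree as ET

def default_lrm(xml_text):
    """Return the LRM name from a localResourceManager property document."""
    root = ET.fromstring(xml_text)
    return (root.text or "").strip()

def lrm_arguments(wanted, default):
    """Return the extra globusrun-ws arguments needed to reach the wanted LRM.

    No argument is needed when the caller has no preference or when the
    wanted LRM is already the server-side default.
    """
    if wanted is None or wanted == default:
        return []
    return ["-Ft", wanted]

# Sample input: the result printed by globus-wsrf-get-property above.
SAMPLE = """<ns1:localResourceManager xmlns:ns1="http://www.globus.org/namespaces/2008/03/gram/job">
PBS
</ns1:localResourceManager>"""

default = default_lrm(SAMPLE)
print(default)                           # PBS
print(lrm_arguments("Condor", default))  # ['-Ft', 'Condor']
print(lrm_arguments("PBS", default))     # []
```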
Use the globusrun-ws program to submit a
simple job without writing a job description document. With the -c argument,
a job description is generated for you, assuming that the first argument is the executable
and the remaining ones are its arguments. For example:
% globusrun-ws -submit -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:4a92c06c-b371-11d9-9601-0002a5ad41e5
Termination time: 04/23/2005 20:58 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Confirm on the server-side that the job worked by verifying the file was touched:
% ls -l ~/touched_it
-rw-r--r-- 1 smartin globdev 0 Apr 22 15:59 /home/smartin/touched_it
% date
Fri Apr 22 15:59:20 CDT 2005
Note: You did not tell globusrun-ws where to run your job, so the default of localhost was used.
Also note that globusrun-ws destroyed the job after it was fully processed.
We call this kind of job interactive because globusrun-ws does not return after submission. It subscribes to status update notifications for the job and informs the user of each state change as soon as it occurs. Once it learns that the job has been fully processed, it destroys the job, which means that internal state belonging to the job is cleaned up on the server-side.
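When globusrun-ws is driven from a script, the state changes it reports can be recovered from its output. The sketch below assumes the output format shown in the transcript above ("Current job state: ..." lines); the function names are hypothetical.

```python
STATE_PREFIX = "Current job state:"

def job_states(output):
    """Return the reported job states, in the order they were printed."""
    return [line[len(STATE_PREFIX):].strip()
            for line in output.splitlines()
            if line.startswith(STATE_PREFIX)]

def reached_done(output):
    """True if the last state reported in the output is Done."""
    states = job_states(output)
    return bool(states) and states[-1] == "Done"

# Sample input: the transcript of the interactive touch job above.
TRANSCRIPT = """Submitting job...Done.
Job ID: uuid:4a92c06c-b371-11d9-9601-0002a5ad41e5
Termination time: 04/23/2005 20:58 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done."""

print(job_states(TRANSCRIPT))    # ['Active', 'CleanUp', 'Done']
print(reached_done(TRANSCRIPT))  # True
```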
A user can request that the output of the program is sent back directly to
the client as soon as it's available. This is useful if a user does not
want to do additional file staging for a quick job. To enable this, specify
the -s option.
[martin@osg-test1 ~]$ globusrun-ws -submit \
-F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
-s -c /bin/echo hello world!
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:1731f602-22fe-11dd-879c-0013d4c3b957
Termination time: 05/16/3008 04:10 GMT
Current job state: Active
Current job state: CleanUp-Hold
hello world!
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
Note that a GridFTP server must be running on the remote machine (lucky0) to enable streaming.
Note that streaming output adds some overhead to the submission and will probably be significantly slower compared to a job without streaming. An alternative to streaming is to use staging to transport the output of the executable back to the client. This however requires that a GridFTP server is running on the client machine.
Use globusrun-ws to submit the same touch job, but this time tell globusrun-ws to run the job on another machine (lucky0.mcs.anl.gov:8443). A GT4 server with GRAM4 installed must run on that machine and listen on port 8443.
% globusrun-ws -submit \
-F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService \
-c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:3050ad64-b375-11d9-be11-0002a5ad41e5
Termination time: 04/23/2005 21:26 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Type globusrun-ws -help to learn the details about the contact string.
The user specifies the job to submit in a job description XML file.
Here is an example of a simple job description:
<job>
<executable>/bin/echo</executable>
<argument>this is an example_string </argument>
<argument>Globus was here</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
Tell globusrun-ws to read the job description from a file, using the -f argument:
% bin/globusrun-ws -submit -f simple.xml
Submitting job...Done.
Job ID: uuid:c51fe35a-4fa3-11d9-9cfc-000874404099
Termination time: 12/17/2004 20:47 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Note the usage of the substitution variable ${GLOBUS_USER_HOME}
which resolves to the user home directory.
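Job descriptions like the simple one above can also be generated rather than written by hand. The following is a minimal sketch, assuming only the element names used in that example; `simple_job` is a hypothetical helper, not part of the Globus Toolkit.

```python
import xml.etree.ElementTree as ET

def simple_job(executable, arguments, stdout, stderr):
    """Build a job description with the elements used in the example above."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    ET.SubElement(job, "stdout").text = stdout
    ET.SubElement(job, "stderr").text = stderr
    return ET.tostring(job, encoding="unicode")

description = simple_job(
    "/bin/echo",
    ["this is an example_string ", "Globus was here"],
    "${GLOBUS_USER_HOME}/stdout",
    "${GLOBUS_USER_HOME}/stderr",
)
print(description)
```

The printed document can be saved to a file such as simple.xml and submitted with the -f argument as shown above.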
Here is an example with more job description parameters:
<?xml version="1.0" encoding="UTF-8"?>
<job>
<executable>/bin/echo</executable>
<directory>/tmp</directory>
<argument>12</argument>
<argument>abc</argument>
<argument>34</argument>
<argument>this is an example_string </argument>
<argument>Globus was here</argument>
<environment>
<name>PI</name>
<value>3.141</value>
</environment>
<stdin>/dev/null</stdin>
<stdout>stdout</stdout>
<stderr>stderr</stderr>
<count>2</count>
</job>
Note that in this example the <directory> element sets the working directory of the command on the execution machine to /tmp, and the standard output is
specified as the relative path stdout. The output is therefore written to /tmp/stdout:
% cat /tmp/stdout
12 abc 34 this is an example_string Globus was here
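The way the relative stdout path combines with the `<directory>` element can be mimicked on the client. This sketch assumes ordinary POSIX path semantics, which matches the /tmp/stdout result described above; `effective_path` is a hypothetical name.

```python
import posixpath

def effective_path(directory, path):
    """Resolve a job description path against the job's working directory.

    Absolute paths are returned unchanged; relative paths are joined
    to the directory, as with the stdout element in the example above.
    """
    return posixpath.normpath(posixpath.join(directory, path))

print(effective_path("/tmp", "stdout"))       # /tmp/stdout
print(effective_path("/tmp", "/var/stdout"))  # /var/stdout
```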
Instead of specifying the contact string on the command-line, you can also put it in the job description:
<job xmlns:wsa="http://www.w3.org/2005/08/addressing">
<factoryEndpoint>
<wsa:Address>
https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService
</wsa:Address>
</factoryEndpoint>
<executable>/bin/date</executable>
</job>
Submit the job with the following command (assuming the above description has been stored in the file job.xml):
% bin/globusrun-ws -submit -f job.xml
Note: This time you don't have to specify the -F option.
Note that so far you have not specified any information about the local resource manager. If a user does not specify one, the job is run by the default local resource manager, which is defined on the server-side. If, for example, an admin configured Condor as the default local resource manager, then the jobs submitted so far will be managed by Condor on the server-side.
Check the section Local resource managers interfaced by a GRAM4 installation to find out which local resource managers are available in a GRAM4 installation and which one is configured as the default.
As said, to submit a job to the default local resource manager, simply do not specify any local resource manager in your submission, neither in the job description nor on the command-line. The above examples show how to do it.
If you want to submit a job to a non-default local resource manager, or if you just want to be explicit in what you specify, you'll have to specify the local resource manager in your submission. Using globusrun-ws, there are two ways to specify a local resource manager:
- as a command-line argument of globusrun-ws (-Ft <lrm>)
- in the factoryEndpoint element in the job description
Example: the following job will be submitted to Condor:
globusrun-ws -submit \
-F osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService \
-Ft Condor \
-c /bin/date
Or with a job description that contains a factoryEndpoint:
<job xmlns:wsa="http://www.w3.org/2005/08/addressing"
xmlns:gram="http://www.globus.org/namespaces/2008/03/gram/job">
<factoryEndpoint>
<wsa:Address>
https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService
</wsa:Address>
<wsa:ReferenceParameters>
<gram:ResourceID>Condor</gram:ResourceID>
</wsa:ReferenceParameters>
</factoryEndpoint>
<executable>/bin/date</executable>
</job>
Submit that job (assuming the description is stored in the file myJob.xml):
globusrun-ws -submit -f myJob.xml
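The factoryEndpoint element, with or without the ResourceID reference parameter, can be generated programmatically. The sketch below uses the two namespaces from the example above; `job_with_factory` is a hypothetical helper, and it covers only the elements shown in this section.

```python
import xml.etree.ElementTree as ET

# Namespaces copied from the job description above.
WSA = "http://www.w3.org/2005/08/addressing"
GRAM = "http://www.globus.org/namespaces/2008/03/gram/job"

def job_with_factory(address, executable, lrm=None):
    """Build a job description that names its factory and, optionally, an LRM."""
    ET.register_namespace("wsa", WSA)
    ET.register_namespace("gram", GRAM)
    job = ET.Element("job")
    endpoint = ET.SubElement(job, "factoryEndpoint")
    ET.SubElement(endpoint, "{%s}Address" % WSA).text = address
    if lrm is not None:
        # Only a non-default (or explicitly chosen) LRM needs this element.
        parameters = ET.SubElement(endpoint, "{%s}ReferenceParameters" % WSA)
        ET.SubElement(parameters, "{%s}ResourceID" % GRAM).text = lrm
    ET.SubElement(job, "executable").text = executable
    return ET.tostring(job, encoding="unicode")

FACTORY = "https://osg-test1.unl.edu:8443/wsrf/services/ManagedJobFactoryService"
print(job_with_factory(FACTORY, "/bin/date", lrm="Condor"))
```

Leaving `lrm` unset produces the description from the previous section, which runs on the default local resource manager.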
In order to do file staging, you must add specific elements to the job description and delegate credentials appropriately (see Section 1, “Delegating credentials”). The file transfer directives follow the RFT syntax, which allows only for third-party transfers. Each file transfer must therefore specify a source URL and a destination URL. URLs are specified as GridFTP URLs (for remote files) or as file URLs (for files local to the service; these are converted internally to full GridFTP URLs by the service).
For instance, when staging a file in, the source URL would be a GridFTP URL (for instance gsiftp://job.submitting.host:2811/tmp/mySourceFile) resolving to a source document accessible on the file system of the job submission machine (for instance /tmp/mySourceFile). At run time, the Reliable File Transfer service used by the MEJS on the remote machine would reliably fetch the remote file using the GridFTP protocol and write it to the specified local file (for instance file:///${GLOBUS_USER_HOME}/my_transfered_file, which resolves to ~/my_transfered_file). Here is what the stage-in directive would look like:
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://job.submitting.host:2811/tmp/mySourceFile</sourceUrl>
<destinationUrl>file:///${GLOBUS_USER_HOME}/my_transfered_file</destinationUrl>
</transfer>
</fileStageIn>
Note: additional RFT-defined quality of service requirements can be specified for each transfer. See the RFT documentation for more information.
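The rule that every transfer needs both a source and a destination URL, each either a GridFTP URL or a file URL, can be checked while the directive is generated. This is a sketch under that assumption only; the helper names are hypothetical and no RFT quality of service options are covered.

```python
import xml.etree.ElementTree as ET

# URL forms used for staging in this section.
STAGING_PREFIXES = ("gsiftp://", "file:///")

def is_staging_url(url):
    """True if the URL looks like a GridFTP URL or a file URL."""
    return url.startswith(STAGING_PREFIXES)

def transfer(source_url, destination_url):
    """Build one <transfer> element after checking both URLs."""
    for url in (source_url, destination_url):
        if not is_staging_url(url):
            raise ValueError("expected a gsiftp:// or file:/// URL, got %r" % url)
    element = ET.Element("transfer")
    ET.SubElement(element, "sourceUrl").text = source_url
    ET.SubElement(element, "destinationUrl").text = destination_url
    return element

def stage_in(transfers):
    """Wrap transfer elements in a <fileStageIn> element."""
    stage = ET.Element("fileStageIn")
    stage.extend(transfers)
    return stage

directive = stage_in([
    transfer("gsiftp://job.submitting.host:2811/tmp/mySourceFile",
             "file:///${GLOBUS_USER_HOME}/my_transfered_file"),
])
print(ET.tostring(directive, encoding="unicode"))
```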
Here is an example job description with file stage-in and stage-out:
<job>
<executable>my_echo</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<argument>Hello</argument>
<argument>World!</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://job.submitting.host:2811/bin/echo</sourceUrl>
<destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
</transfer>
</fileStageIn>
<fileStageOut>
<transfer>
<sourceUrl>file:///${GLOBUS_USER_HOME}/stdout</sourceUrl>
<destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl>
</transfer>
</fileStageOut>
<fileCleanUp>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/my_echo</file>
</deletion>
</fileCleanUp>
</job>
Note that the job description XML does not need to include a reference to the schema that describes its syntax. In fact, it is also possible to omit the namespace in the GRAM job description XML elements. Submitting this job to the GRAM services causes the following sequence of actions:
- The /bin/echo executable is transferred from the submission machine to the GRAM host file system. The destination location is the HOME directory of the user on behalf of whom the job is executed by the GRAM services (see <fileStageIn>).
- The transferred executable is used to print a test string (see <executable>, <directory> and the <argument> elements) on the standard output, which is redirected to a local file (see <stdout>).
- The standard output file is transferred to the submission machine (see <fileStageOut>).
- The file that was initially transferred during the stage-in phase is removed from the file system of the GRAM installation (see <fileCleanUp>).
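The sequence above can be read off a job description mechanically. The following sketch lists the actions in the order just described (stage in, run, stage out, clean up) for a shortened copy of the staging job description; it only inspects the XML on the client and does not reproduce what the GRAM services do internally. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def planned_actions(job_xml):
    """List the staging, execution and clean-up steps of a job description."""
    job = ET.fromstring(job_xml)
    actions = []
    for item in job.findall("fileStageIn/transfer"):
        actions.append("stage in %s -> %s" % (
            item.findtext("sourceUrl"), item.findtext("destinationUrl")))
    actions.append("run %s" % job.findtext("executable"))
    for item in job.findall("fileStageOut/transfer"):
        actions.append("stage out %s -> %s" % (
            item.findtext("sourceUrl"), item.findtext("destinationUrl")))
    for item in job.findall("fileCleanUp/deletion/file"):
        actions.append("delete %s" % item.text)
    return actions

# Shortened copy of the staging job description above.
JOB = """<job>
  <executable>my_echo</executable>
  <fileStageIn><transfer>
    <sourceUrl>gsiftp://job.submitting.host:2811/bin/echo</sourceUrl>
    <destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
  </transfer></fileStageIn>
  <fileStageOut><transfer>
    <sourceUrl>file:///${GLOBUS_USER_HOME}/stdout</sourceUrl>
    <destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl>
  </transfer></fileStageOut>
  <fileCleanUp><deletion>
    <file>file:///${GLOBUS_USER_HOME}/my_echo</file>
  </deletion></fileCleanUp>
</job>"""

for action in planned_actions(JOB):
    print(action)
```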
Submit that job (assuming the description is stored in the file myJob.xml):
globusrun-ws -submit -S -f myJob.xml
The -S flag tells globusrun-ws to delegate
credentials so that GRAM4 can call the file transfer service RFT on behalf of
the submitting user, and so that RFT can interact with the GridFTP servers on
behalf of the submitting user.
If you have already delegated credentials (see Delegating credentials for how to delegate a credential) and stored the endpoint reference of those credentials in the file delegCred.epr, you can tell globusrun-ws to use them for the transfers instead of delegating new credentials:
globusrun-ws -submit -Sf delegCred.epr -Tf delegCred.epr -f myJob.xml
The -Sf flag specifies that the given credential is
to be used by GRAM4 to call RFT on behalf of the user, and the
-Tf flag specifies that the given credential is
to be used by RFT to interact with the GridFTP servers.
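A wrapper script can choose between the two submission forms shown above depending on whether a pre-delegated credential is at hand. This is a minimal sketch: `submit_argv` is a hypothetical helper, it uses one EPR file for both -Sf and -Tf as in the example above, and it only builds the argument list without running globusrun-ws.

```python
def submit_argv(job_file, epr_file=None):
    """Build the globusrun-ws argument list for a job with staging.

    Without an EPR file, -S asks globusrun-ws to delegate credentials itself.
    With an EPR file, the same pre-delegated credential is passed to both
    -Sf and -Tf.
    """
    argv = ["globusrun-ws", "-submit"]
    if epr_file is None:
        argv.append("-S")
    else:
        argv += ["-Sf", epr_file, "-Tf", epr_file]
    argv += ["-f", job_file]
    return argv

print(" ".join(submit_argv("myJob.xml")))
# globusrun-ws -submit -S -f myJob.xml
print(" ".join(submit_argv("myJob.xml", "delegCred.epr")))
# globusrun-ws -submit -Sf delegCred.epr -Tf delegCred.epr -f myJob.xml
```

The returned list can be handed to a process launcher such as `subprocess.run` on a machine where globusrun-ws is installed.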
If a user has more than one user account on a server and the distinguished name (DN) of the user's certificate is mapped to all these user accounts, a user can specify which local account should be used by GRAM4 for the job submission. By default the first local user account that is defined is used for job submission. If this is not the one that should be used the user must explicitly specify the account to be used. The following dummy job description shows how to do this:
<job>
<localUserId>stu</localUserId>
<executable>/bin/date</executable>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
To allow for customization of values, such as paths, on a per-job basis, a job description substitution variable named GLOBUS_JOB_ID can be used.
For example:
<job>
<executable>/bin/date</executable>
<stdout>/tmp/stdout.${GLOBUS_JOB_ID}</stdout>
<stderr>/tmp/stderr.${GLOBUS_JOB_ID}</stderr>
<fileStageOut>
<transfer>
<sourceUrl>file:///tmp/stdout.${GLOBUS_JOB_ID}</sourceUrl>
<destinationUrl>gsiftp://mymachine.mydomain.com/out.${GLOBUS_JOB_ID}</destinationUrl>
</transfer>
</fileStageOut>
</job>
More information about substitution variables can be found here.
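To preview what a value will look like once a substitution variable is replaced, the `${NAME}` syntax can be mimicked with Python's `string.Template`, which happens to use the same notation. This is an illustration only: GRAM4 performs the real substitution, and the job ID and home directory below are made-up values.

```python
from string import Template

def expand(value, variables):
    """Replace ${NAME} placeholders in a job description value."""
    return Template(value).substitute(variables)

# Hypothetical values, chosen only for this illustration.
variables = {
    "GLOBUS_JOB_ID": "example-job-id",
    "GLOBUS_USER_HOME": "/home/stu",
}

print(expand("/tmp/stdout.${GLOBUS_JOB_ID}", variables))
# /tmp/stdout.example-job-id
print(expand("file:///${GLOBUS_USER_HOME}/my_echo", variables))
# file:////home/stu/my_echo
```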
Basic support is provided for specifying custom extensions to the job description. There are plans to improve the usability of this feature, but at this time it involves a bit of work.
Specifying the actual custom elements in the job description is trivial. Simply add any elements that you need between the beginning and ending
extensions tags at the bottom of the job
description as in the following basic example:
<job>
<executable>/home/user1/myapp</executable>
<extensions>
<myData>
<flag1>on</flag1>
<flag2>off</flag2>
</myData>
</extensions>
</job>
To handle this data, you will have to alter the appropriate Perl scheduler
script (e.g. $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/fork.pm for the
Fork scheduler) to parse the data returned from the
$description->extensions() sub.
For more information about extensions see the Extensions section.
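Before touching the Perl scheduler script, it can help to confirm that the extensions in a job description are well formed. The sketch below reads the custom elements from the example above on the client; it is not the server-side handling, which happens in Perl as described. `extension_values` is a hypothetical name and `myData` is the custom element from the example.

```python
import xml.etree.ElementTree as ET

def extension_values(job_xml, container="myData"):
    """Return the child elements of a custom extension block as a dict."""
    job = ET.fromstring(job_xml)
    block = job.find("extensions/%s" % container)
    if block is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in block}

# Sample input: the job description with extensions shown above.
JOB = """<job>
  <executable>/home/user1/myapp</executable>
  <extensions>
    <myData>
      <flag1>on</flag1>
      <flag2>off</flag2>
    </myData>
  </extensions>
</job>"""

print(extension_values(JOB))  # {'flag1': 'on', 'flag2': 'off'}
```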
The job description XML schema allows for specification of a multijob i.e. a job that is itself composed of several executable jobs, which we will refer to as subjobs (note: subjobs cannot be multijobs, so the structure is not recursive). This is useful for instance in order to bundle a group of jobs together and submit them as a whole to a remote GRAM installation.
Note that no relationship can be specified between the subjobs of a multijob. The subjobs are submitted to job factory services in their order of appearance in the multijob description.
Within a multijob description, each subjob description must come along with an endpoint for the factory to submit the subjob to. This enables the at-once submission of several jobs to different hosts. The factory to which the multijob is submitted acts as an intermediary tier between the client and the eventual executable job factories.
Here is an example of a multijob description:
<?xml version="1.0" encoding="UTF-8"?>
<multiJob xmlns:wsa="http://www.w3.org/2005/08/addressing">
<factoryEndpoint>
<wsa:Address>
https://localhost:8443/wsrf/services/ManagedJobFactoryService
</wsa:Address>
</factoryEndpoint>
<directory>${GLOBUS_LOCATION}</directory>
<count>1</count>
<job>
<factoryEndpoint>
<wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
</factoryEndpoint>
<executable>/bin/date</executable>
<stdout>${GLOBUS_USER_HOME}/stdout.p1</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr.p1</stderr>
<count>2</count>
</job>
<job>
<factoryEndpoint>
<wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
</factoryEndpoint>
<executable>/bin/echo</executable>
<argument>Hello World!</argument>
<stdout>${GLOBUS_USER_HOME}/stdout.p2</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr.p2</stderr>
<count>1</count>
</job>
</multiJob>
Submit the multi-job with the following command:
% bin/globusrun-ws -submit -f test_multi.xml
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:bd9cd634-4fc0-11d9-9ee1-000874404099
Termination time: 12/18/2004 00:15 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
Note: When you submit a multi-job you don't have to specify the local resource manager, although you can. The fact that it's a multi-job is detected on the server-side, and the appropriate "local resource manager" Multi is used automatically.
Note: In this multi-job description the sub-jobs are submitted to the default local resource manager. If you want them to be submitted to a non-default local resource manager, you have to specify that in an additional ReferenceParameters element in the factoryEndpoint element of each sub-job. See Specifying a local resource manager for more information about this.
A multijob resource is created by the factory and exposes a set of WSRF resource properties different than the resource properties of an executable job. The state machine of a multijob is also different since the multijob represents the overall execution of all the executable jobs it is composed of.
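Two of the structural rules for multijobs stated above, that each subjob needs its own factory endpoint and that subjobs cannot themselves be multijobs, can be checked before submission. This sketch looks only at those two rules and is not a substitute for schema validation; `multijob_problems` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

WSA = "http://www.w3.org/2005/08/addressing"

def multijob_problems(xml_text):
    """Return a list of structural problems found in a multijob description."""
    root = ET.fromstring(xml_text)
    if root.tag != "multiJob":
        return ["root element is <%s>, expected <multiJob>" % root.tag]
    problems = []
    for number, job in enumerate(root.findall("job"), start=1):
        address = job.findtext("factoryEndpoint/{%s}Address" % WSA)
        if not (address or "").strip():
            problems.append("subjob %d has no factory endpoint address" % number)
        if job.find("job") is not None or job.find("multiJob") is not None:
            problems.append("subjob %d contains a nested job" % number)
    return problems

FACTORY = "https://localhost:8443/wsrf/services/ManagedJobFactoryService"

GOOD = """<multiJob xmlns:wsa="http://www.w3.org/2005/08/addressing">
  <job>
    <factoryEndpoint><wsa:Address>%s</wsa:Address></factoryEndpoint>
    <executable>/bin/date</executable>
  </job>
</multiJob>""" % FACTORY

BAD = """<multiJob xmlns:wsa="http://www.w3.org/2005/08/addressing">
  <job>
    <executable>/bin/date</executable>
  </job>
</multiJob>"""

print(multijob_problems(GOOD))  # []
print(multijob_problems(BAD))   # ['subjob 1 has no factory endpoint address']
```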
Check GT 4.2.0 GridWay: User's Guide if you are looking for information on metascheduling functionality.