GT 4.1.3 Release Notes: GRAM4


1. Component Overview

The Web Services Grid Resource Allocation and Management (GRAM4) component comprises a set of WSRF-compliant Web services to locate, submit, monitor, and cancel jobs on Grid computing resources. GRAM4 is not a job scheduler, but rather a set of services and clients for communicating with a range of different batch/cluster job schedulers using a common protocol. GRAM4 is meant to address a range of jobs where reliable operation, stateful monitoring, credential management, and file staging are important.

The GRAM server is typically deployed in conjunction with the Delegation and RFT services to address data staging, delegation of proxy credentials, and computation monitoring and management in an integrated manner.

2. Feature Summary

New Features since 4.0.5

  • Support for mpich-g2 jobs:

    • multi-job submission capabilities
    • ability to coordinate processes in a job
    • ability to coordinate subjobs in a multi-job

  • Publishing of the job's exit code
  • The ability to select the account under which the remote job will be run. If a user's grid credential is mapped to multiple accounts, then the user can specify, in the RSL, under which account the job should be run.
  • Optional client-specified hold on a state. The hold is released with the new "release" operation.
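
As an illustration of the account-selection feature, the following is a minimal sketch of building a job description that names the local account to run under. The element names used here (job, executable, argument, localUserId) and the account name "alice" are assumptions made for this example and should be checked against the job description schema shipped with this release.

```python
import xml.etree.ElementTree as ET


def build_job_description(executable, arguments, local_user=None):
    """Return a job description as an XML string.

    ASSUMPTION: the element names (job, executable, argument,
    localUserId) are illustrative and not confirmed by these notes.
    """
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    if local_user is not None:
        # Assumed element: selects one of the local accounts that the
        # user's grid credential is mapped to.
        ET.SubElement(job, "localUserId").text = local_user
    return ET.tostring(job, encoding="unicode")


# "alice" is a hypothetical local account name.
job_xml = build_job_description("/bin/date", ["-u"], local_user="alice")
print(job_xml)
```

When no account is given, the sketch omits the element entirely, leaving the choice of account to the service.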

Other Supported Features

  • Remote job execution and management
  • Uniform and flexible interface to batch scheduling systems
  • File staging before and after job execution
  • File / directory clean up after job execution (after file stage out)
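
The staging and clean-up features above can be sketched as a job description that carries fileStageIn, fileStageOut, and fileCleanUp directives. This is a rough sketch only: the child element names (transfer, sourceUrl, destinationUrl, deletion, file) and the example.org URLs are assumptions for illustration and should be verified against the job description schema.

```python
import xml.etree.ElementTree as ET


def add_staging(job, stage_in, stage_out, cleanup):
    """Append staging directives to a <job> element and return it.

    stage_in / stage_out: lists of (source_url, destination_url) pairs.
    cleanup: list of file URLs to delete after stage-out.
    ASSUMPTION: the child names transfer, sourceUrl, destinationUrl,
    deletion and file are illustrative.
    """
    for tag, pairs in (("fileStageIn", stage_in), ("fileStageOut", stage_out)):
        if pairs:
            block = ET.SubElement(job, tag)
            for src, dst in pairs:
                transfer = ET.SubElement(block, "transfer")
                ET.SubElement(transfer, "sourceUrl").text = src
                ET.SubElement(transfer, "destinationUrl").text = dst
    if cleanup:
        block = ET.SubElement(job, "fileCleanUp")
        for url in cleanup:
            deletion = ET.SubElement(block, "deletion")
            ET.SubElement(deletion, "file").text = url
    return job


# Hypothetical hosts and paths, for illustration only.
job = ET.Element("job")
ET.SubElement(job, "executable").text = "/bin/cat"
add_staging(
    job,
    stage_in=[("gsiftp://data.example.org/in.txt", "file:///tmp/in.txt")],
    stage_out=[("file:///tmp/out.txt", "gsiftp://data.example.org/out.txt")],
    cleanup=["file:///tmp/in.txt"],
)
tags = [child.tag for child in job]
print(ET.tostring(job, encoding="unicode"))
```

The directives are appended in the order stage-in, stage-out, clean-up, mirroring the order in which they take effect around job execution.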

Deprecated Features

  • managed-job-globusrun has been replaced by globusrun-ws.
  • Service-managed data streaming of the job's stdout/stderr during execution
  • File staging using the GASS protocol
  • File caching of staged files, e.g. GASS Cache

3. Changes Summary

The following changes have occurred for GRAM4 since the last stable release, 4.0.5:

[summarize changes]

4. Bug Fixes

  • The following link lists all of the bugs resolved for GRAM4 since GT 4.0.5: Resolved Bugs

5. Known Problems

  • The following link lists all of the bugs and enhancements known at the time of the 4.1.3 release: Known Bugs
  • Recoverability for jobs which employ any staging directives (i.e. fileStageIn, fileStageOut, and fileCleanUp) is not working. Fixes have been committed to a branch and will be included with the 4.0.1 point release in a month or so.
  • Current NEW, ASSIGNED, and REOPENED Bugzilla entries for WS-GRAM.

    Special attention is due to bug 5048: if a deadlock occurs during container startup, remove all persisted data from the persistence database, or simply retry starting the container. The frequency of the deadlock varies between machines and environments; on some machines it might not occur at all. We hope to fix this soon.

6. Technology Dependencies

GRAM depends on the following GT components:

  • Java WS Core
  • Transport-Level Security
  • Delegation Service
  • RFT
  • GridFTP
  • MDS - internal libraries

Other scheduler adapters available for GT 4.1.3 release:

The XML::Parser Perl module is required for job description extension support.

7. Tested Platforms

Tested platforms for GRAM4:

  • Linux

    • Fedora Core 1 i686
    • Fedora Core 3 i686
    • Fedora Core 3 yup xeon
    • RedHat 7.3 i686
    • RedHat 9 x86
    • Debian Sarge x86
    • Debian 3.1 i686

Tested containers for GRAM4:

  • Java WS Core container
  • Tomcat 4.1.31

8. Backward Compatibility Summary

Protocol changes since GT version 4.0.5:

  • The protocol has been changed to be WSRF compliant. There is no backward compatibility between this version and any previous versions.

9. Associated Standards

GRAM4 does not currently have any associated standards.

10. For More Information

See GRAM4 for more information about this component.