GASS: File Access Function Reference

GASS File Access Overview

The file access API defines the globus_gass_open(), globus_gass_close(), globus_gass_fopen(), and globus_gass_fclose() calls. These are replacements for the Unix open() and close() calls and the C fopen() and fclose() calls. They handle the three file access patterns listed earlier. A file opened and closed with these calls can be read and written with ordinary Unix I/O calls. Hence, a program can be adapted to operate in a wide-area environment simply by using these calls in place of open(), close(), fopen(), and fclose().

Note that in the future we may use techniques such as those used in Condor or UFO to replace calls to open(), close(), etc. with the corresponding GASS calls, thereby avoiding the need for source code modifications altogether. However, such modifications are not planned for the short term.

A data structure called a "tag list" is associated with each cache entry and provides a sort of reference counting on that entry. When a computation opens a file, a tag containing the job's name is added to the tag list of the cache entry; when the computation closes the file, the tag is deleted, and if no tags remain for this cache entry then the cache file is also deleted. This approach allows a Globus Resource Allocation Manager (GRAM) to clean up after computations that terminate abnormally, by removing any extraneous files from the cache.
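The tag-list behaviour described above can be sketched in a few lines of C. This is only an illustration of the reference-counting idea, not the GASS cache implementation: the cache_entry type, the entry_* functions, the fixed MAX_TAGS limit, and the choice to store one tag per open (so the same job name may appear more than once) are all assumptions made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TAGS 16  /* arbitrary limit chosen for this sketch */

/* Hypothetical cache entry: a cached file plus the tags that reference it. */
typedef struct {
    const char *url;             /* remote file this entry caches */
    char       *tags[MAX_TAGS];  /* one tag per open, e.g. a job name */
    int         ntags;           /* number of tags currently held */
    int         deleted;         /* set once the cache file would be removed */
} cache_entry;

/* On open: record that `tag` is using this entry. Returns 0 on success. */
int entry_add_tag(cache_entry *e, const char *tag)
{
    if (e->ntags == MAX_TAGS)
        return -1;
    char *copy = malloc(strlen(tag) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, tag);
    e->tags[e->ntags++] = copy;
    return 0;
}

/* On close: drop one occurrence of `tag`. When the last tag goes,
 * mark the entry deleted (a stand-in for removing the cache file).
 * Returns 0 if a tag was removed, -1 if `tag` was not present. */
int entry_remove_tag(cache_entry *e, const char *tag)
{
    for (int i = 0; i < e->ntags; i++) {
        if (strcmp(e->tags[i], tag) == 0) {
            free(e->tags[i]);
            e->tags[i] = e->tags[--e->ntags];  /* fill the gap with the last tag */
            if (e->ntags == 0)
                e->deleted = 1;
            return 0;
        }
    }
    return -1;
}

/* Cleanup after an abnormal exit: remove every occurrence of `tag`. */
void entry_cleanup_tag(cache_entry *e, const char *tag)
{
    while (entry_remove_tag(e, tag) == 0) {
        /* keep removing until no occurrence is left */
    }
}
```

In this sketch, a resource manager that knows a job's tag can call entry_cleanup_tag() after the job dies, which releases whatever references the job left behind and lets the entry be deleted once nothing else holds it.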

GASS File Access API Limitations

The operations that can be performed on GASS-opened file descriptors are a subset of those available for general file descriptors.

GASS File Activation and Deactivation

 GASS uses standard Globus module activation and deactivation. Before any GASS functions are called, the following function must be called:

globus_module_activate(GLOBUS_GASS_FILE_MODULE);

This function returns GLOBUS_SUCCESS if GASS was successfully initialized, in which case GASS functions may subsequently be called. Otherwise, an error code is returned, and GASS functions should not be called. This function may be called multiple times.

To deactivate GASS, the following function must be called:

globus_module_deactivate(GLOBUS_GASS_FILE_MODULE);

This function should be called once for each time GASS was activated.

GASS File Access Functions

int
globus_gass_open(const char *file,
                 int oflag,
                 /* int mode */ ...)

Same arguments, return value, and semantics as open(), except that the file name argument may be a URL, in which case the following additional operations are performed depending on the oflag:

O_RDONLY

Look up the file in the local cache. If it does not exist, then copy the file from the remote server into the local cache. Add a tag to the cache entry's tag list. Open the local file for read-only access. When the file is closed, remove the tag from the cache entry's tag list, and delete the cache entry if the tag list is empty.

O_WRONLY | O_TRUNC

Look up the file in the local cache. If it exists, then truncate it. Otherwise, create an empty file in the local cache. Add a tag to the cache entry's tag list. Open the local file for write-only access. When the file is closed, copy it from the local cache to the remote server, remove the tag from the cache entry's tag list, and delete the cache entry if the tag list is empty.

O_WRONLY

Look up the file in the local cache. If it does not exist, then copy the file from the remote server into the local cache. Add a tag to the cache entry's tag list. Open the local file for write-only access. When the file is closed, copy it from the local cache to the remote server, remove the tag from the cache entry's tag list, and delete the cache entry if the tag list is empty.

O_RDWR

Look up the file in the local cache. If it does not exist, then copy the file from the remote server into the local cache. Add a tag to the cache entry's tag list. Open the local file for read-write access. When the file is closed, copy it from the local cache to the remote server, remove the tag from the cache entry's tag list, and delete the cache entry if the tag list is empty.

O_RDWR | O_TRUNC

Look up the file in the local cache. If it exists, then truncate it. Otherwise, create an empty file in the local cache. Add a tag to the cache entry's tag list. Open the local file for read-write access. When the file is closed, copy it from the local cache to the remote server, remove the tag from the cache entry's tag list, and delete the cache entry if the tag list is empty.

O_APPEND

O_WRONLY | O_APPEND

O_APPEND | O_TRUNC

O_WRONLY | O_TRUNC | O_APPEND

Create a socket connection to the server, so that data written to the file descriptor is immediately sent to the server and written to the file. When the file is closed, the socket is closed. If O_TRUNC is also used, the remote file is truncated before any data is written; otherwise, data is appended to the end of the file. Only x-gass URLs are allowed for these oflag values. lseek() will not work on files opened in this mode.

O_RDWR | O_APPEND

O_RDWR | O_APPEND | O_TRUNC

These combinations are not allowed on URLs.

 

The following oflag options are ignored for URLs:

O_CREAT, O_NDELAY, O_NONBLOCK, O_DSYNC, O_RSYNC, O_SYNC, O_NOCTTY, O_EXCL

All other oflag options and combinations will fail, returning a value less than 0.
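The oflag rules above can be summarized as a small classification function. The following is a sketch only, not code from GASS: the url_mode enum and classify_oflag() are hypothetical names, and the sketch reads the O_APPEND description as also covering the O_TRUNC variants that do not include O_RDWR. Flags that may be missing on some platforms are guarded with #ifdef.

```c
#include <fcntl.h>

/* Hypothetical labels for the behaviours described in this section. */
typedef enum {
    URL_MODE_INVALID,       /* combination rejected for URLs */
    URL_MODE_FETCH,         /* copy into the cache if absent; nothing copied back */
    URL_MODE_FETCH_STORE,   /* copy into the cache if absent; copy back on close */
    URL_MODE_EMPTY_STORE,   /* truncate or create an empty cache file; copy back on close */
    URL_MODE_APPEND_STREAM  /* socket to the server; no cache file involved */
} url_mode;

url_mode classify_oflag(int oflag)
{
    /* Flags the reference lists as ignored for URLs. */
    int ignored = O_CREAT | O_NONBLOCK | O_NOCTTY | O_EXCL;
#ifdef O_NDELAY
    ignored |= O_NDELAY;
#endif
#ifdef O_SYNC
    ignored |= O_SYNC;
#endif
#ifdef O_DSYNC
    ignored |= O_DSYNC;
#endif
#ifdef O_RSYNC
    ignored |= O_RSYNC;
#endif

    int flags  = oflag & ~ignored;
    int access = flags & O_ACCMODE;   /* O_RDONLY, O_WRONLY or O_RDWR */
    int extra  = flags & ~O_ACCMODE;  /* only O_TRUNC and O_APPEND are expected */

    if (extra & ~(O_TRUNC | O_APPEND))
        return URL_MODE_INVALID;      /* some other, unsupported flag */

    if (extra & O_APPEND)             /* append mode, with or without O_TRUNC */
        return (access == O_RDWR) ? URL_MODE_INVALID : URL_MODE_APPEND_STREAM;

    if (access == O_RDONLY)           /* O_RDONLY | O_TRUNC is not a listed combination */
        return (extra & O_TRUNC) ? URL_MODE_INVALID : URL_MODE_FETCH;

    if (access == O_WRONLY || access == O_RDWR)
        return (extra & O_TRUNC) ? URL_MODE_EMPTY_STORE : URL_MODE_FETCH_STORE;

    return URL_MODE_INVALID;
}
```

A table-driven version would work equally well; the branch form is used here because it mirrors the order in which the combinations are described.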

If the file is a URL, the mode argument is ignored. The permissions of the remote file depend on the server.

If the file is a URL and the O_WRONLY or O_RDWR oflag is used, then the remote file pointed to by the URL is created if it does not already exist, regardless of whether the O_CREAT oflag is used.

The call returns only after remote data has been copied into the local cache, or an empty local file has been created, depending on the oflag value.

This call adds the value contained in the GRAM_JOB_CONTACT environment variable to the cache entry's tag list.

Only x-gass URLs (not ftp) can be used with an O_APPEND oflag.

If the file is a URL, changes to the file are only flushed to the remote server when globus_gass_close() is called. If the process terminates before calling globus_gass_close(), the modified local cache file will remain in the cache until globus_gass_cache_cleanup_tag() is called with the same tag as was used by globus_gass_open() (e.g. by the GRAM job manager).

Note: Currently only x-gass and ftp URLs are supported.


int
globus_gass_close(int fd)

Same arguments, return value, and semantics as close(), except that for files opened using a URL, the close-time actions described above under globus_gass_open() are performed.

The call returns only when any remote copy has completed.

This call deletes the value contained in the GRAM_JOB_CONTACT environment variable from the cache entry's tag list, and deletes the file if this cache entry's tag list is then empty.

 


FILE *
globus_gass_fopen(const char *file,
                  const char *type)

Same arguments, return value, and semantics as fopen(), but extends fopen() as globus_gass_open() extends open(). The type argument maps to globus_gass_open() oflag values as follows:

"r", "rb" -> O_RDONLY

"w", "wb" -> O_WRONLY | O_CREAT | O_TRUNC

"a", "ab" -> O_WRONLY | O_APPEND | O_CREAT

"r+", "r+b", "rb+" -> O_RDWR

"w+", "w+b", "wb+" -> O_RDWR | O_TRUNC | O_CREAT

"a+", "a+b", "ab+" -> Not allowed on URLs.

All other type values cause this function to fail when file is a URL.
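The type-to-oflag mapping above lends itself to a lookup table. The sketch below is illustrative and not taken from the GASS sources: type_to_oflag() is a hypothetical helper, and returning -1 for types that are not allowed on URLs is a convention chosen for this example.

```c
#include <fcntl.h>
#include <string.h>

/* Returns the oflag value for an fopen()-style type string, or -1 when
 * the type is not allowed on URLs (the "a+" family and anything unlisted). */
int type_to_oflag(const char *type)
{
    static const struct {
        const char *type;
        int         oflag;
    } map[] = {
        { "r",   O_RDONLY },
        { "rb",  O_RDONLY },
        { "w",   O_WRONLY | O_CREAT | O_TRUNC },
        { "wb",  O_WRONLY | O_CREAT | O_TRUNC },
        { "a",   O_WRONLY | O_APPEND | O_CREAT },
        { "ab",  O_WRONLY | O_APPEND | O_CREAT },
        { "r+",  O_RDWR },
        { "r+b", O_RDWR },
        { "rb+", O_RDWR },
        { "w+",  O_RDWR | O_TRUNC | O_CREAT },
        { "w+b", O_RDWR | O_TRUNC | O_CREAT },
        { "wb+", O_RDWR | O_TRUNC | O_CREAT },
    };

    for (size_t i = 0; i < sizeof map / sizeof map[0]; i++) {
        if (strcmp(type, map[i].type) == 0)
            return map[i].oflag;
    }
    return -1;  /* "a+", "a+b", "ab+", or an unrecognized type */
}
```

Because O_RDONLY is 0 on common platforms, the sketch signals failure with -1 rather than 0.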

 


int
globus_gass_fclose(FILE *fp)

Same arguments, return value, and semantics as fclose(), but extends fclose() as globus_gass_close() extends close().