Important: This feature is only available starting with GT 4.0.5.
WS-GRAM includes mechanisms to provide access to audit and accounting information associated with jobs that WS-GRAM submits to a local resource manager (LRM) such as PBS, LSF, or Condor.
Note: Remember, GRAM is not a local resource manager but rather a protocol engine for communicating with a range of different local resource managers using a standard message format.
In some scenarios, it is desirable to get general information about the usage of the underlying LRM, such as:
What kinds of jobs were submitted via GRAM?
How long did the processing of a job take?
How many jobs were submitted by user X?
The following three use cases give a better overview of the meaning and purpose of auditing and accounting:
Group Access. A grid resource provider allows a remote service (e.g., a gateway or portal) to submit jobs on behalf of multiple users. The grid resource provider only obtains information about the identity of the remote submitting service and thus does not know the identity of the users for which the grid jobs are submitted. This group access is allowed under the condition that the remote service stores audit information so that, if and when needed, the grid resource provider can request and obtain information to track a specific job back to an individual user.
Query Job Accounting. A client that submits a job needs to be able to obtain, after the job has completed, information about the resources consumed by that job. In portal and gateway environments where many users submit many jobs against a single allocation, this per-job accounting information is needed soon after the job completes so that client-side accounting can be updated. Accounting information is sensitive and thus should only be released to authorized parties.
Auditing. In a distributed multi-site environment, it can be necessary to investigate various forms of suspected intrusion and abuse. In such cases, we may need to access an audit trail of the actions performed by a service. When accessing this audit trail, it will frequently be important to be able to relate specific actions to the user.
WS-GRAM writes audit information at three points in a job's lifecycle: when processing starts, when the job is submitted to the local resource manager, and when the job is fully processed or fails.
While audit and accounting records may be generated and stored by different entities in different contexts, we make the following assumptions in this chapter:
| | Audit Records | Accounting Records |
|---|---|---|
| Generated by: | GRAM service | LRM to which the GRAM service submits jobs |
| Stored in: | Database, indexed by GJID | LRM, indexed by JID |
| Data that is stored: | See list below. | May include all information about the duration and resource-usage of a job |
The audit record of each job contains the following data:
job_grid_id: String representation of the resource EPR
local_job_id: Job/process id generated by the scheduler
subject_name: Distinguished name (DN) of the user
username: Local username
idempotence_id: Job id generated on the client-side
creation_time: Date when the job resource is created
queued_time: Date when the job is submitted to the scheduler
stage_in_grid_id: String representation of the stageIn-EPR (RFT)
stage_out_grid_id: String representation of the stageOut-EPR (RFT)
clean_up_grid_id: String representation of the cleanUp-EPR (RFT)
globus_toolkit_version: Version of the server-side GT
resource_manager_type: Type of the resource manager (Fork, Condor, ...)
job_description: Complete job description document
success_flag: Flag that shows whether the job failed or finished successfully
finished_flag: Flag that shows whether the job is already fully processed or still in progress
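As a rough illustration of how these fields can answer usage questions such as "How many jobs were submitted by user X?", the following sketch loads a few made-up records into an in-memory SQLite table and queries them. The table name `gram_audit_table` and the column names come from this chapter; the column types, the subset of columns, the sample rows, and the helper functions are assumptions made for the example and do not reproduce the shipped schema.

```python
import sqlite3

# In-memory stand-in for the audit database. Only a subset of the audit
# record fields is modelled, and the column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE gram_audit_table (
        job_grid_id TEXT PRIMARY KEY,
        local_job_id TEXT,
        subject_name TEXT,
        username TEXT,
        resource_manager_type TEXT,
        success_flag INTEGER,
        finished_flag INTEGER
    )
    """
)

# Made-up sample records, for illustration only.
conn.executemany(
    "INSERT INTO gram_audit_table VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("gjid-1", "4711", "/O=Grid/CN=Alice", "alice", "PBS", 1, 1),
        ("gjid-2", "4712", "/O=Grid/CN=Alice", "alice", "Condor", 0, 1),
        ("gjid-3", "4713", "/O=Grid/CN=Bob", "bob", "PBS", 1, 0),
    ],
)


def jobs_submitted_by(connection, username):
    """Count the audit records that belong to one local username."""
    row = connection.execute(
        "SELECT COUNT(*) FROM gram_audit_table WHERE username = ?",
        (username,),
    ).fetchone()
    return row[0]


def jobs_by_resource_manager(connection):
    """Return a mapping of resource manager type to number of jobs."""
    rows = connection.execute(
        "SELECT resource_manager_type, COUNT(*) FROM gram_audit_table "
        "GROUP BY resource_manager_type ORDER BY resource_manager_type"
    )
    return dict(rows.fetchall())


print(jobs_submitted_by(conn, "alice"))
print(jobs_by_resource_manager(conn))
```

The same queries should carry over to MySQL or PostgreSQL with the real schema, apart from the parameter placeholder style of the database driver in use.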
The WS-GRAM service returns an EPR that is used to control the job. However, the EPR is an XML document and cannot effectively be used as a primary key for a database table. Therefore, the job's EPR needs to be converted to a string in an acceptable grid job ID (GJID) format.
Beginning with GT 4.0.5, the utility class EPRUtil.java is available for this conversion. Both the GRAM service, before storing the audit record, and the GRAM client, before getting audit information from the audit database, can use it.
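To make the idea of the conversion concrete, here is a minimal sketch that flattens an EPR document into a single string usable as a database key. It is not the EPRUtil implementation, and the resulting format (service address, `?`, resource key) is an assumption chosen for illustration. The element names `Address` and `ResourceID`, the namespace URI, and the sample EPR are likewise assumptions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def epr_to_key(epr_xml):
    """Flatten an EPR into 'address?resourcekey' (hypothetical key format)."""
    root = ET.fromstring(epr_xml)
    address = None
    resource_key = None
    for element in root.iter():
        name = local_name(element.tag)
        text = (element.text or "").strip()
        if name == "Address" and address is None:
            address = text
        elif name == "ResourceID" and resource_key is None:
            resource_key = text
    if not address or not resource_key:
        raise ValueError("EPR lacks an Address or a ResourceID element")
    return f"{address}?{resource_key}"


# Made-up EPR; the namespace URI is a placeholder, not the real one.
SAMPLE_EPR = """
<wsa:EndpointReference xmlns:wsa="http://example.org/addressing">
  <wsa:Address>https://host.example.org:8443/wsrf/services/ManagedExecutableJobService</wsa:Address>
  <wsa:ReferenceProperties>
    <ResourceID>1234-abcd</ResourceID>
  </wsa:ReferenceProperties>
</wsa:EndpointReference>
"""

print(epr_to_key(SAMPLE_EPR))
```

Whatever the real format is, the important property is that the service and the client derive the same string from the same EPR, so that the client can look up the record the service stored.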
To connect the two sets of records, both audit and accounting, we require that GRAM record the local JID in each audit record that it generates. It is then straightforward for an audit service to respond to requests such as "Give me the charge of the job with GJID x" by:
first selecting matching record(s) from the audit table,
then using the local JID(s) to join to the accounting table of the LRM and access relevant accounting record(s).
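The two-step lookup can be sketched as follows, using one in-memory SQLite database to stand in for both stores. The audit columns `job_grid_id` and `local_job_id` are taken from the audit record described in this chapter. The accounting table `lrm_accounting`, its columns, and all sample values are hypothetical, since the layout of accounting records depends on the LRM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Audit side: maps the grid job id (GJID) to the local job id (JID).
conn.execute(
    "CREATE TABLE gram_audit_table (job_grid_id TEXT, local_job_id TEXT)"
)
# Accounting side: hypothetical table kept by the LRM, indexed by JID.
conn.execute("CREATE TABLE lrm_accounting (jid TEXT, cpu_seconds INTEGER)")

conn.executemany(
    "INSERT INTO gram_audit_table VALUES (?, ?)",
    [("gjid-1", "4711"), ("gjid-2", "4712")],
)
conn.executemany(
    "INSERT INTO lrm_accounting VALUES (?, ?)",
    [("4711", 120), ("4712", 30)],
)


def charge_for_job(connection, gjid):
    """Return (jid, cpu_seconds) rows for the job with the given GJID."""
    # Step 1: select the matching record(s) from the audit table.
    local_ids = [
        row[0]
        for row in connection.execute(
            "SELECT local_job_id FROM gram_audit_table WHERE job_grid_id = ?",
            (gjid,),
        )
    ]
    # Step 2: use the local JID(s) to read the accounting record(s).
    charges = []
    for local_id in local_ids:
        charges.extend(
            connection.execute(
                "SELECT jid, cpu_seconds FROM lrm_accounting WHERE jid = ?",
                (local_id,),
            ).fetchall()
        )
    return charges


print(charge_for_job(conn, "gjid-1"))
```

The two steps are kept as separate queries rather than a single SQL join because, in practice, the audit table and the LRM accounting data usually live in different databases.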
We propose a Web Service interface for accessing audit and accounting information. OGSA-DAI is a WSRF service that can create a single virtual database from two or more remote databases. In the future, other per-job information such as job performance data could be stored using the GJID or local JID as an index, and then made available in the same virtual database.
The rest of this chapter focuses on how to configure WS-GRAM to enable Audit-Logging. A case study for TeraGrid can be read here, which also includes more information about how to use this data to get accounting information of a job, query the audit database for information via a Web Services interface, etc.
Configuration depends on the version of WS-GRAM being used. Regardless of the version, audit records are stored in a database, and audit logging is disabled by default.
Audit records are stored in a database which must be set up once.
The following describes how to set up the audit database in MySQL:
Create a database inside of MySQL
Grant the necessary privileges to the account that will be used to upload the audit records into the audit database, typically the "globus" account.
Use the schema to create the table
host:~ feller$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 16
Server version: 5.0.37 MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create database auditDatabase;
Query OK, 1 row affected (0.09 sec)
mysql> GRANT ALL ON auditDatabase.* to globus@localhost identified by "foo";
Query OK, 0 rows affected (0.32 sec)
mysql> exit
Bye
host:~ feller$ mysql -u globus -p auditDatabase < ${GLOBUS_LOCATION}/share/gram-service/gram_audit_schema_mysql.sql
Enter password:
host:~ feller$
The following describes how to set up the audit database in PostgreSQL:
Create a database inside of PostgreSQL
Grant the necessary privileges to the account that will be used to upload the audit records into the audit database, typically the "globus" account.
Use the schema to create the table:
# Connect as postgres admin
create database auditDatabase\g
create user gt4auditload with encrypted password '<password1>'\g
create user gt4auditview with encrypted password '<password2>'\g
\c auditDatabase
\i gram_audit_schema_postgres-8.0.sql
grant insert on gram_audit_table to gt4auditload\g
grant select on gram_audit_table to gt4auditview\g
\q
You must also update pg_hba.conf to allow connections from the container host (pg_hba.conf configures client authentication and is stored in the database cluster's data directory):
hostssl auditDatabase gt4auditload <containerhostip> 255.255.255.255 md5
host    auditDatabase gt4auditload <containerhostip> 255.255.255.255 md5
hostssl auditDatabase gt4auditview <containerhostip> 255.255.255.255 md5
host    auditDatabase gt4auditview <containerhostip> 255.255.255.255 md5
To turn on Audit Logging, follow these steps:
Add the following lines to the Log4j configuration in $GLOBUS_LOCATION/container-log4j.properties to enable audit logging:
# GRAM AUDIT
log4j.category.org.globus.exec.service.exec.StateMachine.audit=DEBUG, AUDIT
log4j.appender.AUDIT=org.globus.exec.utils.audit.AuditDatabaseAppender
log4j.appender.AUDIT.layout=org.apache.log4j.PatternLayout
log4j.additivity.org.globus.exec.service.exec.StateMachine.audit=false
In $GLOBUS_LOCATION/etc/gram-service/jndi-config.xml, add or modify the configuration of the database in which the audit records are stored. The following shows an example with MySQL as the RDBMS:
<resource name="auditDatabaseConfiguration" type="org.globus.exec.service.utils.AuditDatabaseConfiguration">
<resourceParams>
<parameter>
<name>factory</name>
<value>org.globus.wsrf.jndi.BeanFactory</value>
</parameter>
<parameter>
<name>driverClass</name>
<value>com.mysql.jdbc.Driver</value>
</parameter>
<parameter>
<name>url</name>
<value>jdbc:mysql://HOST[:PORT]/auditDatabase</value>
</parameter>
<parameter>
<name>user</name>
<value>globus</value>
</parameter>
<parameter>
<name>password</name>
<value>foo</value>
</parameter>
<parameter>
<name>globusVersion</name>
<value>4.0.5</value>
</parameter>
</resourceParams>
</resource>
Audit logging is configured entirely in WS-GRAM's JNDI configuration in ${GLOBUS_LOCATION}/etc/gram-service/jndi-config.xml, using two sections: a general configuration section and a database section:
<resource name="auditConfiguration" type="org.globus.exec.service.exec.utils.audit.AuditConfiguration">
<resourceParams>
....
<parameter>
<name>enableAuditLogging</name>
<value>false</value>
</parameter>
<parameter>
<name>globusVersion</name>
<value>4.0.9</value>
</parameter>
<parameter>
<name>fallbackStorageDirectory</name>
<value>/opt/gt409/share/gram-service/</value>
</parameter>
<parameter>
<name>dbUploadRetryInterval</name>
<value>300</value>
</parameter>
</resourceParams>
</resource>
| Parameter | Explanation |
|---|---|
| enableAuditLogging | true to turn audit logging on, false to turn it off |
| globusVersion | Version of the Globus Toolkit; does not have to be edited |
| fallbackStorageDirectory | If the insert or the update of a record in the database fails because the database is down or misconfigured, the record is stored as a file in the directory specified by this parameter. This ensures that no records are lost. WS-GRAM periodically attempts to upload the records stored in this directory into the database. Once a fallback record has been uploaded successfully, its file is deleted. This directory must be readable and writable by the account that runs the container. |
| dbUploadRetryInterval | Interval in seconds between attempts to upload fallback records into the database. |
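The interplay of fallbackStorageDirectory and dbUploadRetryInterval can be pictured with the following sketch. It is not WS-GRAM's implementation: the function names, the `.audit` file suffix, the string record format, and the way the database insert is passed in as a callable are all assumptions made for illustration.

```python
import os
import tempfile
import uuid


def store_record(record, insert_into_db, fallback_dir):
    """Try the database first; if that fails, keep the record as a file."""
    try:
        insert_into_db(record)
        return True
    except Exception:
        path = os.path.join(fallback_dir, f"{uuid.uuid4().hex}.audit")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(record)
        return False


def upload_fallback_records(insert_into_db, fallback_dir):
    """One retry pass: upload stored records, deleting files that succeed."""
    uploaded = 0
    for name in sorted(os.listdir(fallback_dir)):
        if not name.endswith(".audit"):
            continue
        path = os.path.join(fallback_dir, name)
        with open(path, encoding="utf-8") as handle:
            record = handle.read()
        try:
            insert_into_db(record)
        except Exception:
            continue  # leave the file in place for the next pass
        os.remove(path)
        uploaded += 1
    return uploaded


# Demo: the database is down for the first write, then reachable again.
with tempfile.TemporaryDirectory() as fallback_dir:
    uploaded_records = []

    def database_down(record):
        raise ConnectionError("database unavailable")

    store_record("job-1 audit record", database_down, fallback_dir)
    upload_fallback_records(uploaded_records.append, fallback_dir)
    remaining_files = os.listdir(fallback_dir)
```

In a long-running service, a pass like `upload_fallback_records` would be scheduled once every dbUploadRetryInterval seconds, for example from a timer thread.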
<resource name="auditDatabase" type="javax.sql.DataSource">
<resourceParams>
....
<parameter>
<name>driverClassName</name>
<value>driver class name</value>
</parameter>
<parameter>
<name>url</name>
<value>db connection url</value>
</parameter>
<parameter>
<name>username</name>
<value>user to access the database</value>
</parameter>
<parameter>
<name>password</name>
<value>password to access the database</value>
</parameter>
</resourceParams>
</resource>
We support 2 database systems: MySQL and PostgreSQL. The following table gives an overview of which values must be used for the parameters url and driverClassName in the above JNDI configuration for each database system. Derby is configured as the default DB system.
| DB system | driverClassName | url |
|---|---|---|
| MySQL | com.mysql.jdbc.Driver | jdbc:mysql://HOST[:PORT]/auditDatabase |
| PostgreSQL | org.postgresql.Driver | jdbc:postgresql://HOST[:PORT]/auditDatabase |
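For sites that generate their configuration with a script, the table can be expressed as a small lookup. The sketch below only restates the driver class names and URL patterns listed above; the function name and the example host are hypothetical.

```python
# Driver class and URL pattern per database system, as listed in the table.
JDBC_SETTINGS = {
    "MySQL": ("com.mysql.jdbc.Driver", "jdbc:mysql://{host}{port}/{database}"),
    "PostgreSQL": (
        "org.postgresql.Driver",
        "jdbc:postgresql://{host}{port}/{database}",
    ),
}


def jdbc_parameters(db_system, host, port=None, database="auditDatabase"):
    """Return the driverClassName and url values for the JNDI configuration."""
    driver, template = JDBC_SETTINGS[db_system]
    # The port is optional, matching the [:PORT] notation in the table.
    port_part = f":{port}" if port is not None else ""
    url = template.format(host=host, port=port_part, database=database)
    return {"driverClassName": driver, "url": url}


print(jdbc_parameters("PostgreSQL", "db.example.org", 5432))
```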