Debugging

We'll begin with debugging the underlying Java WS Core and then discuss debugging in GridWay in particular. For information about system administrator logging, see Debugging in the Admin Guide.

1. Debugging in Java WS Core

As GridWay relies on Globus services, it is assumed that a Globus grid infrastructure has been installed and configured. Failures related to Globus services (e.g. GRAM or MDS) can be debugged as described in Debugging.

1.1. Development Logging in Java WS Core

The following information applies to Java WS Core and those services built on it.

Logging in the Java WS Core is based on the Jakarta Commons Logging API. Commons Logging provides a consistent interface for instrumenting source code while allowing the user to plug in a different logging implementation. Currently we use Log4j as the logging implementation. Log4j is configured through a separate configuration file. Please see the Log4j documentation for details on the configuration file format.

1.1.1. Configuring server side developer logs

Server side logging can be configured in $GLOBUS_LOCATION/container-log4j.properties when the container runs as a standalone container. For Tomcat-level logging, refer to Logging for Tomcat. The appender log4j.appender.A1 is used for developer logging and by default writes its output to standard output. By default, it is configured so that all warnings from the Globus Toolkit packages are displayed.

Additional logging can be enabled for a package by adding a new line to the configuration file. Example:

   # for debug level logging from org.globus.package.name.FooClass
   log4j.category.org.globus.package.name.FooClass=DEBUG
   #for warnings from org.some.warn.package
   log4j.category.org.some.warn.package=WARN
   

1.1.2. Configuring client side developer logs

Client side logging can be configured in $GLOBUS_LOCATION/log4j.properties. The appender log4j.appender.A1 is used for developer logging and by default writes its output to standard output. By default, it is configured so that all warnings from the Globus Toolkit packages are displayed.
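
When log levels are changed often during a debugging session, editing the properties files by hand can be replaced by a small script. The following is only a sketch: set_log4j_level is a hypothetical helper name, not part of the Globus Toolkit, and it assumes that each category is configured on a single line of the form log4j.category.<name>=<LEVEL>, as in the server side example.

```shell
#!/bin/sh
# Sketch only: set_log4j_level is a hypothetical helper, not a Globus tool.
# It sets the level of one log4j category in a properties file, replacing
# any existing line for that exact category.
set_log4j_level() {
    props=$1      # e.g. $GLOBUS_LOCATION/container-log4j.properties
    category=$2   # e.g. org.globus.package.name.FooClass
    level=$3      # e.g. DEBUG or WARN

    if [ ! -f "$props" ]; then
        echo "no such file: $props" >&2
        return 1
    fi

    key="log4j.category.$category"
    new_props="$props.new.$$"

    # Copy every line except an existing setting for this category
    # (fixed-string match on "key="), then append the new setting.
    grep -v -F "$key=" "$props" > "$new_props"
    echo "$key=$level" >> "$new_props"
    mv "$new_props" "$props"
}
```

For example, set_log4j_level "$GLOBUS_LOCATION/log4j.properties" org.globus DEBUG would raise client side logging for org.globus to debug level. A change to the server side file presumably only takes effect once the container is restarted.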

2. Debugging in GridWay

Because of GridWay's architecture, and in particular its MAD (Middleware Access Driver) components, debugging is not a trivial task. The most direct way to see what is going on is to watch the GridWay log files. These are the files to check in case of trouble:

  • $GW_LOCATION/var/gwd.log: This is the general log file, where the gwd daemon logs whatever the Resource, Dispatch, Transfer, Execution and Information managers report to it.

  • $GW_LOCATION/var/sched.log: The scheduler is a separate process that communicates with the daemon using the standard input/output. It writes log information to this file.

  • $GW_LOCATION/var/<job_id>/job.log: Each job has its own log file, with information about its context (input/output files, MADs, resource) and its life cycle. The same directory also holds the job.template, the job.env file with the environment variables, and the standard output and error of the wrapper script (stdout.wrapper and stderr.wrapper).
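
To look at all of these files at once for a given job, a small shell function can print the tail of each one. This is a minimal sketch: gw_show_logs is a hypothetical helper name, not a GridWay command, and the default of 20 lines is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: gw_show_logs is a hypothetical helper, not a GridWay command.
# It prints the last lines of the daemon, scheduler and per-job logs.
gw_show_logs() {
    gw_location=$1    # value of $GW_LOCATION
    job_id=$2         # id of the job to inspect
    lines=${3:-20}    # number of trailing lines to show per file

    for f in "$gw_location/var/gwd.log" \
             "$gw_location/var/sched.log" \
             "$gw_location/var/$job_id/job.log" \
             "$gw_location/var/$job_id/stderr.wrapper"; do
        if [ -r "$f" ]; then
            echo "==> $f <=="
            tail -n "$lines" "$f"
        else
            # Report missing files instead of failing, since a job may not
            # have produced all of them yet.
            echo "==> $f (missing) <=="
        fi
    done
}
```

For example, gw_show_logs "$GW_LOCATION" 12 50 would show the last 50 lines of each file for job 12. To follow the daemon log while a job runs, tail -f "$GW_LOCATION/var/gwd.log" works as well.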

In order to get the maximum amount of debug information in the gwd.log file (especially more information about what the MADs are doing), you should compile GridWay with the following configure option:

./configure --enable-debug

If a problem in GridWay makes a MAD crash, a coredump is useful. To tell the MADs to write a coredump file when they crash, set the following environment variable before you execute your first job:

export MADDEBUG=yes

Sometimes it is the daemon (the gwd process) that crashes. In order to obtain a coredump of the daemon, run the following command before executing the daemon:

ulimit -c unlimited

The coredump file will be written to the $GW_LOCATION/var directory, with a filename that includes the PID of the crashed process, that is,

$GW_LOCATION/var/core.<process_pid>
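
The two settings above can be combined in a small wrapper, and the resulting core files can be listed afterwards. The following is a sketch under stated assumptions: gw_run_with_cores and gw_list_cores are hypothetical helper names, and the wrapper assumes that the MADs inherit the environment of the process that starts them.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not GridWay commands.

# Run a command (for example the gwd daemon) with core dumps enabled and
# MADDEBUG set. The subshell keeps both changes out of the calling shell.
gw_run_with_cores() {
    (
        # Raising the limit can fail if the hard limit forbids core files;
        # in that case warn and run the command anyway.
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core file size limit" >&2
        MADDEBUG=yes
        export MADDEBUG
        "$@"
    )
}

# Print every core file found in <gw_location>/var, one path per line.
gw_list_cores() {
    gw_location=$1
    for core in "$gw_location"/var/core.*; do
        # An unmatched glob is left as a literal pattern; skip it.
        [ -e "$core" ] || continue
        echo "$core"
    done
}
```

With these definitions, gw_run_with_cores gwd would start the daemon with both settings in place, and gw_list_cores "$GW_LOCATION" would show any coredumps written so far. A core file can then be opened in a debugger such as gdb, which takes the crashed executable and the core file as its arguments.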

If you cannot figure out what is wrong, you can always use the mailing list gridway-user to get support. Please provide a detailed explanation of your problem so the community can reproduce it and give advice. Also, send along:

  • $GW_LOCATION/var/gwd.log

  • $GW_LOCATION/var/sched.log

  • $GW_LOCATION/var/<job_id>/{job.log,stderr.wrapper}: If relevant. The stderr.wrapper file is especially useful for debugging; it shows the wrapper script being executed step by step.
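
Collecting these files into a single archive makes them easier to attach to a message. The following is a minimal sketch: gw_bundle_logs and the default archive name gridway-debug.tar.gz are hypothetical, and it assumes that job ids contain no whitespace.

```shell
#!/bin/sh
# Sketch only: gw_bundle_logs is a hypothetical helper, not a GridWay command.
# It packs the daemon, scheduler and per-job logs that exist into one archive.
gw_bundle_logs() {
    gw_location=$1                       # value of $GW_LOCATION
    job_id=$2                            # id of the affected job
    archive=${3:-gridway-debug.tar.gz}   # output file (hypothetical default)

    files=""
    for rel in var/gwd.log var/sched.log \
               "var/$job_id/job.log" "var/$job_id/stderr.wrapper"; do
        # Only include files that exist and are readable.
        [ -r "$gw_location/$rel" ] && files="$files $rel"
    done

    if [ -z "$files" ]; then
        echo "no log files found under $gw_location/var" >&2
        return 1
    fi

    # $files is deliberately unquoted so it splits into separate paths;
    # -C makes the archive members relative to $GW_LOCATION.
    tar -czf "$archive" -C "$gw_location" $files
}
```

For example, gw_bundle_logs "$GW_LOCATION" 12 /tmp/gridway-debug.tar.gz would pack the logs for job 12. Review the contents before sending them, since job logs may contain host names or file paths you would rather not publish on a mailing list.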