GT 3.9.4 Component Guide to Public Interfaces: WS MDS Aggregator

Semantics and syntax of APIs

Programming Model Overview

The aggregator module consists of an Aggregating ServiceGroup framework which supports plugins as detailed below, as well as a number of standard plugins.

The Aggregating ServiceGroup framework

The aggregating servicegroup framework is designed to facilitate the collecting of information from or about WS-Resources (via plugin aggregator sources) and the feeding of that information to plugin aggregator sinks.

The framework provides for over-the-wire management of the list of registered resources (through a WS-ServiceGroup interface) and a Java API for connecting sources and sinks together.

In general (although this is not a hard requirement), aggregator sinks will be tied into a specific service implementation, whilst aggregator sources are more independent. For example, the trigger and archive services act as sinks.

The standard plugins

A number of standard aggregator sources are provided, which implement the aggregator source API. These provide for collecting information from/about a WS-Resource by:

  • WS-ResourceProperties poll operations
  • WS-Notification subscription
  • Execution of arbitrary executables

Component API

There are two main Java interfaces in the aggregator.

  • AggregatorSink - which is implemented by sinks that can receive data from the aggregator framework.
  • AggregatorSource - which is implemented by sources that can feed data into the aggregator framework.

Semantics and syntax of the WSDL

Protocol overview

The aggregator builds on the WS-ServiceGroup and WS-ResourceLifetime specifications. Those specifications should be consulted for details on the syntax of each operation.

Each aggregator is represented as a WS-ServiceGroup (an AggregatorServiceGroup).

Resources may be registered to an AggregatorServiceGroup using the AggregatorServiceGroup Add operation. Each registration will be represented as a ServiceGroupEntry resource (specifically, an AggregatorServiceGroupEntry resource). When a registration is made, the appropriate aggregation source and sink will be informed, and aggregated data from the registered resource will begin to flow via the aggregation source into the aggregation sink. The method of collection by the source and of processing by the sink depends on the particular instantiation of the aggregator (see the per-source documentation for source information and the per-service documentation for sink information).

Operations

Each AggregatorServiceGroup exposes an Add operation. This is used to register a specified resource with the aggregator. In addition to the requirements made by the WS-ServiceGroup specification, the Content element of each registration must be an AggregatorContent type, with the AggregatorConfig element containing configuration information specific to each source and sink (documented elsewhere).

Each AggregatorServiceGroupEntry has a setTerminationTime operation, which can be used to set the termination time of the registration, as detailed in WS-ResourceLifetime.

Resource properties

Each AggregatorServiceGroup exposes an Entry resource property which publishes details of each registered resource, including an EPR to the resource, the aggregator configuration information, and data from the sink.

In addition, each AggregatorServiceGroup publishes registration load information (the total number of registrations since service startup and decaying averages) in a RegistrationCount resource property.

Faults

[list and briefly describe each fault]

Schema Definition

Other relevant source files are the:

  • WSRF service group schema
  • WSRF resource lifetime schema
  • MDS Usefulrp schema.

Command-line tools

Command-line tool mds-servicegroup-add for WS MDS Aggregator

Tool description

mds-servicegroup-add creates a set of registrations to a WS-ServiceGroup and periodically renews those registrations. It is intended primarily for registering grid resources to MDS services such as the index and trigger services.

Registrations are defined in an XML configuration file, which is documented in the aggregator admin guide.

Command syntax

   mds-servicegroup-add -s http://foo [options] config.xml

Options and Arguments

  -s http://foo   This dummy option is required but ignored. All end point references used by mds-servicegroup-add are read from the configuration file, not the command line.
  -a              By default, mds-servicegroup-add will attempt to make an authenticated connection to each service group. This option is used to specify anonymous connections (and to prevent mds-servicegroup-add from failing if you don't have a valid Grid credential).
  config.xml      The configuration file.

The other standard toolkit options [TODO -- link to standard options page] are also supported.
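
The command syntax above can be assembled programmatically, for example when a wrapper script starts the tool for several configuration files. The following is a minimal sketch in Python, not part of the toolkit; the function names build_add_command and run_add are hypothetical, and it assumes mds-servicegroup-add is on the PATH.

```python
import subprocess


def build_add_command(config_path, anonymous=False):
    """Return the argument list for one mds-servicegroup-add invocation."""
    # -s is required by the tool but its value is ignored, so the
    # placeholder URL from the command syntax is reused here.
    command = ["mds-servicegroup-add", "-s", "http://foo"]
    if anonymous:
        # -a requests anonymous connections instead of authenticated ones.
        command.append("-a")
    command.append(config_path)
    return command


def run_add(config_path, anonymous=False):
    """Run the tool and return its exit status.

    This call blocks for as long as the tool runs.
    """
    return subprocess.call(build_add_command(config_path, anonymous))
```

For example, build_add_command("config.xml", anonymous=True) yields the argument list for an anonymous registration run.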

Limitations

The tool must keep running for the registrations that it maintains to stay alive; if it stops, those registrations will time out.

Overview of Graphical User Interface

There is no GUI specifically for the aggregator. The release contains a tech preview of WebMDS which can be used to display monitoring information in a web browser. Specifically, it can be directed at services based on the aggregator to display information about resources registered to the aggregator.

Semantics and syntax of domain-specific interface

Execution aggregation source interface introduction

The execution aggregation source provides a way to aggregate data about a registered resource using an arbitrary local executable. The executable will be passed registration information and is expected to output the gathered data, as detailed below.

A basic example of the use of this API is described in the ping test example for the aggregator execution source.

Syntax of the execution aggregation source interface

The execution aggregation source will periodically execute an identified executable. The identity of the executable and the frequency with which it is to run are specified in the registration message.

Name of executable

The executable to run will be $GLOBUS_LOCATION/libexec/aggrexec/<scriptname> with scriptname supplied in the registration message.

Input to executable

Information about the registration will be supplied as a command-line parameter and on stdin.

A single command-line parameter will be supplied to the executable. This will be the URL from the EPR of the registered service.

Two XML documents will be sent to stdin, in sequence. The first document will be the full EPR to the registered service. The second document will be the AggregatorConfig block from the registration message.

Output from executable

The executable must output a well-formed XML document to stdout. This output document will be delivered into the aggregator framework.
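
The input and output contract described above can be illustrated with a small probe. This is a minimal sketch in Python, not one of the supplied probes. The output element names (ProbeReport, ServiceURL, InputDocuments) are hypothetical, and the way the two stdin documents are separated is an assumption: the sketch strips any XML declarations and wraps the remainder in a temporary root element so that both documents can be parsed together.

```python
#!/usr/bin/env python3
import re
import sys
import xml.etree.ElementTree as ET


def split_documents(stream_text):
    """Return the root elements of the XML documents sent on stdin."""
    # Assumption: documents arrive back to back, each optionally preceded
    # by an XML declaration, which must go before they can be wrapped.
    body = re.sub(r"<\?xml[^>]*\?>", "", stream_text)
    wrapper = ET.fromstring("<wrapper>" + body + "</wrapper>")
    return list(wrapper)


def build_report(service_url, documents):
    """Build the well-formed XML document that is written to stdout."""
    root = ET.Element("ProbeReport")  # hypothetical element name
    ET.SubElement(root, "ServiceURL").text = service_url
    ET.SubElement(root, "InputDocuments").text = str(len(documents))
    return ET.tostring(root, encoding="unicode")


def main(argv, stdin, stdout):
    # argv[1] is the URL taken from the EPR of the registered service.
    service_url = argv[1]
    # First document: the full EPR. Second: the AggregatorConfig block.
    documents = split_documents(stdin.read())
    stdout.write(build_report(service_url, documents) + "\n")
    return 0


# Without the URL argument there is nothing to probe, so the script
# does nothing in that case.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv, sys.stdin, sys.stdout)
```

A real probe would replace build_report with whatever measurement it performs; the only requirement stated above is that stdout carries one well-formed XML document.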

Configuration interface

Configuration overview

An Aggregating Service Group is configured to perform a data aggregation by specifying an AggregatorContent object as the content parameter of a ServiceGroup add method invocation.  An AggregatorContent object is composed of two xsd:any arrays: AggregatorConfig and AggregatorData.

AggregatorConfig is used to specify parameters that are to be passed to the underlying AggregatorSource when the ServiceGroup add method is invoked.  These parameters are generally type-specific to the implementation of the AggregatorSource and/or AggregatorSink being used.

The AggregatorData xsd:any array is used as the storage location for aggregated data that is the result of message deliveries to the AggregatorSink.  Generally, the AggregatorData parameter of the AggregatorContent is not populated when the ServiceGroup add method is invoked, but rather is populated by message delivery from the AggregatorSource.

Syntax of the interface

The basic structure of the AggregatorContent type is defined in the file aggregator-types.xsd, the relevant fragment of which is shown below. In addition, there are per-source and per-sink configuration elements, which should be placed in the AggregatorConfig element of a registration if the appropriate source or sink is being used. These are detailed in a table below.

<xsd:complexType name="AggregatorConfig">
  <annotation><documentation>
    This type encapsulates multiple arbitrary aggregator configuration data
  </documentation></annotation>
  <xsd:sequence>
    <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="AggregatorData">
  <annotation><documentation>
    This type encapsulates multiple arbitrary aggregated content data.
  </documentation></annotation>
  <xsd:sequence>
    <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="AggregatorContent">
  <annotation><documentation>
    This type encapsulates the Aggregator's ServiceGroup content element, 
    which is composed of two xsd:any arrays, one storing the aggregator 
    configuration, the other storing the aggregated data. 
   </documentation></annotation>
  <xsd:sequence>
    <xsd:element name="AggregatorConfig"
                 type="tns:AggregatorConfig" 
                 minOccurs="1" maxOccurs="1"/> 
    <xsd:element name="AggregatorData"
                 type="tns:AggregatorData"
                 minOccurs="1" maxOccurs="1"/> 
    </xsd:sequence>
</xsd:complexType> 
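
The AggregatorContent sequence above requires exactly one AggregatorConfig followed by exactly one AggregatorData. A registration can be checked against that shape before it is submitted. The following is a minimal sketch in Python; the function name is hypothetical, and it assumes the types live in the http://mds.globus.org/aggregator/types namespace that the agg prefix is bound to in the configuration examples in this guide.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the aggregator types (bound to the "agg" prefix
# in the registration examples).
AGG_NS = "http://mds.globus.org/aggregator/types"

EXPECTED_CHILDREN = [
    "{%s}AggregatorConfig" % AGG_NS,
    "{%s}AggregatorData" % AGG_NS,
]


def has_required_children(content):
    """Return True if content holds one AggregatorConfig, then one AggregatorData."""
    # An xsd:sequence of two elements, each with minOccurs=1 and
    # maxOccurs=1, allows exactly these two children in this order.
    return [child.tag for child in content] == EXPECTED_CHILDREN
```

This only checks the outer structure; the contents of AggregatorConfig are source and sink specific and are not inspected.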

Specifying the Aggregator Source

The aggregation source used to collect data can be changed from the default by editing the aggregatorSource parameter in the index configuration in $GLOBUS_LOCATION/etc/globus_wsrf_mds_index/jndi-config.xml:

  <resource name="configuration"
               type="org.globus.mds.index.impl.IndexConfiguration">
    <resourceParams>
      <parameter>
        <name>factory</name>
        <value>org.globus.wsrf.jndi.BeanFactory</value>
      </parameter>
      <parameter>
        <name>aggregatorSource</name>
        <value>org.globus.mds.aggregator.impl.QueryAggregatorSource</value>
      </parameter>
    </resourceParams>
  </resource>

This parameter specifies a Java class that will be used to collect data for the index. By default it is set to the QueryAggregatorSource. It can be changed to one of the other sources supplied with the toolkit, or to one installed later. Details of the supplied sources are in the Aggregator developer guide.

Configuring the Aggregator Source

Configuration options are specified by creating a configuration file and running mds-servicegroup-add to perform the registrations specified in that configuration file. The syntax of that file is:
<?xml version="1.0" encoding="UTF-8" ?>
<ServiceGroupRegistrations
  xmlns="http://mds.globus.org/servicegroup/client" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xmlns:agg="http://mds.globus.org/aggregator/types">

   <defaultServiceGroupEPR>
      Default service group EPR
   </defaultServiceGroupEPR>

   <defaultRegistrantEPR>
      Default registrant EPR
   </defaultRegistrantEPR>

   <defaultSecurityDescriptorFile>
      Path name of security descriptor file
   </defaultSecurityDescriptorFile>

   <!-- One or more of the following: -->
   <ServiceGroupRegistrationParameters>
      <ServiceGroupEPR>
         EPR of the service group to register to
      </ServiceGroupEPR>
      <RegistrantEPR>
         EPR of the entity to be monitored.
      </RegistrantEPR>
      <InitialTerminationTime>
         Initial termination time
      </InitialTerminationTime>
      <RefreshIntervalSecs>
         Refresh interval, in seconds
      </RefreshIntervalSecs>
      <Content>
         Aggregator-source-specific configuration parameters
      </Content>
   </ServiceGroupRegistrationParameters>

</ServiceGroupRegistrations>

Each ServiceGroupRegistrationParameters block specifies the parameters used to register a resource to a service group. The parameters specified in this block are:

  • ServiceGroupEPR - The EPR of the service group to register to. This parameter may be omitted if a defaultServiceGroupEPR block is specified; in this case, the value of defaultServiceGroupEPR will be used instead.
  • RegistrantEPR - The EPR of the resource to register. This parameter may be omitted if a defaultRegistrantEPR block is specified; in this case, the value of defaultRegistrantEPR will be used instead.
  • InitialTerminationTime - The initial termination time of this registration (this may be omitted).
  • RefreshIntervalSecs - The refresh interval, in seconds.
  • Content - Aggregator-source-specific registration parameters. The content blocks for the various aggregator sources are described in detail in the following sections.

The defaultServiceGroupEPR block provides a convenient way to register a number of resources to a single service group -- for example, if you wish to register several resources to your default VO index, you can specify that index as the default service group and omit the ServiceGroupEPR blocks from each ServiceGroupRegistrationParameters block.

The defaultRegistrantEPR block provides a convenient way to register a single resource to several service groups -- for example, if you wish to register your local GRAM server to several index servers, you can specify your GRAM server as the default registrant and omit the RegistrantEPR blocks from each ServiceGroupRegistrationParameters block.
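
The way the two default blocks fill in omitted values can be sketched in a few lines. This is an illustration in Python of the rule described above, not the logic used by mds-servicegroup-add itself; the function names are hypothetical, and EPRs are treated as opaque text so that no particular EPR layout is assumed.

```python
import xml.etree.ElementTree as ET

# Default namespace of the registration file, taken from the syntax above.
NS = {"c": "http://mds.globus.org/servicegroup/client"}


def _text(element):
    # Flatten an EPR element to its text content; the EPR is kept opaque.
    return "".join(element.itertext()).strip()


def resolve_registrations(config_xml):
    """Return (service group EPR, registrant EPR) pairs with defaults applied."""
    root = ET.fromstring(config_xml)
    default_group = root.find("c:defaultServiceGroupEPR", NS)
    default_registrant = root.find("c:defaultRegistrantEPR", NS)
    resolved = []
    for block in root.findall("c:ServiceGroupRegistrationParameters", NS):
        group = block.find("c:ServiceGroupEPR", NS)
        if group is None:
            group = default_group  # fall back to the file-wide default
        registrant = block.find("c:RegistrantEPR", NS)
        if registrant is None:
            registrant = default_registrant
        if group is None or registrant is None:
            raise ValueError("registration has no EPR and no default to fall back on")
        resolved.append((_text(group), _text(registrant)))
    return resolved
```

Given a file with one defaultServiceGroupEPR and three registration blocks that each carry only a RegistrantEPR, the function returns three pairs that share the same service group value.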

ServiceGroupRegistration Content Blocks for QueryAggregatorSource

The QueryAggregatorSource can use one of the following three configuration blocks.
GetResourcePropertyPollType
If a GetResourcePropertyPollType block is used, QueryAggregatorSource will request a single resource property. The block has this form:
   <Content xsi:type="agg:AggregatorContent"
      xmlns:agg="http://mds.globus.org/aggregator/types">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:GetResourcePropertyPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:ResourcePropertyName>rp_namespace:rp_localname</agg:ResourcePropertyName>
         </agg:GetResourcePropertyPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds; the ResourcePropertyName parameter is the QName of the resource property to poll for.
GetMultipleResourcePropertiesPollType
If a GetMultipleResourcePropertiesPollType block is used, QueryAggregatorSource will request one or more resource properties. The block has this form:
   <Content
        xmlns:agg="http://mds.globus.org/aggregator/types"
        xsi:type="agg:AggregatorContent">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:GetMultipleResourcePropertiesPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:ResourcePropertyNames>rp1_namespace:rp1_localname</agg:ResourcePropertyNames>
            <agg:ResourcePropertyNames>rp2_namespace:rp2_localname</agg:ResourcePropertyNames>
            <agg:ResourcePropertyNames>rp3_namespace:rp3_localname</agg:ResourcePropertyNames>
         </agg:GetMultipleResourcePropertiesPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds; the ResourcePropertyNames parameters are the QNames of the resource properties to poll for. There is no limit on the number of ResourcePropertyNames that may be specified.
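
When many resources are registered, blocks of this form are often generated rather than written by hand. The following is a minimal sketch in Python that renders a GetMultipleResourcePropertiesPollType block from a poll interval and a list of QNames. The function name and the prefixes argument are hypothetical. The sketch declares the xsi prefix and the property prefixes on the Content element so that the fragment is well-formed on its own; in a full registration file some of these declarations may already be present on the root element.

```python
from xml.sax.saxutils import escape, quoteattr

AGG_NS = "http://mds.globus.org/aggregator/types"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

TEMPLATE = (
    '   <Content\n'
    '        xmlns:agg={agg}\n'
    '        xmlns:xsi={xsi}{declarations}\n'
    '        xsi:type="agg:AggregatorContent">\n'
    '      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">\n'
    '         <agg:GetMultipleResourcePropertiesPollType>\n'
    '            <agg:PollIntervalMillis>{interval}</agg:PollIntervalMillis>\n'
    '{names}'
    '         </agg:GetMultipleResourcePropertiesPollType>\n'
    '      </agg:AggregatorConfig>\n'
    '      <agg:AggregatorData/>\n'
    '   </Content>\n'
)


def multiple_rp_poll_block(interval_ms, property_qnames, prefixes):
    """Render a Content block polling the given resource properties.

    prefixes maps each prefix used in property_qnames to its namespace URI.
    """
    if interval_ms <= 0:
        raise ValueError("poll interval must be a positive number of milliseconds")
    if not property_qnames:
        raise ValueError("at least one resource property is required")
    for qname in property_qnames:
        prefix, colon, _local = qname.partition(":")
        if not colon or prefix not in prefixes:
            raise ValueError("QName %r needs a declared prefix" % qname)
    # One xmlns declaration per property prefix, in a stable order.
    declarations = "".join(
        "\n        xmlns:%s=%s" % (prefix, quoteattr(uri))
        for prefix, uri in sorted(prefixes.items()))
    names = "".join(
        "            <agg:ResourcePropertyNames>%s</agg:ResourcePropertyNames>\n"
        % escape(qname) for qname in property_qnames)
    return TEMPLATE.format(agg=quoteattr(AGG_NS), xsi=quoteattr(XSI_NS),
                           declarations=declarations,
                           interval=int(interval_ms), names=names)
```

Each entry in property_qnames becomes one ResourcePropertyNames element, which matches the rule that any number of them may be specified.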
QueryResourcePropertiesPollType
If a QueryResourcePropertiesPollType block is used, QueryAggregatorSource will request that a query be executed against the Resource Property Set of the remote resource. In the GT 3.9.4 implementation of core, the only query language that is supported is XPath. The block has this form:
   <Content
        xmlns:agg="http://mds.globus.org/aggregator/types"
        xsi:type="agg:AggregatorContent">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:QueryResourcePropertiesPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:QueryExpression Dialect="dialect">
               Query Expression
            </agg:QueryExpression>
         </agg:QueryResourcePropertiesPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds. The QueryExpression is an xsd:any element; the Dialect attribute specifies the dialect of the query expression.

ServiceGroupRegistration Content Blocks for SubscriptionAggregatorSource

The SubscriptionAggregatorSource gathers resource property values from the registered resource using WS-Notification subscriptions. The configuration block for SubscriptionAggregatorSource looks like this:

   <Content
        xmlns:agg="http://mds.globus.org/aggregator/types"
        xsi:type="agg:AggregatorContent">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:AggregatorSubscriptionType>
             <TopicExpression Dialect="dialect">
                Topic Expression
             </TopicExpression>
             <Precondition Dialect="dialect">
                Precondition
             </Precondition>
             <Selector Dialect="dialect">
                Selector
             </Selector>
             <SubscriptionPolicy>
                Subscription Policy
             </SubscriptionPolicy>
             <InitialTerminationTime>time</InitialTerminationTime>
         </agg:AggregatorSubscriptionType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The only required parameter is the TopicExpression, which specifies the topic expression to use in the subscription request. [TODO: link to generic notification/subscription docs].

ServiceGroupRegistration Content Blocks for ExecutionAggregatorSource

The ExecutionAggregatorSource gathers arbitrary XML information about a registered resource by executing an external script and passing it registration information as input. The configuration block for ExecutionAggregatorSource looks like this:
   <Content xsi:type="agg:AggregatorContent"
      xmlns:agg="http://mds.globus.org/aggregator/types">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:ExecutionPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:ProbeName>dummy_namespace:filename</agg:ProbeName>
         </agg:ExecutionPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds. The ProbeName parameter specifies the path name to the executable file, relative to the $GLOBUS_LOCATION/libexec/aggrexec directory. The path name should be specified as the local name part of this QName; the namespace part is ignored.
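
The mapping from a ProbeName value to a file on disk can be sketched as follows. This is an illustration in Python of the rule described above, not the toolkit's own code; the function name is hypothetical, and the rejection of names that would escape the aggrexec directory is an extra precaution added in this sketch, not documented behaviour.

```python
from pathlib import PurePosixPath


def probe_executable(probe_name, globus_location):
    """Map a ProbeName QName to a path under libexec/aggrexec."""
    # Only the local part of the QName is used; the prefix is ignored.
    local_name = probe_name.rpartition(":")[2]
    base = PurePosixPath(globus_location) / "libexec" / "aggrexec"
    relative = PurePosixPath(local_name)
    # Precaution added by this sketch: refuse empty names, absolute
    # paths, and paths that climb out of the aggrexec directory.
    if not local_name or relative.is_absolute() or ".." in relative.parts:
        raise ValueError("probe name must be a path inside aggrexec: %r" % probe_name)
    return str(base / relative)
```

With globus_location set to the value of $GLOBUS_LOCATION, the ProbeName dummy_namespace:filename from the block above resolves to the file filename in the libexec/aggrexec directory.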

Configuring the Aggregator Sink

An aggregator sink may require sink-specific configuration (the MDS Trigger service requires sink-specific configuration; the MDS Index service does not). See the documentation for the specific aggregator service being used for details on sink-specific configuration.

Environment variable interface

There are no environment variables specific to the aggregator.