Components for Grid Monitoring and Discovery
Grid systems and applications are usually intended to be "persistent": users expect them to be available on an ongoing basis. Individual parts of the infrastructure (computational elements, storage systems, instruments, telepresence sites, and so on) may be taken offline or added dynamically, but the system or application as a whole should remain available, adapting gracefully to the changing availability of those elements.
This model of availability creates requirements for monitoring and discovering infrastructure elements and services. The OGSA architecture and the Globus Toolkit provide a core architecture and an implementation (respectively) for publishing, locating, and subscribing to information. The Grid community has developed several specialized systems for system monitoring that are useful both as stand-alone mechanisms and as elements within the OGSA architecture.
Related solutions: The Solutions section of this website provides examples of these components being used in scientific projects. See especially the solution titled A Monitoring System for the Earth System Grid.
Basic Monitoring and Discovery Mechanisms
The OGSA architecture and the Globus Toolkit provide a core architecture and an implementation (respectively) for publishing, locating, and subscribing to information.
- WS Core Monitoring Features - Uniform mechanisms for obtaining status details from Web services and for subscribing to properties of interest
- Globus Index Service - A collection point for status information from other services on a Grid
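The publish, locate, and subscribe pattern behind these two components can be illustrated with a small in-memory sketch. This is not the Globus Toolkit API: the class `ToyIndex`, its methods, and the property name `free_cpus` are hypothetical names chosen for illustration, and a real index service would collect properties from remote Web services rather than from local calls.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# Callback signature: (service name, property name, new value)
Callback = Callable[[str, str, Any], None]


class ToyIndex:
    """Hypothetical in-memory stand-in for an index service (not a Globus API)."""

    def __init__(self) -> None:
        self._properties: Dict[Tuple[str, str], Any] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def publish(self, service: str, prop: str, value: Any) -> None:
        """Record a property value; notify subscribers only if it changed."""
        key = (service, prop)
        changed = key not in self._properties or self._properties[key] != value
        self._properties[key] = value
        if changed:
            for callback in self._subscribers[prop]:
                callback(service, prop, value)

    def locate(self, prop: str, predicate: Callable[[Any], bool]) -> List[str]:
        """Return sorted names of services whose property satisfies predicate."""
        return sorted(
            service
            for (service, name), value in self._properties.items()
            if name == prop and predicate(value)
        )

    def subscribe(self, prop: str, callback: Callback) -> None:
        """Register a callback for changes to the named property."""
        self._subscribers[prop].append(callback)


def demo() -> Tuple[List[str], List[Tuple[str, str, Any]]]:
    """Publish a few values, then return (services with free CPUs, notifications)."""
    index = ToyIndex()
    events: List[Tuple[str, str, Any]] = []
    index.subscribe("free_cpus", lambda s, p, v: events.append((s, p, v)))
    index.publish("cluster-a", "free_cpus", 16)
    index.publish("cluster-b", "free_cpus", 0)
    index.publish("cluster-a", "free_cpus", 16)  # unchanged, so no notification
    available = index.locate("free_cpus", lambda n: n > 0)
    return available, events
```

In this sketch a client that needs compute capacity calls `locate` once to find candidates and uses `subscribe` to learn when availability changes, instead of polling each service itself.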
Specialized Monitoring and Discovery Components
The Grid community has developed several specialized systems for system monitoring that are useful both as stand-alone mechanisms and as elements within the OGSA architecture.
- Globus Trigger Service - A service that monitors WSRF resource properties and, when preconfigured patterns are matched, triggers actions
- Ganglia Cluster Toolkit - A toolkit that specializes in collecting monitoring data from clusters and hierarchical aggregations of clusters
- Inca - A generic framework for automated testing, verification, and monitoring of service-level agreements
- MonALISA - A distributed monitoring tool that uses proxy services so it can operate across firewalls and that supports a wide variety of client interfaces, including Jini and WAP
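The trigger idea listed above (watch properties, match a preconfigured pattern, run an action) can be sketched as follows. This is an assumption-laden illustration, not the Globus Trigger Service configuration or API: `TriggerRule`, `evaluate`, and the property names `queue_length` and `disk_free_gb` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Properties = Dict[str, Any]


@dataclass
class TriggerRule:
    """Hypothetical rule: run `action` whenever `condition` matches."""
    name: str
    condition: Callable[[Properties], bool]
    action: Callable[[Properties], None]


def evaluate(rules: List[TriggerRule], properties: Properties) -> List[str]:
    """Check every rule against one property snapshot; return names that fired."""
    fired: List[str] = []
    for rule in rules:
        if rule.condition(properties):
            rule.action(properties)
            fired.append(rule.name)
    return fired


def demo() -> List[str]:
    """Evaluate two example rules against a single snapshot; return alert messages."""
    alerts: List[str] = []
    rules = [
        TriggerRule(
            name="queue-backlog",
            condition=lambda p: p.get("queue_length", 0) > 100,
            action=lambda p: alerts.append(f"queue length is {p['queue_length']}"),
        ),
        TriggerRule(
            name="disk-low",
            condition=lambda p: p.get("disk_free_gb", float("inf")) < 10,
            action=lambda p: alerts.append(f"only {p['disk_free_gb']} GB free"),
        ),
    ]
    snapshot = {"queue_length": 250, "disk_free_gb": 500}
    evaluate(rules, snapshot)
    return alerts
```

Keeping the condition and the action as separate callables means that new patterns can be configured without changing the evaluation loop. A production service would typically run an external program or send a notification as its action.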