MOPS > GFork Master Plugin for GridFTP

This information is provided as a preview only and will be moved to development documentation for the next dev release (4.1.3).

This page describes the GFork plugin for GridFTP as well as configuring for striped servers and memory management. For more information about GFork, click here.

Running the Globus GridFTP Server With GFork

Running the globus-gridftp-server under GFork is almost identical to running it under xinetd. First, you need a configuration file:

service gridftp2
{
    env += GLOBUS_LOCATION=<path to GL>
    env += LD_LIBRARY_PATH=<path to GL>/lib
    server = <path to GL>/sbin/globus-gridftp-server
    server_args = -i 
    server_args += -d ALL -l <path to GL>/var/gridftp.log
    port = 5000
}

That portion is identical to xinetd. In fact, an existing xinetd configuration file should work.

When running GridFTP out of GFork, the server should be run with a master program. The master program provides enhanced functionality such as dynamic backend registration for striped servers, managed system memory pools and internal data monitoring for both striped and non-striped servers. For information about what a GFork master program is, click here. To run with a master program, the following two lines are needed in the config file.

master = <path to GL>/libexec/gfs-gfork-master
master_args = <options>

The first line tells GFork what master program to use (for the GridFTP server, we use gfs-gfork-master). The second line provides options to the master program

The full list of master options are as follows (this is to date only, run the program with --help for newer options):


-b | --reg-cs <contact string>
Contact to the frontend registry. This option makes it a data node.
-df | --dn-file <path>
Path to a file containing the list of acceptable DNs. Default is system gridmap file.
-G | --gsi <bool>
Enable or disable GSI. Default is on.
-h | --help
Print the help message.
-l | --logfile <path>
Path to the logfile.
-p | --port <int>
Port where the server listens for connections.
-s | --stripe-count <int>
The maximum number of stripes to give to each server. A value of 0 indicates all stripes are available.
-u | --update-interval <int>
Number of seconds between registration updates.

Once you have a configuration file, run gfork with:

% gfork -c <path to config file>

Striped Servers

GridFTP offers a powerful enhancement called striped servers. In this mode a GridFTP server is set up with a single frontend and one or more backends. All of the backends work in concert to transfer a single file and thereby achieve high throughput rates. More information can be found about striped servers on the GridFTP release manuals. Here we describe how to configure one frontend and multiple backends for use as a striped server.

Frontend Configuration

The frontend server described here is run using dynamic backends. We need additional options for both the GridFTP server and the master program. The following lines are added to the config file:

server_args += -dsi remote 
master_args = -port 8588
master_args += -df <path to gridmap file>

The first line is an additional argument to the GridFTP server. It tells the server that it will be operating in split mode (separate frontend and backend processes) and that it will be using the frontend. (Specifically it tells the server to use the 'remote' DSI).

The second line tells the master program on which port it should listen for backend registrations. Backend services can then connect to this port to notify the frontend of their existence. By default, a registration is good for 10 minutes, but a backend is free to refresh its registration. In this way, a frontend is provided with the list of possible backends (stripes) which may be used for a transfer.

The third line provides the master program with a list of authorized DNs. Each line in the file must contain a GSI DN (certificate subject). In order to register, the backend must authenticate and provide its DN. The provided DN is checked against this file. In other words, the file is a list of DNs that may register with the frontend. If the master program is not given a -df option and is given the -G option, then there is no registration security at all.

Backend Configuration

Any striped server setup can have more than one backend service. Furthermore, any one computer can run multiple backends. The following explains how to set up a backend server. These steps should be repeated for each needed backend instance.

A backend server may also be run with GFork, it just needs different options for both the GridFTP server and the master program. A sample backend config file is shown here:

service gridftp2
{
    env += GLOBUS_LOCATION=<path to GL>
    env += LD_LIBRARY_PATH=<path to GL>/lib
    
    server = <path to GL>/sbin/globus-gridftp-server
    server_args = -i  
    server_args += -dn
    master = <path to GL>/libexec/gfs-gfork-master
    master_args = -b localhost:8588
}

Notable additions to this file are:

server_args += -dn
master_args = -b localhost:8588

The first line tells the GridFTP server that it will be a 'data node', which is another name for a backend.

The second line tells the master program two things, first that it will be a master of a data node, and second what the frontend's registration contact point is. Note that in our example we have a hostname of 'localhost' and a port of '8588'. 8588 is (and must be) the same port that was provided to the frontend's master program in the previous step.

Once the configuration file is complete, run GFork again as follows:

% gfork -c <conf file>

This will start up the data node and the master program will register itself to the frontend and refresh its registration every 5 minutes (default setting).

GridFTP with Memory Management

Another feature of the GridFTP GFork plugin is memory usage limiting. Under extreme client loads, it is possible that GridFTP servers require more memory than the system has available. Due to a common kernel memory allocation scheme known as optimistic provisioning, this situation can lead to a full consumption of memory resources and thus trigger the out of memory handler. The OOM handler will kill processes in a difficult-to-predict way in order to free up memory. This will leave the system in an unpredicatable and unstable state; obviously, this is a situation that we want to avoid.

To control this situation, the GridFTP GFork plugin has a memory limiting option. This will attempt to limit memory usage to a given value or to the maximum amount of RAM in the system. Most of the memory is given to the first few connections, but when the plugin detects that it is overloaded, each session is limited to half the available memory.

To enable this feature, one of two options must be passed to the master program via the master_args in the config file:

-m
Limits memory consumption to amount of RAM in the system.
-M <formated int>
Limits memory to the given value.

Another important option should be provided in the GFork config file: instance. When a client connects to GFork, a GridFTP server instance is executed. This instance requires a certain amount of RAM. If connections are coming in too fast, this can act as a DOS attack. Limiting the number of allowed simultaneous connections will help the memory management algorithm do its job. This limit is set with:

instance = <int>

We recommend a value of 100 or |RAM|/2M, whichever is smaller.

The following is an example of a GFork configuration file with memory limiting enabled:

service gridftp2
{
    instance = 100  
    env += GLOBUS_LOCATION=<path to GL>
    env += LD_LIBRARY_PATH=<path to GL>/lib
    server = <path to GL>/sbin/globus-gridftp-server
    server_args = -i
    server_args += -dn
    master = <path to GL>/libexec/gfs-gfork-master
    master_args = -M 512M
}