Access to the nodes of the Beowulf cluster is managed by the Sun Grid Engine batch system. Grid Engine distributes submitted jobs across the nodes according to the current load of each node, the priority of the job, and the number of jobs the user already has running on the cluster (within the same priority level, queued jobs of users with fewer running jobs are preferred). Logging in directly to the nodes and running programs interactively is strongly discouraged, because it bypasses Grid Engine's monitoring of the nodes and can cause batch jobs to execute incompletely. Users who need interactive jobs can use the command qsh, which starts an xterm session through Grid Engine.
Programs cannot be submitted directly to Grid Engine. Instead they require a small shell script that acts as a wrapper for the program to be run. Note that the script must be executable. Check with the ls -l command: if the permission string at the start of the line for the script contains no x, the script is not executable, which can be changed with the command chmod +x <script name>. If the program requires interactive input (e.g. Genesis), the input has to be piped in, either with the echo command or from an external file. The minimal script genesis.sh to run Genesis would be:
#$ -S /bin/sh
echo "lcls.in" | ~/bin/genesis
Note that this is a specific case, which requires the genesis executable to be located in the bin directory of your home directory. After checking that the script runs correctly (typing ./genesis.sh at the prompt should execute genesis without an error), the job is submitted with the qsub command:
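A minimal sketch of the whole sequence, assuming the script is saved as genesis.sh in the current directory. The guard around qsub is an addition for illustration only, so that the lines can also be tried on a machine without Grid Engine:

```shell
# Write the wrapper script shown above into genesis.sh
cat > genesis.sh <<'EOF'
#$ -S /bin/sh
echo "lcls.in" | ~/bin/genesis
EOF

# Make it executable and confirm that the permission string contains an x
chmod +x genesis.sh
ls -l genesis.sh

# Submit the script only where Grid Engine is installed
if command -v qsub >/dev/null 2>&1; then
    qsub genesis.sh
else
    echo "qsub not found; submit from the cluster front end instead" >&2
fi
```

On the cluster itself the last block reduces to the single command qsub genesis.sh.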
The command qsub has many options, which should be set explicitly for each submitted job. There are three ways to do so, listed here in order of increasing priority (a setting of higher priority overrides the same option set at a lower priority): the .sge_request file in your home directory, option lines inside the script file, and command line arguments to qsub.
In every case an option starts with a minus sign and a keyword, followed if necessary by additional arguments. It is recommended to set the following options, preferably in the .sge_request file in the home directory:
|-cwd||Uses the directory from which the job was submitted as the working directory. Otherwise your home directory is used.|
|-C #$||Defines the character sequence in the script that marks additional options for submitting the job.|
|-A <login-name>||Defines the user account of the job owner. If not defined, it falls back to the user who submitted the job.|
|-j y||Merges the normal output of the job and any error messages into one file, typically named <job-name>.o<job-id>.|
|-m aes||Sun Grid Engine notifies the job owner by email when the job is aborted, completed or suspended.|
|-M <email-address>||The email address to which the notification is sent.|
|-p 0||The priority level of the submitted job. Grid Engine dispatches jobs with a higher priority to a node first.|
|-r||Forces Grid Engine to restart the job if the system crashes or is rebooted (note that this does not apply if the job itself crashes).|
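As an illustration, a defaults file built from some of the options in the table might look like the sketch below. The email address is a placeholder, and the file is written as sge_request.example (a name chosen here) so that an existing ~/.sge_request is not overwritten:

```shell
# Write a sample defaults file; one qsub option per line
cat > sge_request.example <<'EOF'
# Run jobs in the directory they were submitted from
-cwd
# Merge normal output and error messages into one file
-j y
# Send mail when the job is aborted, ends or is suspended
-m aes
-M jdoe@example.org
# Default priority
-p 0
EOF

cat sge_request.example
```

After replacing the placeholder, the file can be copied to .sge_request in the home directory.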
The following options should be set individually for each job, because they depend on the specific job and do not apply generally to all jobs.
|-N <job-name>||Defines a short name for the job to identify it in addition to the job ID. If omitted, the job name is the name of the shell script.|
|-o <outputfile>||Names the output file. If omitted, the output file name is <job-name>.o<job-id>.|
|-v <environment>||Normally environment variables defined in your .bash_profile or a related file are not exported to the node where the job runs. With this option Grid Engine sets the environment variable before starting the job.|
|-notify||If the code supports the signals SIGUSR1 and SIGUSR2, these signals are sent to the program before it is terminated by Grid Engine.|
|-pe <parallel environment>||Needed for executing parallel jobs.|
Use man qsub to see further options. All options can also be set interactively with the job submission feature of qmon.
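The job-specific options can be placed in the script itself on lines starting with #$. The following sketch writes such a script. The job name lcls_run1, the log file name and the variable GENESIS_INPUT are hypothetical names chosen for illustration, and the trap line is only one possible way of reacting to the -notify signals:

```shell
# Write a job script that carries its own qsub options
cat > lcls_run1.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N lcls_run1
#$ -o lcls_run1.log
#$ -v GENESIS_INPUT=lcls.in
#$ -notify
# With -notify the job is warned by SIGUSR1 or SIGUSR2 before it is stopped
trap 'echo "job is about to be stopped" >&2' USR1 USR2
echo "$GENESIS_INPUT" | ~/bin/genesis
EOF
chmod +x lcls_run1.sh

# A command line option overrides the same option set inside the script
if command -v qsub >/dev/null 2>&1; then
    qsub -N lcls_run2 lcls_run1.sh
fi
```

Submitted this way, the job would run under the name lcls_run2, because command line arguments have the highest priority.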
Once the job is submitted, a job ID is assigned and the job is placed in the queue. To see the status of the queue, use the command qstat, which prints a list of all running and pending jobs with the most important information (job ID, job owner, job name, status, node). More information on a specific job can be obtained with qstat -j <job-id>. The status of the job is indicated by one or more characters:
|t||-||transferring to a node|
|qw||-||waiting in the queue|
|d||-||marked for deletion|
|R||-||marked for restart|
Normally the status d is rarely seen with qstat. If a job stays in the queue marked for deletion for a long time, Grid Engine is not running properly; please inform the system administrator.
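For use in scripts, the status codes listed above can be turned into readable text with a small helper. This is a sketch: the function name describe_state is made up here, and it covers only the four codes from the list, reporting anything else unchanged:

```shell
# Translate a qstat status code into the description from the list above
describe_state() {
    case "$1" in
        t)  echo "transferring to a node" ;;
        qw) echo "waiting in the queue" ;;
        d)  echo "marked for deletion" ;;
        R)  echo "marked for restart" ;;
        *)  echo "other state: $1" ;;
    esac
}

describe_state qw
describe_state d
```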
To remove a job from the queue, use the command qdel, which only requires the job ID. A job can also be changed after it has been submitted with the qalter command. It works like the qsub command but takes the job ID instead of the shell script name.
The command qhost gives the status of all nodes. A load close to unity indicates that the machine is busy and most likely running a job (use the qstat command to check; if no job is listed, a user might have logged directly onto the node to run a job interactively).
Running a parallel job requires some additional information in the script. First, the option -pe has to be used to select the parallel environment. Right now only mpich is supported on the Beowulf cluster. The second mandatory argument of the -pe option is the number of requested nodes, which can also be given as a range. Sun Grid Engine tries to maximize this number. It is recommended to add this line to the shell script:
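Since qdel and qalter both need the job ID, it can be convenient to capture it at submission time. The sketch below assumes that qsub confirms a submission with a line of the form: Your job 1234 ("genesis.sh") has been submitted. The helper name extract_job_id and the new job name lcls_test are made up for illustration:

```shell
# Pull the numeric job ID out of the confirmation message printed by qsub
extract_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Run the Grid Engine commands only where they are installed
if command -v qsub >/dev/null 2>&1; then
    job_id=$(qsub genesis.sh | extract_job_id)
    qstat -j "$job_id"             # detailed information on the job
    qalter -N lcls_test "$job_id"  # change the job name after submission
    qdel "$job_id"                 # remove the job from the queue again
fi
```

If the confirmation message on the cluster is worded differently, the sed pattern has to be adapted.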
#$ -pe mpich N
where N is the desired number of nodes. Right now it is limited to 14, corresponding loosely to one job per node/CPU. If multiple instances per node are required, please contact the system administrator to increase the maximum number of slots.
The invocation of mpirun also requires some non-standard placeholders (environment variables), which are filled in by Grid Engine when the script is executed. The format is (one line!)
/usr/local/mpich/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines <path to mpi program + optional command line arguments>
Everything up to the path to the MPI program should be used as it is. $NSLOTS and $TMPDIR are defined by Sun Grid Engine. Note also that this script does not run correctly if it is executed directly. Further information on MPICH can be found here.
If a user has to run an interactive session (e.g. Oopics), they can log onto a node with the qsh command. Sun Grid Engine then marks that node as busy and does not submit any further jobs to it until the user has logged out. The command qstat shows INTERACTIVE as the job name, indicating that an interactive session is running on that node.
For now the command qsh is not working properly; the system administrator is working on a fix.
QMON is a graphical user interface that replaces all of the UNIX commands of Grid Engine (e.g. qsub, qdel ...). It is started by typing qmon at the command prompt, followed by a space and an ampersand, so that the prompt is not blocked. For the normal user only the first three buttons matter. They correspond to qstat, qhost and qsub, respectively. The usage is mostly intuitive, and you can also ask the system administrators for help. It is recommended to use the job submission panel at least once to define your default parameters and to save the settings. After filling out the parameters, press the 'Save Setting' button and name the file to be written. The generated file can be used as a template for .sge_request.
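Put together, a complete parallel job script might look like the following sketch. The request for 4 nodes and the program name ./my_mpi_program are placeholders. The -cwd option is included on the assumption that the program sits in the directory from which the job is submitted:

```shell
# Write a parallel job script; NSLOTS and TMPDIR stay unexpanded in the file
cat > parallel_job.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -pe mpich 4
# NSLOTS and TMPDIR are filled in by Grid Engine when the job starts
/usr/local/mpich/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./my_mpi_program
EOF
chmod +x parallel_job.sh

# Submit through Grid Engine; running the script directly does not work
if command -v qsub >/dev/null 2>&1; then
    qsub parallel_job.sh
fi
```

To request a range of nodes instead of a fixed number, the -pe line would take a form such as #$ -pe mpich 2-8.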
qsub: Submits a job to the queue. It requires a shell script that is wrapped around the program to be run. Options can be defined as command line arguments, in the script file or in the .sge_request file in your home directory. See above for more information.
qdel: Marks a job for deletion. It requires the job ID and not the job name, which can be ambiguous.
qalter: Changes the options of an already submitted job. The options are the same as for qsub, but it requires the job ID instead of the shell script name. If the job is already running, it will be restarted.
qstat: Shows the status of the queue, or of a specific job if it is specified with the -j <job-id> option.
qhost: Shows the status of the nodes.
qsh: Starts an xterm session through Grid Engine for interactive jobs.
qhold: Puts a job that has not been started yet on hold, so that Grid Engine does not schedule it for execution until the hold is removed. Requires the job ID as argument.
qrls: Releases a job from a hold. It is put back in the queue and scheduled for execution. Requires the job ID.
qmon: Interactive monitor of Sun Grid Engine.
More information can be obtained with the man command at the command prompt. The User and Administration Guide gives a complete description of Sun Grid Engine, most of which can also be found on the official homepage.