

Maui notes

As mentioned in the TORQUE notes page, a scheduler is needed if you want more than a simple batch queue in your HPC environment. That is why we use Maui, an open-source scheduler.

Installation

After downloading Maui from its website, unpack it and run configure:

# tar zxf maui-3.2.6p17.tar.gz
# cd maui-3.2.6p17
# ./configure --with-pbs=$TORQUE_HOME

Where $TORQUE_HOME is the path of your TORQUE installation, if it is not in /usr/local.
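
If you are not sure which path to pass, one quick check (assuming the TORQUE binaries are in your PATH; /opt/torque below is just an example path) is to locate pbs_server and take its installation prefix:

# which pbs_server
/opt/torque/sbin/pbs_server
# ./configure --with-pbs=/opt/torque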

Important

For some reason, the --prefix parameter of the configure script is not respected. When using a prefix other than /usr/local/maui, it installs part of the files in the user-defined directory but still insists on installing some files in /usr/local/maui (the developers hardcoded that path). So it is better not to use --prefix at all.

After that, just compile and install it:

# make
# make install

In the configuration file, /usr/local/maui/maui.cfg, I needed to make a small change (I don't know exactly why). I changed the following line:

RMCFG[HOSTNAME] TYPE=PBS@RMNMHOST@

to:

RMCFG[HOSTNAME] TYPE=PBS
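
To double-check that Maui actually picked up the change, you can dump the active configuration with the showconfig command (assuming the default install path) and look at the RMCFG entry:

# /usr/local/maui/bin/showconfig | grep RMCFG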

After that, let's make sure TORQUE is not running its own simple native scheduler (you may also want to remove it from the init scripts if it is there):

# pkill pbs_sched
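
If your distribution uses SysV-style init scripts and a pbs_sched init script was installed (this depends on how TORQUE was packaged on your system), disabling it could look like:

# chkconfig pbs_sched off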

And start the Maui daemon (you may also want to add it to your system's init scripts):

# /usr/local/maui/sbin/maui
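
To confirm the daemon came up and can answer queries, a quick check is Maui's showq command, which should print an (initially empty) job listing without errors:

# /usr/local/maui/bin/showq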

Logging

To make sure Maui is communicating with TORQUE, I preferred to decrease TORQUE's scheduler_iteration parameter (with qmgr):

# qmgr -c "set server scheduler_iteration = 30"

And watched the logs in /usr/local/maui/log/maui.log. TORQUE jobs submitted with qsub should appear in the logs on each iteration.
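
A simple way to exercise this is to tail the log while submitting a short throwaway job as a regular user (TORQUE usually rejects jobs submitted by root):

# tail -f /usr/local/maui/log/maui.log
$ echo "sleep 60" | qsub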

Operation

Maui holds

Besides TORQUE holds (see my TORQUE notes), there are also Maui holds. You can check whether a hold exists for any job with the checkjob command. Example:

# checkjob 27332

(...)

Holds:    Defer
Messages:  exceeds available partition procs
PE:  8.00  StartPriority:  4000
cannot select job 27332 for partition DEFAULT (job hold active)

In this case we see it is deferred because I had to turn off nodes and did not stop the scheduler (that can be done with schedctl -s). So TORQUE tried to run the jobs through Maui, and Maui rejected them, putting a hold on them.

After turning the nodes back on, it was necessary to release the jobs with the releasehold command. Do not confuse it with TORQUE's qrls command.
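
For the example above, the commands involved look roughly like the following: schedctl -s pauses scheduling before the maintenance, schedctl -r resumes it afterwards, and releasehold clears the defer hold that was placed on the job (the -a flag releases all hold types):

# schedctl -s
# schedctl -r
# releasehold -a 27332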

Reservations

Every job submission makes reservations on the nodes and processors tracked by Maui (as reported by TORQUE), so that different jobs don't use the same processors. All reservations can be checked with the showres command:

# showres
Reservations

ReservationID       Type S       Start         End    Duration    N/P    StartTime

29763                Job R -1:03:12:05  1:20:47:55  3:00:00:00    8/64   Tue Oct 14 10:24:29
29764                Job R -1:03:11:12  1:20:48:48  3:00:00:00    8/64   Tue Oct 14 10:25:22
29780                Job R    -9:27:48  1:02:32:12  1:12:00:00    1/1    Wed Oct 15 04:08:46
29781                Job R    -7:12:18  1:04:47:42  1:12:00:00    1/1    Wed Oct 15 06:24:16
29782                Job R    -5:22:34  1:06:37:26  1:12:00:00    1/1    Wed Oct 15 08:14:00
29783                Job R    -3:17:20  1:08:42:40  1:12:00:00    1/1    Wed Oct 15 10:19:14
29784                Job R    -1:45:40  1:10:14:20  1:12:00:00    1/1    Wed Oct 15 11:50:54
29785                Job R    -1:31:33  1:10:28:27  1:12:00:00    1/1    Wed Oct 15 12:05:01
29786                Job R    -1:31:33  1:10:28:27  1:12:00:00    1/1    Wed Oct 15 12:05:01
29807                Job R    -8:46:05  2:15:13:55  3:00:00:00    2/16   Wed Oct 15 04:50:29
29808                Job I  4:21:23:26  7:21:23:26  3:00:00:00    8/64   Mon Oct 20 12:00:00
29809                Job R    -1:54:34  1:22:05:26  2:00:00:00    1/8    Wed Oct 15 11:42:00
SYSTEM.0            User -  2:22:23:26  4:21:23:26  1:23:00:00   22/176  Sat Oct 18 12:00:00

13 reservations located

Each job's reservation has the same ID as the job itself. But note that there is also a reservation called SYSTEM.0 that reserves all nodes (22) and all processors (176) of the current system for a given time interval. This was necessary because the cluster will be turned off during that interval and we don't want jobs to start executing within that period.

So we created a manual reservation with the setres command:

# setres -s 12:00:00_10/18 -e 12:00:00_10/20 ALL
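
Once the downtime is over (or if it is cancelled early), the reservation can be removed by its ID with the releaseres command, and showres should then no longer list it:

# releaseres SYSTEM.0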