Accessing and using crill

The Crill cluster

The cluster consists of a login node (crill.cs.uh.edu) and a number of compute nodes that are grouped into units called "partitions"; e.g. in the crill cluster, the 48-core Opteron nodes form a partition called "crill". The crill and whale clusters share home directories, but are otherwise separate. The only way to access either cluster from the outside world is via ssh. If you would like to get an account, please contact gabriel [at] cs.uh.edu.
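
For example, assuming you already have an account with the (placeholder) username jdoe, you would log in with:

    ssh jdoe@crill.cs.uh.edu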

Login Node Usage

The login nodes are to be used for editing, compiling and similar activities. They are not to be used for running jobs such as parallel programs. Program runs are submitted through the SLURM scheduler.

What is SLURM?

SLURM is the "Simple Linux Utility for Resource Management". It is a job scheduler that is easy to use, but omits some of the features of more sophisticated schedulers (e.g. SGE, LSF, Moab).

Rather than running programs directly on the cluster, you submit jobs to SLURM, which then allocates resources to the job. A job is essentially a box containing a request for resources and a program to run using those resources. Requests can vary from very general (e.g. "give me 4 cores from anywhere") to very specific (e.g. "give me 2 nodes exclusively, each with at least 24 x86_64 cores and at least 1 GPGPU"), depending on what you want to do.
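
As a rough illustration of the two extremes (the exact flags depend on how the cluster is configured, and the GPU resource name below is an assumption), the requests above could be expressed roughly as:

    # "give me 4 cores from anywhere"
    salloc -n 4

    # "give me 2 exclusive nodes with at least 24 cores each and 1 GPGPU each"
    # (assumes a GPU generic resource has been configured in SLURM)
    salloc -N 2 --exclusive --ntasks-per-node=24 --gres=gpu:1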

System Query

The current state of SLURM can be queried via 3 basic commands.

Command     Description
sinfo       state of partitions
squeue      what jobs are running and/or waiting now
scontrol    show configuration of the cluster
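
For instance, a quick way to check the crill partition and your own jobs (jdoe is a placeholder username) is:

    sinfo -p crill              # nodes and their state in the crill partition
    squeue -u jdoe              # jobs belonging to user jdoe
    scontrol show partition     # detailed configuration of all partitions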

Job Control

Jobs are submitted from the login node and run on one or more compute nodes. A job runs until it terminates in some way, e.g. normal completion, timeout, abort, or the end of the world.

Command     Description
salloc      submit an interactive (hands-on) job
sbatch      submit a batch (hands-off) job
scancel     terminate a job
sattach     attach a terminal to a running batch job
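
As a minimal sketch (the script name, partition, task count, time limit and program name are assumptions to adapt to your own job), a batch script might look like:

    #!/bin/bash
    #SBATCH --job-name=test
    #SBATCH --partition=crill
    #SBATCH --ntasks=4
    #SBATCH --time=00:10:00

    ./my_program

Submitting it with "sbatch job.sbatch" prints a job ID, which can later be passed to scancel to terminate the job early.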

Note on MPI jobs

salloc and sbatch allow you to run different types of jobs. In both cases, the program that is run (or indeed, multiple programs) can be of various kinds; the most common are probably sequential and MPI. Crill comes with Open MPI and MVAPICH installed and configured globally for all users. You can switch between the MPI libraries using the module command. An example using Open MPI is shown later. The command ompi_info shows how Open MPI has been configured.
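
As a sketch (the exact module names available on crill are an assumption; run "module avail" to see what is actually installed), switching MPI libraries and inspecting the Open MPI configuration might look like:

    module avail                # list available modules
    module load openmpi         # or e.g.: module load mvapich2
    ompi_info | head            # summary of how Open MPI was configured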

Examples