Submitting batch jobs
A simple example
Consider this job script (job1.sh) that runs Matlab on 1 CPU core:
#!/bin/bash
#$ -l h_rt=01:00:00
#$ -l h_data=4G
#$ -N job_name
#$ -cwd
#$ -o stdout.$JOB_ID
#$ -e stderr.$JOB_ID

source /u/local/Modules/default/init/modules.sh
module load matlab/R2020a

matlab -nodisplay -nojvm -nosplash -singleCompThread < svd.m
Since this is our first job script example, we will explain each component in detail. In later sections, we will consider other job types. You will find that submitting other job types is very similar to submitting this basic one.
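As a rough sketch of how a script like job1.sh might be submitted: the `#$` directive prefix suggests a Grid Engine style scheduler, where jobs are usually submitted with `qsub` and listed with `qstat`. That these commands are available under those names on this cluster is an assumption, and `submit_job` is a hypothetical helper introduced only for illustration.

```shell
#!/bin/bash
# Sketch: submit a job script and list your jobs.
# Assumes a Grid Engine style scheduler providing qsub and qstat
# (an assumption; the text above does not name the commands).

submit_job() {
    local script="$1"
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found: run this on a cluster login node" >&2
        return 1
    fi
    # qsub reports the assigned job ID; qstat then lists your jobs.
    qsub "$script" && qstat -u "$USER"
}

submit_job job1.sh || true
```

The guard only makes the sketch safe to run on a machine without the scheduler installed; on the cluster itself, typing `qsub job1.sh` directly would be the usual route.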
Job parameter
Option | Purpose |
---|---|
-l h_rt=01:00:00 | wall-clock time limit |
-l h_data=4G | memory size |
-N job_name | job name (optional) |
-cwd | use the current working directory |
-o stdout.$JOB_ID | file name of standard output |
-e stderr.$JOB_ID | file name of standard error |
Additional comments:
This job script is a bash (shell) script (indicated by the line #!/bin/bash), so the body of the script has to be written in bash syntax.
Without -cwd, the output will be written to the user’s top-level $HOME directory. Use -cwd to run in the current directory and to keep your top-level $HOME clean.
A batch job does not have a “screen” attached to it; all “screen output” of the program goes to the file named by the -o option. Similarly, standard error goes to the file named by the -e option.
The order of these job parameters does not matter.
The standard error and output files may be combined as follows (-e is omitted):
#$ -j y
#$ -o stdout.$JOB_ID
The $JOB_ID variable makes the stdout/stderr file names unique for different jobs.
In the job parameter block (prefixed by #$), these environment variables are supported: $JOB_ID, $TASK_ID (see “job array”), $JOB_NAME, $HOSTNAME, $USER and $HOME.
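The comments above can be combined into one small job script. This is a minimal sketch rather than a recommended template: the job name `demo_job`, the resource values, the log file naming and the `describe_job` function are all illustrative choices, and the fallback values exist only so the script also runs outside the scheduler, where `$JOB_ID` and `$JOB_NAME` are not set.

```shell
#!/bin/bash
#$ -l h_rt=00:05:00,h_data=1G
#$ -N demo_job
#$ -cwd
#$ -j y
#$ -o $JOB_NAME.$JOB_ID.log

# Inside a job the scheduler sets JOB_ID and JOB_NAME.
# Outside the scheduler they are unset, so fall back to placeholders.
job_id="${JOB_ID:-no_job_id}"
job_name="${JOB_NAME:-no_job_name}"

describe_job() {
    echo "job ${job_name} (${job_id}) running in $(pwd)"
}

describe_job                      # written to standard output
echo "this is a warning" >&2      # standard error; -j y merges it into the -o file
```

Because of -j y, both messages should end up in the single file named by -o; without it, the second message would need a separate -e file.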
By default, the job will allocate one CPU core to run the computation. That’s why we also force Matlab to run in the single-thread mode using the option -singleCompThread. Running computations exceeding the requested computing resources might cause the job to be terminated without notification.
A batch job does not have a “screen” attached to it. That’s why we turn off Matlab’s GUI (with -nodisplay -nojvm -nosplash) to reduce memory consumption. Otherwise, the memory request (h_data) would have to be unnecessarily large.
Common mistakes
Adding spaces in between items, e.g.
#$ -l h_rt=1:00:00, h_data=4G # <-- this is wrong!!
In this case, due to the syntax error, the specified h_data value is not captured by the job scheduler. Instead, the default 1G is used, which may (or may not) be too small for the job.
Missing the unit for memory size, e.g.
#$ -l h_rt=1:00:00,h_data=4
In this case, the job will allocate just 4 bytes of memory to run. Most likely it will fail.
Failing to initialize Modules, or failing to module load, resulting in “command not found” error messages. When a job starts, it runs in a non-login shell in which Modules is not initialized. Consequently, it is necessary to initialize Modules in the job script (the source /u/local/Modules/default/init/modules.sh line, line 9 in the example above).
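The three mistakes above are mechanical enough to check before submitting. The following is a sketch of a hypothetical helper, `check_job_script`, which is not part of the scheduler or of Modules. It uses simple pattern matching, so it may miss unusual layouts, and it only checks that a Modules init script is sourced somewhere in the file, not that it comes before `module load`.

```shell
#!/bin/bash
# Hypothetical pre-submission check (not a scheduler feature).
# Returns 0 if none of the three mistakes is found, 1 otherwise.
check_job_script() {
    local script="$1" status=0

    # Mistake 1: whitespace after a comma inside a "#$ -l" resource list.
    if grep -Eq '^#\$[[:space:]]+-l[[:space:]].*,[[:space:]]' "$script"; then
        echo "space after a comma in a '#\$ -l' line" >&2
        status=1
    fi

    # Mistake 2: h_data given as a bare number, with no unit such as G.
    if grep -Eq '^#\$.*h_data=[0-9]+([,[:space:]]|$)' "$script"; then
        echo "h_data has no memory unit" >&2
        status=1
    fi

    # Mistake 3: "module load" is used but no Modules init script is sourced.
    if grep -Eq '^[[:space:]]*module[[:space:]]+load' "$script" &&
       ! grep -Eq '^[[:space:]]*(source|\.)[[:space:]].*modules\.sh' "$script"; then
        echo "'module load' without initializing Modules" >&2
        status=1
    fi

    return "$status"
}

# Example use: check_job_script job1.sh
```

A non-zero return status means at least one of the patterns matched, and the messages on standard error say which.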