SMP (symmetric multiprocessing) jobs
=============================================

An SMP job requests multiple CPU cores to run a multi-threaded program
**within a single compute node**. You can use this mechanism to run either
multi-threaded programs or MPI programs. In the case of MPI programs, all
MPI processes will run within a single compute node.

The key job scheduler parameter is::

   #$ -pe shared N

where ``N`` is the number of CPU cores to use for the job. The appropriate
value depends on how the program uses multiple cores; a larger ``N`` does
not necessarily make the program run faster.

Key points:

- A sequential program does not automatically become multi-threaded just
  because it is submitted as a shared-memory job (see the multi-threaded
  job script sketch below).
- Use ``-pe shared N``.
- The ``h_data`` value is per-core.
- The total memory requested is ``h_data`` multiplied by the ``N`` given to
  ``-pe shared``. The job will not start if this product exceeds the memory
  available on any compute node.
- If ``N`` is too large and the scheduler cannot find a compute node with
  that many available cores, the job will not start.

Consider this example:

.. code-block:: bash

   #!/bin/bash
   #$ -cwd
   #$ -l h_rt=8:00:00,h_data=4g
   #$ -pe shared 16
   #$ -o Stdout.$JOB_ID

This job requires ``4g * 16 = 64g`` of memory on a compute node to start.

- It will not start on compute nodes that have only 32GB of memory.
- Even if it could start on a 64GB (or larger) compute node, there will be
  some wait time because the scheduler needs to "drain" 64GB of memory on a
  single node before this job can run.
- If the program is sequential (not multi-threaded), 15 of the 16 CPU cores
  will sit idle (wasted!).
- With ``-pe shared``, all requested cores will be on one compute node.
  Nothing will run across compute nodes.
- If your program is sequential but needs a large amount of memory, do not
  use ``-pe shared``! Just use, e.g., ``-l h_data=64G`` (without
  ``-pe shared N``) to run large-memory, sequential programs (see the
  sequential job script sketch below).

.. toctree::
   :maxdepth: 1

   quiz.md
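To actually use the cores requested with ``-pe shared``, the program itself
must be multi-threaded. The following is a minimal sketch only, assuming a
program parallelized with OpenMP: ``my_openmp_program`` is a hypothetical
executable name, and the runtime, memory, and core counts are placeholders
to adjust for your own program.

.. code-block:: bash

   #!/bin/bash
   #$ -cwd
   #$ -l h_rt=8:00:00,h_data=4g
   #$ -pe shared 8
   #$ -o Stdout.$JOB_ID

   # $NSLOTS is set by the scheduler to the number of cores granted
   # via -pe shared. This line assumes the program reads the standard
   # OpenMP environment variable to decide how many threads to start.
   export OMP_NUM_THREADS=$NSLOTS

   # Hypothetical multi-threaded executable.
   ./my_openmp_program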
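For a sequential program that only needs a large amount of memory, a
minimal job script might look like the sketch below (no ``-pe shared``
line). ``my_serial_program`` is a hypothetical executable name, and the
8-hour runtime and 64G memory request are placeholders.

.. code-block:: bash

   #!/bin/bash
   #$ -cwd
   # Request all memory via h_data; without -pe shared the job uses a
   # single CPU core.
   #$ -l h_rt=8:00:00,h_data=64G
   #$ -o Stdout.$JOB_ID

   # Hypothetical sequential, large-memory executable.
   ./my_serial_program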