
Queuing System

The Cray XC40 is managed by the batch job scheduling system PBS Professional.

For more details, please refer to the Parallel servers user manuals.

Queue Classes (Revised Plan 2018.8)

Queue     Nodes     Cores    Walltime   Simultaneous Jobs   Simultaneous Jobs per User   Priority
TINY      1-2       ~72      30 min     unlimited           unlimited                    1
SINGLE    1-4       ~144     4 days     54                  10                           5
SMALL     5-16      ~576     72 hours   26                  6                            4
MEDIUM    17-32     ~1152    72 hours   18                  4                            3
LARGE     33-64     ~2304    60 hours   6                   2                            2
XLARGE    129-256   ~9216    24 hours   2                   1                            2
XXLARGE   257-384   ~13824   24 hours   1                   1                            2
ALL       Adjustable
LONG-S    1-8       ~288     1 week     12                  3                            3
LONG-M    9-64      ~1152    96 hours   3                   1                            4
LONG-L    Adjustable (application dependent)    -           -                            -
  • Walltime = the maximum time a job submitted to the queue can stay in the execution state
  • Simultaneous Jobs = the maximum number of jobs from the same queue that can be executed simultaneously
  • Simultaneous Jobs per User = the maximum number of jobs in the same queue from the same user
  • Parameters marked in red were modified in the plan version of 2018-8-10

■ 1 node has 2 CPUs containing 36 cores in total.
■ Job allocation is per node. Even a job with a single process and a single thread occupies one whole node (36 cores).
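
The node ranges in the queue table can be turned into a small lookup. The following Python sketch is only an illustration: the ranges and the figure of 36 cores per node are copied from the table and notes above, while the names QUEUE_NODE_RANGES, candidate_queues and cores_occupied are hypothetical and not part of PBS Professional. The adjustable queues (ALL, LONG-L) are left out.

```python
# Node ranges copied from the queue table; ALL and LONG-L are adjustable and omitted.
QUEUE_NODE_RANGES = {
    "TINY": (1, 2),
    "SINGLE": (1, 4),
    "SMALL": (5, 16),
    "MEDIUM": (17, 32),
    "LARGE": (33, 64),
    "XLARGE": (129, 256),
    "XXLARGE": (257, 384),
    "LONG-S": (1, 8),
    "LONG-M": (9, 64),
}

CORES_PER_NODE = 36  # one node = 2 CPUs = 36 cores


def candidate_queues(nodes):
    """Return the queues whose node range contains `nodes`."""
    return [name for name, (low, high) in QUEUE_NODE_RANGES.items()
            if low <= nodes <= high]


def cores_occupied(nodes):
    """Allocation is per node, so every requested node counts as 36 cores."""
    return nodes * CORES_PER_NODE


if __name__ == "__main__":
    for n in (1, 6, 100):
        print(n, candidate_queues(n), cores_occupied(n))
```

Walltime and the job limits still have to be checked against the table by hand; the sketch only filters by node count.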

**For more details about job classes, please check below.

Staging

Summary

File staging automatically copies files and directories to the staging directory at the start of a job and copies them back when the job is finished.

Processing flow of File Staging

 0. Preparing
  The first time you use staging on XC40, you must create your own staging directory under the /work directory.

      % mkdir /work/<user name>

 1. Stage-IN
  The files and directories required to run the program are automatically copied from the job directory in the home directory (/home/username/xxxxx) to the staging directory (/work/<username>).

 2. Run the computing program

 3. Stage-OUT
  The files and directories are copied from the staging directory (/work/<username>) back to the job directory under the home directory (/home/username/xxxxx).

   * Note that files and directories copied to the staging directory (/work/<username>) are deleted after the job is completed.
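
The flow above can be imitated on a local file system to see what ends up where. This Python sketch is a simulation under stated assumptions, not how PBS Professional implements staging: run_with_staging, demo and the directory name Job_dir are hypothetical, and temporary directories stand in for /home and /work.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(job_dir, work_root, compute):
    """Simulate stage-in, run, and stage-out on the local file system."""
    job_dir = Path(job_dir)
    staging = Path(work_root) / "Job_dir"
    # 1. Stage-in: copy the job directory into the staging directory.
    shutil.copytree(job_dir, staging)
    try:
        # 2. Run the computation inside the staging directory.
        compute(staging)
        # 3. Stage-out: copy the staged files back to the job directory.
        shutil.copytree(staging, job_dir, dirs_exist_ok=True)
    finally:
        # The staged copies are removed once the job is complete.
        shutil.rmtree(staging)
    return job_dir


def demo():
    """Run a toy job that sums the numbers found in an input file."""
    with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as work:
        job_dir = Path(home) / "xxxxx"
        job_dir.mkdir()
        (job_dir / "input").write_text("1 2 3\n")

        def compute(cwd):
            numbers = (cwd / "input").read_text().split()
            (cwd / "output.txt").write_text(str(sum(int(x) for x in numbers)))

        run_with_staging(job_dir, work, compute)
        result = (job_dir / "output.txt").read_text()
        leftover = list(Path(work).iterdir())
        return result, leftover
```

After demo() returns, the result file is in the job directory and the staging area is empty, which mirrors the note that staged files are deleted when the job completes.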

JOB Script for Staging Job

   * To copy the files and directories under the job directory in /home ("/home/Username/xxxxx") to the staging directory "/work/Username":

     #PBS -W stagein=/work/Username/Job_dir@xc40-gw0:/home/Username/In_dir/*

   * To copy the files and directories back when the job is finished:

    #PBS -W stageout=/work/Username/Job_dir/*@xc40-gw0:/home/Username/Out_dir

Sample of Job Script for Staging

Example:

#PBS -N job0
#PBS -q SINGLE
#PBS -l select=1
#PBS -j oe
#PBS -W stagein=/work/Username/Job_dir@xc40-gw0:/home/Username/In_dir/*
#PBS -W stageout=/work/Username/Job_dir/*@xc40-gw0:/home/Username/Out_dir


cd /work/Username/Job_dir                                 # Move to the staging directory.
aprun -n 1 ./a.out <input >output.txt

 

*Key points of the file staging options

•Wildcards can be used to specify the copy source.
•The destination file can be given a different name from the source file.
   #PBS -W stagein=/work/Username/Job_dir/input@xc40-gw0:/home/Username/In_dir/file1

•Multiple source files can be specified, separated by commas.
   #PBS -W stagein="/work/Username/Job_dir@xc40-gw0:/home/Username/In_dir/file1,/work/Username/Job_dir@xc40-gw0:/home/Username/In_dir/file2"

•If you specify a directory as the copy source, the whole directory is copied.

Note that files and directories copied to the staging directory (/work/<username>) are deleted after the job is completed.
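
Long staging lines are easy to mistype, so it can help to generate them. The Python sketch below assembles a directive in the form shown above (staging path, then "@", host, ":", and the path under /home). The function name staging_directive is hypothetical, the default host xc40-gw0 is taken from the examples, and quoting a multi-file list follows the multiple-source example above; treat it as an illustration rather than a tool provided on the system.

```python
def staging_directive(kind, pairs, host="xc40-gw0"):
    """Build a '#PBS -W stagein=...' or '#PBS -W stageout=...' line.

    pairs is a list of (staging_path, home_path) tuples.
    """
    if kind not in ("stagein", "stageout"):
        raise ValueError("kind must be 'stagein' or 'stageout'")
    if not pairs:
        raise ValueError("at least one (staging_path, home_path) pair is required")
    items = [f"{staging}@{host}:{home}" for staging, home in pairs]
    value = ",".join(items)
    if len(items) > 1:
        # Quote the comma-separated list, as in the multiple-source example.
        value = f'"{value}"'
    return f"#PBS -W {kind}={value}"


if __name__ == "__main__":
    print(staging_directive(
        "stagein",
        [("/work/Username/Job_dir", "/home/Username/In_dir/*")],
    ))
```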

Case1: the executable file and the data file (directory) are in the same directory

   * Directory Structure
   /home/username/program/
                                  ---   /program.exe
                                  ---   /data1
   ** execution command
      program.exe -in ./data1/data2 > result.dat

  ------------------------------------------------ For staging, arrange the above configuration as follows

   * Directory structure of the staging directory  <-- As it is for temporary use only, a short path is preferable
   /work/username/
                     ---  /program.exe
                     ---  /data1

  * Files to copy back after the calculation
   /work/username/
                     ---  /result.dat
                     ---  /data1/data2

    ** execution command on staging directory
      program.exe -in ./data1/data2 > result.dat

Sample 1: Copy all files under /home/Username/program (if there are unnecessary files in the /program directory, the I/O will take longer)

#PBS -N mpi
#PBS -l select=1
#PBS -q TINY
#PBS -j oe
#PBS -W stagein=/work/Username/@xc40-gw0:/home/Username/program/*
#PBS -W stageout=/work/Username/*@xc40-gw0:/home/Username/program/

cd /work/Username
setenv OMP_NUM_THREADS 1

aprun -n 36 -N 36 -j 1 ./program.exe -in ./data1/data2 > ./result.dat

Sample 2: Copy the /data1 directory and program.exe to the staging directory, then copy back data2 and the result.dat file

#PBS -N mpi
#PBS -l select=1
#PBS -q TINY
#PBS -j oe
#PBS -W stagein=/work/Username/@xc40-gw0:/home/Username/program/data1,/work/Username/@xc40-gw0:/home/Username/program/program.exe
#PBS -W stageout=/work/Username/result.dat@xc40-gw0:/home/Username/program/result.dat,/work/Username/data1/data2@xc40-gw0:/home/Username/program/data1/data2

cd /work/Username
setenv OMP_NUM_THREADS 1

aprun -n 36 -N 36 -j 1 ./program.exe -in ./data1/data2 > ./result.dat

Case2: the executable file and the data files are in different directories

   * Directory Structure
   /home/username/program/
                                  ---   /program.exe
                            /examples/data1
                                              ---  /data2
                            /examples/data3
                                               --- /data4
                            /examples/data5
                                              --- /data6
   ** execution command
      program/program.exe -in ./examples/data1/data2 > result.dat

  ------------------------------------------------ For staging, arrange the above configuration as follows

   * Directory structure of the staging directory  <-- It is for temporary use only, so place it on a short path if you can change the path
 
  /work/username/
                     ---  /program.exe
                     ---  /data1
    * Files to copy back after the calculation
   /work/username/
                     ---  /result.dat
                     ---  /data1/data2

   ** execution command on staging directory
       program.exe -in data1/data2 > result.dat

Sample: Copy the /data1 directory and program.exe to the staging directory, then copy back data2 and the result.dat file

#PBS -N mpi
#PBS -l select=1
#PBS -q TINY
#PBS -j oe
#PBS -W stagein=/work/Username/@xc40-gw0:/home/Username/examples/data1,/work/Username/@xc40-gw0:/home/Username/program/program.exe
#PBS -W stageout=/work/Username/result.dat@xc40-gw0:/home/Username/program/result.dat,/work/Username/data1/data2@xc40-gw0:/home/Username/examples/data1/data2

cd /work/Username
setenv OMP_NUM_THREADS 1

aprun -n 36 -N 36 -j 1 ./program.exe -in ./data1/data2 > ./result.dat

Job status checking (XC40 commands)

On XC40, the standard PBS qstat command may not display the number of nodes in use correctly.
Use the following command instead to check the currently running jobs.

nqstat

Job Script

Sample Script

Sample scripts are in the following directory.

/work/Samples

Feel free to copy these scripts.

MPI Program: 6-node job (216 processes), 36 processes per node

  • The mpiprocs parameter must never be omitted.
  • place=scatter is required if you use multiple nodes.
  • Set "OMP_NUM_THREADS=1".

#!/bin/csh
#PBS -q <QUEUE>
#PBS -j oe
# 6 nodes, 36 MPI processes per node
#PBS -l select=6:ncpus=36:mpiprocs=36
#PBS -l place=scatter

cd $PBS_O_WORKDIR
setenv OMP_NUM_THREADS 1

# The -j 1 option disables hyper-threading.
aprun -n 216 -N 36 -j 1 ./a.out

On XC40, programs are run with the special command "aprun".

aprun [-n num] [-N num] [-d num] [-j num] ./a.out

-n: total number of MPI processes
-N: number of MPI processes per node
-d: number of threads per process
-j: hyper-threading (-j 1 disables it)



MPI/OpenMP Hybrid: 12-node job, 2 MPI processes per node, 18 threads per process (24 MPI processes, 432 threads)

  • ncpus = mpiprocs x OMP_NUM_THREADS  [max. 36 (cores)]
  • place=scatter is required if you use multiple nodes.
  • mpiprocs is the number of processes per node.

#!/bin/csh
# 12 nodes, 2 MPI processes x 18 threads per node
#PBS -l select=12:ncpus=36:mpiprocs=2
#PBS -l place=scatter
#PBS -q <QUEUE>

cd $PBS_O_WORKDIR
setenv OMP_NUM_THREADS 18

# The -j 1 option disables hyper-threading.
aprun -n 24 -S 1 -N 2 -d $OMP_NUM_THREADS -j 1 ./a.out
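
The select line and the aprun options have to agree with each other, and the arithmetic is easy to check in a few lines. The following Python sketch derives both from the node count, the MPI processes per node and the threads per process, using the rule ncpus = mpiprocs x OMP_NUM_THREADS with a maximum of 36. The name job_geometry is hypothetical, the sketch always writes -d explicitly, and it does not emit the -S option used in the hybrid sample, so take it as a consistency check rather than a script generator.

```python
CORES_PER_NODE = 36  # maximum value of ncpus on one node


def job_geometry(nodes, mpiprocs_per_node, threads_per_process=1):
    """Return (select resource string, aprun command line) for a job."""
    ncpus = mpiprocs_per_node * threads_per_process
    if ncpus > CORES_PER_NODE:
        raise ValueError(
            f"mpiprocs x threads = {ncpus} exceeds {CORES_PER_NODE} cores per node"
        )
    select = f"select={nodes}:ncpus={ncpus}:mpiprocs={mpiprocs_per_node}"
    total_processes = nodes * mpiprocs_per_node
    aprun = (
        f"aprun -n {total_processes} -N {mpiprocs_per_node} "
        f"-d {threads_per_process} -j 1 ./a.out"
    )
    return select, aprun


if __name__ == "__main__":
    print(job_geometry(6, 36))      # pure MPI job on 6 nodes
    print(job_geometry(12, 2, 18))  # hybrid job on 12 nodes
```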

 

SMP Program (OpenMP, automatic parallelization):

  • If ompthreads equals ncpus, ompthreads can be omitted.

#!/bin/csh
# 1 node, 36 threads
#PBS -l select=1:ncpus=36:ompthreads=36
#PBS -l place=scatter
#PBS -q <QUEUE>

cd $PBS_O_WORKDIR
setenv OMP_NUM_THREADS 36

# The -j 1 option disables hyper-threading.
aprun -n 1 -d $OMP_NUM_THREADS -j 1 ./a.out

 

 

Job submission

1. Submit the job script to the scheduler as below.

% qsub -q <Queue> <Script>

 

2. Job submission (with mail notification)
Notification of the start and end of the job can be requested as below.

% qsub -q <Queue> -M <mail address> -m be <Script>

   -M <mail address>    # address to notify
   -m be    # b -> begin (job start), e -> end (job end)

 

3. Job deletion
Only the owner of a job can delete it after submission.

% qdel <Job id>
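
When jobs are submitted from a script rather than typed by hand, the commands above can be assembled as argument lists. This Python sketch only builds the command lines and does not run them; the function names and the placeholder values job.csh, user@example.com and 12345 are assumptions for illustration. The options are the ones shown above: -q, -M and -m for qsub, and a job id for qdel.

```python
import shlex


def qsub_command(queue, script, mail=None, events="be"):
    """Return the argument list for 'qsub -q <Queue> [-M <mail> -m be] <Script>'."""
    args = ["qsub", "-q", queue]
    if mail is not None:
        # b -> mail at job start, e -> mail at job end
        args += ["-M", mail, "-m", events]
    args.append(script)
    return args


def qdel_command(job_id):
    """Return the argument list for 'qdel <Job id>'."""
    return ["qdel", str(job_id)]


if __name__ == "__main__":
    print(shlex.join(qsub_command("TINY", "job.csh", mail="user@example.com")))
    print(shlex.join(qdel_command("12345")))
```

On a login node, such a list could be passed to subprocess.run to actually submit or delete the job.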

 

 

Usage Documents (Japanese Only)
