To access our Bull XeonPhi (i.e. Bull-MIC) partitions, you first need to log in to one of the Taito cluster's login nodes (taito.csc.fi).
From there you log in to the Bull-MIC system's interactive node (m1). This awkward arrangement may be temporary: ideally we would be able to
log in directly to something like taito-mic.csc.fi, but that node does not exist and such flexibility was difficult to obtain at the time of
writing this documentation. For now, the following two-step login procedure is needed to get down to business:


% ssh taito.csc.fi
% ssh m1


The interactive node is used to develop your application programs for the MICs. It consists of a host CPU (an Ivy Bridge) and two MIC cards (XeonPhi 7120X).
The auto-offload execution (AOE) mode developed at CSC by Olli-Pekka Lehto ensures that configure + make type build scripts also work seamlessly.

The normal MIC workflow assumes that you have an application (a program, an executable) written in C/C++ or Fortran. The application can be
built for the host CPUs only (host mode), for the MIC cards only (so-called native mode), or using offload mode, as sketched below.
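
As a rough sketch of the difference between host and native modes (the file name hello.c and the compile commands below are illustrative
assumptions, not an existing example on the system), the same trivial C program can be built either for the host or, with the Intel
compiler's -mmic flag, as a native MIC binary:


/* hello.c - the same source built in host mode or native mode:
 *
 *   host mode   (runs on the Ivy Bridge host):  icc hello.c -o hello.host
 *   native mode (runs on the MIC card itself):  icc -mmic hello.c -o hello.mic
 *
 * A native binary has to be started on the card, e.g. via
 * "ssh mic0 ./hello.mic" when the working directory is visible there.
 */
#include <stdio.h>

int main(void)
{
#ifdef __MIC__
    /* __MIC__ is predefined by the Intel compiler when targeting the card */
    printf("Hello from a Xeon Phi (MIC) card\n");
#else
    printf("Hello from the host CPU\n");
#endif
    return 0;
}
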

Combined with MPI message passing and OpenMP multithreading you could in principle create two executables, one for running on the host CPUs and
the other for running on the MIC cards, and yet through MPI these can be viewed as a single application running in so-called symmetric mode.

However, as we currently have problems getting the MPI layers to use the fast InfiniBand network both between the MIC cards themselves and for
communication with the host CPUs, offload mode is at the moment the only recommended way of using the powerful MIC cards.

In offload mode we create an application program that is compiled for the host CPUs, but we insert so-called offload directives into the code
(either Intel's proprietary directives, the OpenMP 4.0 standard's target directives, or calls to the offload-capable MKL library). During
execution these directives divert work onto the MIC card(s), in a fashion somewhat similar to how GPUs are used e.g. via CUDA or OpenACC.
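
As a minimal sketch of what this looks like in practice (the file name, array size and compile command below are illustrative assumptions),
a loop can be diverted to the first MIC card with Intel's proprietary offload directive; an OpenMP 4.0 target directive would be placed in
much the same spot:


/* offload_sum.c - minimal offload-mode sketch using Intel's proprietary
 * offload directive; compiled on the host, e.g. icc -openmp offload_sum.c
 */
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];
    double sum = 0.0;
    int i;

    for (i = 0; i < N; i++)
        a[i] = (double) i;

    /* The block below is executed on the first MIC card; the array a is
       copied in and the scalar sum is copied back when the block ends. */
    #pragma offload target(mic:0) in(a) inout(sum)
    {
        #pragma omp parallel for reduction(+:sum)
        for (i = 0; i < N; i++)
            sum += a[i];
    }

    printf("sum = %.0f\n", sum);
    return 0;
}


The offloaded section is compiled for the MIC automatically by the Intel compiler, so the resulting executable is still launched on the host
like any other host program.
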

Let's create a simple SLURM batch job file that queries the hostname and operating system information of the host CPU and the MIC cards.
The batch queue (or partition) in question needs to be "mic" (i.e. -p mic), with the option --gres=mic:2 supplied.
Here is the script (micinfo.slurm) that allocates two (2) MICs on a single host CPU node:

% cat micinfo.slurm


#!/bin/bash
#SBATCH -N 1
#SBATCH -p mic
#SBATCH -t 00:05:00
#SBATCH -J micinfo
#SBATCH -o micinfo.out.%j
#SBATCH -e micinfo.out.%j
#SBATCH --gres=mic:2      # request both MIC cards of the node
#SBATCH --exclusive       # reserve the whole node for this job

module purge
module load intel/14.0.1 mkl/11.1.1 intelmpi/4.1.3
module list

set -xv

cd ${SLURM_SUBMIT_DIR:-.}
pwd

uname -a
ssh mic0 uname -a
ssh mic1 uname -a

hostname
ssh mic0 hostname
ssh mic1 hostname


To submit this job, the following command can be used:

% sbatch micinfo.slurm

The output file (micinfo.out.<slurm_job_id_number>) looks as follows:


Currently Loaded Modules:
  1) intel/14.0.1    2) mkl/11.1.1    3) intelmpi/4.1.3

cd ${SLURM_SUBMIT_DIR:-.}
+ cd /wrk/sbs/GuideBull/MickeyMouse
pwd
+ pwd
/wrk/sbs/GuideBull/MickeyMouse

uname -a
+ uname -a
Linux m2 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
ssh mic0 uname -a
+ ssh mic0 uname -a
Linux m2-mic0 2.6.38.8+mpss3.2 #1 SMP Fri Mar 14 11:46:45 PDT 2014 k1om GNU/Linux
ssh mic1 uname -a
+ ssh mic1 uname -a
Linux m2-mic1 2.6.38.8+mpss3.2 #1 SMP Fri Mar 14 11:46:45 PDT 2014 k1om GNU/Linux

hostname
+ hostname
m2
ssh mic0 hostname
+ ssh mic0 hostname
m2-mic0
ssh mic1 hostname
+ ssh mic1 hostname
m2-mic1