Disclaimer: This post is quite long because I tried to provide all relevant configuration information.

Status and problem:

I administer a GPU cluster and want to use Slurm for job management. Unfortunately, I cannot request GPUs through Slurm's generic resource (GRES) plugin for GPUs.

Note: test.sh is a small script that prints the environment variable CUDA_VISIBLE_DEVICES.
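
test.sh itself is not shown in this post. As a rough stand-in, and purely as an assumption based on the `CUDA_VISIBLE_DEVICES=...` line the job prints, a Python equivalent could look like the sketch below. The function name `describe_cuda_visible_devices` is made up for illustration.

```python
import os


def describe_cuda_visible_devices(env=None):
    """Return the CUDA_VISIBLE_DEVICES setting as a NAME=value string."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    # An unset variable is reported as an empty value (an assumption of this sketch).
    return "CUDA_VISIBLE_DEVICES=" + env.get("CUDA_VISIBLE_DEVICES", "")


if __name__ == "__main__":
    print(describe_cuda_visible_devices())
```

Run under srun, this would show which devices, if any, Slurm exposed to the job step.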

Running a job with `--gres=gpu:1` does not complete

Running `srun -n1 --gres=gpu:1 test.sh` results in the following error:

srun: error: Unable to allocate resources: Requested node configuration is not available

Log:

gres: gpu state for job 83
    gres_cnt:4 node_cnt:0 type:(null)
    _pick_best_nodes: job 83 never runnable
    _slurm_rpc_allocate_resources: Requested node configuration is not available

Running a job with `--gres=gram:500` does complete

However, if I call `srun -n1 --gres=gram:500 test.sh`, the job runs and prints:

CUDA_VISIBLE_DEVICES=NoDevFiles

Log:

sched: _slurm_rpc_allocate_resources JobId=76 NodeList=smurf01 usec=193
debug:  Configuration for job 76 complete
debug:  laying out the 1 tasks on 1 hosts smurf01 dist 1
job_complete: JobID=76 State=0x1 NodeCnt=1 WIFEXITED 1 WEXITSTATUS 0
job_complete: JobID=76 State=0x8003 NodeCnt=1 done

So Slurm seems to be correctly configured to run jobs via `srun` with generic resources requested through `--gres`, but for some reason it does not recognize the GPUs.

My first idea was to use a different name for the GPU generic resource, since the other generic resources seem to work, but I would like to stick with the GPU plugin.

Configuration

The cluster has more than two slave hosts, but for the sake of clarity I will stick to two slightly differently configured slave hosts and the controller host: papa (controller), smurf01 and smurf02.

slurm.conf

The parts of the Slurm configuration that are relevant to generic resources:

...
TaskPlugin=task/cgroup
...
GresTypes=gpu,ram,gram,scratch
...
NodeName=smurf01 NodeAddr=192.168.1.101 Feature="intel,fermi" Boards=1 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=2 Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
NodeName=smurf02 NodeAddr=192.168.1.102 Feature="intel,fermi" Boards=1 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=1 Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
...

Note: ram is in GB, gram is in MB, and scratch is again in GB.
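
To make the structure of the `Gres=` value easier to read, here is a minimal Python sketch that splits it into its parts. It assumes every entry has the shape `name[:type][:no_consume]:count` with a plain integer count (no K/M/G suffixes). `parse_gres` is a hypothetical helper, not part of Slurm.

```python
def parse_gres(gres):
    """Split a Gres= value into per-resource dicts (name, type, no_consume, count)."""
    entries = []
    for item in gres.split(","):
        fields = item.split(":")
        # First field is the resource name, last field is the count.
        name, middle, count = fields[0], fields[1:-1], fields[-1]
        entries.append({
            "name": name,
            # Any middle field other than the no_consume flag is treated as the type.
            "type": next((f for f in middle if f != "no_consume"), None),
            "no_consume": "no_consume" in middle,
            # Assumption: plain integer counts only.
            "count": int(count),
        })
    return entries


for entry in parse_gres("gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300"):
    print(entry)
```

For the node lines above this gives four resources. Only gpu carries a type (tesla) and only gram is flagged no_consume.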

Output of `scontrol show node`

NodeName=smurf01 Arch=x86_64 CoresPerSocket=6
   CPUAlloc=0 CPUErr=0 CPUTot=24 CPULoad=0.01 Features=intel,fermi
   Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
   NodeAddr=192.168.1.101 NodeHostName=smurf01 Version=14.11
   OS=Linux RealMemory=1 AllocMem=0 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1
   BootTime=2015-04-23T13:58:15 SlurmdStartTime=2015-04-24T10:30:46
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=smurf02 Arch=x86_64 CoresPerSocket=6
   CPUAlloc=0 CPUErr=0 CPUTot=12 CPULoad=0.01 Features=intel,fermi
   Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
   NodeAddr=192.168.1.102 NodeHostName=smurf02 Version=14.11
   OS=Linux RealMemory=1 AllocMem=0 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
   BootTime=2015-04-23T13:57:56 SlurmdStartTime=2015-04-24T10:24:12
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
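
To compare the nodes side by side, the output can be broken into key/value pairs. The following is only a sketch: it assumes that no value contains whitespace, which holds for the lines shown here but not for `scontrol` output in general. `parse_scontrol_nodes` is a made-up helper name.

```python
def parse_scontrol_nodes(text):
    """Parse `scontrol show node` output into {node_name: {key: value}}."""
    nodes, current = {}, None
    for token in text.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        if key == "NodeName":
            # A NodeName token starts the record for a new node.
            current = nodes.setdefault(value, {})
        if current is not None:
            current[key] = value
    return nodes


sample = """
NodeName=smurf01 Arch=x86_64 CoresPerSocket=6
   CPUAlloc=0 CPUErr=0 CPUTot=24 CPULoad=0.01 Features=intel,fermi
   Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
NodeName=smurf02 Arch=x86_64 CoresPerSocket=6
   CPUAlloc=0 CPUErr=0 CPUTot=12 CPULoad=0.01 Features=intel,fermi
   Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
"""

for name, fields in parse_scontrol_nodes(sample).items():
    print(name, fields["CPUTot"], fields["Gres"])
```

With the excerpt above, both nodes report the same Gres string and differ only in CPUTot (24 versus 12).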

smurf01 configuration

GPUs

 > ls /dev | grep nvidia
nvidia0
... 
nvidia7
 > nvidia-smi | grep Tesla
|   0  Tesla M2090         On   | 0000:08:00.0     Off |                    0 |
... 
|   7  Tesla M2090         On   | 0000:1B:00.0     Off |                    0 |
...

gres.conf

Name=gpu Type=tesla File=/dev/nvidia0 CPUs=0
Name=gpu Type=tesla File=/dev/nvidia1 CPUs=1
Name=gpu Type=tesla File=/dev/nvidia2 CPUs=2
Name=gpu Type=tesla File=/dev/nvidia3 CPUs=3
Name=gpu Type=tesla File=/dev/nvidia4 CPUs=4
Name=gpu Type=tesla File=/dev/nvidia5 CPUs=5
Name=gpu Type=tesla File=/dev/nvidia6 CPUs=6
Name=gpu Type=tesla File=/dev/nvidia7 CPUs=7
Name=ram Count=48
Name=gram Count=6000
Name=scratch Count=1300

smurf02 configuration

GPUs

Same configuration/output as smurf01.

gres.conf on smurf02

Name=gpu Count=8 Type=tesla File=/dev/nvidia[0-7]
Name=ram Count=48
Name=gram Count=6000
Name=scratch Count=1300
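
The two gres.conf files name the device files differently: smurf01 lists one file per line, while smurf02 uses a bracket range. As a sanity check that both refer to the same eight files, here is a small Python sketch. It only handles a single `[first-last]` range, which is an assumption of this example, and `expand_device_files` is a hypothetical helper.

```python
import re


def expand_device_files(spec):
    """Expand a File= value such as /dev/nvidia[0-7] into individual paths."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", spec)
    if match is None:
        return [spec]  # no bracket range: a single device file
    prefix, first, last, suffix = match.groups()
    # The range is inclusive on both ends.
    return [f"{prefix}{i}{suffix}" for i in range(int(first), int(last) + 1)]


paths = expand_device_files("/dev/nvidia[0-7]")
print(len(paths), paths[0], paths[-1])
```

The expanded list has eight entries, /dev/nvidia0 through /dev/nvidia7, matching the eight separate lines in the smurf01 file.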

Note: The daemons have been restarted, and the machines have been rebooted as well. The slurm user and the job-submitting user have the same IDs/groups on the slave and controller nodes, and munge authentication works properly.

Log outputs

I added `DebugFlags=Gres` to slurm.conf, and the GPUs seem to be recognized by the plugin:

Controller log

gres/gpu: state for smurf01
   gres_cnt found:8 configured:8 avail:8 alloc:0
   gres_bit_alloc:
   gres_used:(null)
   topo_cpus_bitmap[0]:0
   topo_gres_bitmap[0]:0
   topo_gres_cnt_alloc[0]:0
   topo_gres_cnt_avail[0]:1
   type[0]:tesla
   topo_cpus_bitmap[1]:1
   topo_gres_bitmap[1]:1
   topo_gres_cnt_alloc[1]:0
   topo_gres_cnt_avail[1]:1
   type[1]:tesla
   topo_cpus_bitmap[2]:2
   topo_gres_bitmap[2]:2
   topo_gres_cnt_alloc[2]:0
   topo_gres_cnt_avail[2]:1
   type[2]:tesla
   topo_cpus_bitmap[3]:3
   topo_gres_bitmap[3]:3
   topo_gres_cnt_alloc[3]:0
   topo_gres_cnt_avail[3]:1
   type[3]:tesla
   topo_cpus_bitmap[4]:4
   topo_gres_bitmap[4]:4
   topo_gres_cnt_alloc[4]:0
   topo_gres_cnt_avail[4]:1
   type[4]:tesla
   topo_cpus_bitmap[5]:5
   topo_gres_bitmap[5]:5
   topo_gres_cnt_alloc[5]:0
   topo_gres_cnt_avail[5]:1
   type[5]:tesla
   topo_cpus_bitmap[6]:6
   topo_gres_bitmap[6]:6
   topo_gres_cnt_alloc[6]:0
   topo_gres_cnt_avail[6]:1
   type[6]:tesla
   topo_cpus_bitmap[7]:7
   topo_gres_bitmap[7]:7
   topo_gres_cnt_alloc[7]:0
   topo_gres_cnt_avail[7]:1
   type[7]:tesla
   type_cnt_alloc[0]:0
   type_cnt_avail[0]:8
   type[0]:tesla
...
gres/gpu: state for smurf02
   gres_cnt found:TBD configured:8 avail:8 alloc:0
   gres_bit_alloc:
   gres_used:(null)
   type_cnt_alloc[0]:0
   type_cnt_avail[0]:8
   type[0]:tesla
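
One visible difference between the two controller entries is the `found` value: 8 for smurf01, TBD for smurf02. As a rough sketch, the `gres_cnt` line can be checked mechanically. What TBD means exactly is not stated in this post; the code only treats it as "not equal to the configured count". The helper names are made up for illustration.

```python
import re


def gres_counts(line):
    """Extract the key:value pairs from a 'gres_cnt found:...' log line."""
    # Tolerates optional spaces around the colon.
    return dict(re.findall(r"(\w+)\s*:\s*(\w+)", line))


def found_matches_configured(line):
    """True when the found count equals the configured count."""
    counts = gres_counts(line)
    return counts.get("found") == counts.get("configured")


print(found_matches_configured("gres_cnt found:8 configured:8 avail:8 alloc:0"))
print(found_matches_configured("gres_cnt found:TBD configured:8 avail:8 alloc:0"))
```

The first line (smurf01) passes the check and the second (smurf02) does not.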

Slave log

Gres Name=gpu Type=tesla Count=8 ID=7696487 File=/dev/nvidia[0-7]
...
gpu 0 is device number 0
gpu 1 is device number 1
gpu 2 is device number 2
gpu 3 is device number 3
gpu 4 is device number 4
gpu 5 is device number 5
gpu 6 is device number 6
gpu 7 is device number 7

cluster hpc job-scheduler Pixchem

What happens when you request `--gres=gpu:tesla:1`?

NNWizard

@NMWizard The same as without a specified type.

Pixchem

Why does requesting GPUs as a generic resource fail on a cluster running SLURM with the built-in plugin?


Answers:
