Introduction

This tutorial shows how to run a coupled Gromacs/Dalton multiscale simulation on production e-Infrastructure using QosCosGrid services and tools. The presented work was a joint effort of the MAPPER and ScalaLife EU projects. First, please familiarize yourself with these introductory presentations:

The PL-Grid e-infrastructure

The PL-Grid e-infrastructure is a federation of resources provided by 5 leading HPC centers in Poland. In this tutorial we will use one of the six clusters that are available via QosCosGrid services.

UI (User Interface) machine

We will use the UI machine located at the Wroclaw Networking and Supercomputing Center. First, we must copy the keystore file that we generated in the PL-Grid portal:

$scp plgmamonski.p12 plgmamonski@ui.plgrid.wcss.wroc.pl:~/

and convert it:

$ssh plgmamonski@ui.plgrid.wcss.wroc.pl

[plgmamonski@ui ~]$qcg_set_env plgmamonski.p12
[plgmamonski@ui ~]$exit # we must log out and log in again
$ssh plgmamonski@ui.plgrid.wcss.wroc.pl

Now check that you can run the qcg-list command:

$qcg-list
https://qcg-broker.man.poznan.pl:8443/qcg/services/
/C=PL/O=GRID/O=PSNC/CN=qcg-broker/qcg-broker.man.poznan.pl
UserDN = /C=PL/O=GRID/O=PSNC/CN=Mariusz Mamonski
ProxyLifetime = 23 Days 18 Hours 57 Minutes 45 Seconds
NO TASKS
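
The ProxyLifetime line in the header above tells us how long the converted credentials stay valid. As a minimal sketch, assuming the header keeps exactly the format shown here (a lifetime shorter than one day may be printed differently), the remaining whole days can be extracted like this. The function name proxy_days is our own and not part of the QCG tools.

```shell
# Hypothetical helper (not a QCG command): read qcg-list output on stdin and
# print the number of whole days left on the proxy.
# Assumes a line of the form "ProxyLifetime = 23 Days 18 Hours ...".
# Returns a non-zero status if no such line is found.
proxy_days() {
    awk '$1 == "ProxyLifetime" && $4 == "Days" { print $3; found = 1 }
         END { if (!found) exit 1 }'
}

# Intended use on the UI machine:
#   qcg-list | proxy_days
```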

Downloading input files

For your convenience, all the files you need can be downloaded as a single package:

$wget http://fury.man.poznan.pl/~mmamonski/MAPPER/mapper-scalalife-tutorial.tar.gz

$tar -xzf mapper-scalalife-tutorial.tar.gz

$cd tutorial/

XMPP Notifications

For the Jabber notifications to work with GTalk, you need to add the qcg-notification@plgrid.pl address to your contacts. If you do not have a Jabber/GTalk account, skip this step; it is optional.

Submitting Gromacs Job

Submit the Gromacs job using the qcg-sub command:

$cd gromacs/

$vim gromacs.qcg # edit the "#QCG notify" directive

$qcg-sub gromacs.qcg 
...
gromacs.qcg 0 jobId = J1370130391659__1480
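
The job identifier printed by qcg-sub is needed again in later steps, so it can be handy to capture it instead of copying it by hand. The sketch below assumes that qcg-sub writes lines of the form "... jobId = J..." to standard output, as in the listing above; extract_job_ids is a name we made up for illustration.

```shell
# Hypothetical helper (not a QCG command): read qcg-sub output on stdin and
# print every job identifier found, one per line.
# Assumes lines such as "gromacs.qcg 0 jobId = J1370130391659__1480".
extract_job_ids() {
    sed -n 's/.*jobId = \(J[0-9_]*\).*/\1/p'
}

# Intended use on the UI machine:
#   gromacs_id=$(qcg-sub gromacs.qcg | extract_job_ids)
```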

You can list all your active jobs with the qcg-list command:

$qcg-list
JOB IDENTIFIER          NOTE                  SUBMISSION TIME  START TIME       FINISH TIME      STATUS            HOSTNAME  FLAGS  STATUS DESCRIPTION
J1370130391659__1480    GROMACS               02.06.13 01:46   02.06.13 01:46                    RUNNING           hydra     S UP

When the job is in the running state, you can view its standard output with the qcg-peek command:

$qcg-peek J1370130391659__1480
...


Making 1D domain decomposition 4 x 1 x 1
starting mdrun 'S  C  A  M  O  R  G'
-1 steps, infinite ps.
step 0
 

Or peek any other file in the job directory, for example the monitor log:

$qcg-peek J1370130391659__1480 -f qcg.monitor.log
...
DEBUG: frame=40
DEBUG: dtime=0.2
INFO : extracting frames 30-40
INFO : Converted trajactories into pdbs/8ps.pdb
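
Since the Dalton jobs consume the PDB files reported in this log, it can be useful to list only the files that have already been converted. This is a rough sketch that assumes the monitor log keeps the "INFO : Converted ... into <file>.pdb" wording shown above; converted_pdbs is a hypothetical name.

```shell
# Hypothetical helper (not a QCG command): read the monitor log on stdin and
# print the path of every PDB file reported as converted.
# Assumes lines such as "INFO : Converted ... into pdbs/8ps.pdb".
converted_pdbs() {
    sed -n 's/^INFO *: Converted .* into \(.*\.pdb\) *$/\1/p'
}

# Intended use on the UI machine:
#   qcg-peek J1370130391659__1480 -f qcg.monitor.log | converted_pdbs
```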
 

Submitting Dalton Jobs

Now we will submit two Dalton jobs that will consume the PDB files produced by the Gromacs simulation. But first we have to update gromacsJobId in the dalton.qcg script so that it points to the Gromacs job we have just submitted.

$cd ../dalton/

$vim dalton.qcg

#!!!!!!!!! EDIT !!!!!!!!!!
gromacsJobId=J1370130391659__1480
#also edit
#QCG watch-output
 
$qcg-sub -R 2 dalton.qcg
...

dalton.qcg 0 jobId = J1370132349776__1679
dalton.qcg 1 jobId = J1370132350342__8905
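
The manual edit of gromacsJobId in dalton.qcg can also be scripted. The following is a minimal sketch, assuming the script contains a single line that starts with "gromacsJobId=" as shown above; set_gromacs_job_id is a name we invented. It goes through a temporary file rather than relying on in-place options of sed, which differ between systems.

```shell
# Hypothetical helper (not a QCG command): set_gromacs_job_id FILE JOB_ID
# Rewrites the line beginning with "gromacsJobId=" in FILE so that it points
# to JOB_ID. All other lines are left unchanged.
set_gromacs_job_id() {
    file=$1
    job_id=$2
    tmp=$(mktemp) || return 1
    # Job identifiers contain only letters, digits and underscores,
    # so they are safe to place in the sed replacement text.
    sed "s/^gromacsJobId=.*/gromacsJobId=$job_id/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Intended use on the UI machine, before running qcg-sub:
#   set_gromacs_job_id dalton.qcg J1370130391659__1480
```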
 

Now we should observe 3 jobs in the qcg-list output:

$qcg-list
J1370130391659__1480    GROMACS               02.06.13 02:03   02.06.13 02:03                    RUNNING           hydra     S UP                         
J1370132349776__1679    DALTON                02.06.13 02:19   02.06.13 02:19                    RUNNING           hydra     S UP                         
J1370132350342__8905    DALTON                02.06.13 02:19   02.06.13 02:19                    RUNNING           hydra     S UP 
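
With several jobs in the list it can help to select them by the NOTE column, for example to peek at every Dalton job in turn. The sketch below assumes the column layout shown above and a NOTE that is a single word without spaces; jobs_with_note is a hypothetical name.

```shell
# Hypothetical helper (not a QCG command): jobs_with_note NOTE
# Reads qcg-list output on stdin and prints the identifiers of all jobs
# whose NOTE column (second field) equals NOTE.
jobs_with_note() {
    awk -v note="$1" '$1 ~ /^J[0-9]+__[0-9]+$/ && $2 == note { print $1 }'
}

# Intended use on the UI machine:
#   for id in $(qcg-list | jobs_with_note DALTON); do qcg-peek "$id"; done
```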
 

We can also try to connect to one of the jobs:

$qcg-connect J1370130391659__1480
Connecting to the task ...
Press Ctrl-C to cancel the request
Interactive session started. Use the 'exit' command to quit.
plgmamonski@wn2017:/mnt/qcg/plgmamonski/J1370131397946__1008_task_1370131398491_381$ ls 
_stdouterr  gromacs.monitor  machinefile.unique.Y8325  mdinf.edr  mdinf.tpr  pdbs       qcg.monitor.log  stdout
debug.log   gromacs.qcg      mdinf.cpt                 mdinf.log  mdinf.xtc  qcg.debug  stderr
plgmamonski@wn2017:/mnt/qcg/plgmamonski/J1370131397946__1008_task_1370131398491_381$ # we are now inside the job's working directory on the remote cluster, so we can check how it performs
plgmamonski@wn2017:/mnt/qcg/plgmamonski/J1370131397946__1008_task_1370131398491_381$ top
plgmamonski@wn2017:/mnt/qcg/plgmamonski/J1370131397946__1008_task_1370131398491_381$ # now we can go back to the ui
plgmamonski@wn2017:/mnt/qcg/plgmamonski/J1370131397946__1008_task_1370131398491_381$ exit
 

Viewing results

Wait until all jobs are finished:

$qcg-list
https://qcg-broker.man.poznan.pl:8443/qcg/services/
/C=PL/O=GRID/O=PSNC/CN=qcg-broker/qcg-broker.man.poznan.pl
UserDN = /C=PL/O=GRID/O=PSNC/CN=Mariusz Mamonski
ProxyLifetime = 23 Days 18 Hours 57 Minutes 45 Seconds
NO TASKS
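
Instead of re-running qcg-list by hand, the wait can be put into a small polling loop. This is only a sketch: it assumes that qcg-list prints the text "NO TASKS" once no active jobs remain, as in the output above. The function name and the QCG_LIST_CMD and POLL_SECONDS variables are our own illustrations, not QCG settings. Note that the loop does not terminate if the listing command keeps failing.

```shell
# Hypothetical helper (not a QCG command): block until the job listing
# reports "NO TASKS".
#   QCG_LIST_CMD  command used to list jobs (default: qcg-list)
#   POLL_SECONDS  delay between two checks   (default: 60)
wait_for_no_tasks() {
    list_cmd=${QCG_LIST_CMD:-qcg-list}
    interval=${POLL_SECONDS:-60}
    until "$list_cmd" | grep -q 'NO TASKS'; do
        sleep "$interval"
    done
}

# Intended use on the UI machine:
#   wait_for_no_tasks && echo "all jobs finished"
```

A one-minute default keeps the load on the broker low; the interval can be shortened for quick test runs.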

You can also list already terminated jobs:

$qcg-list -S 2h -s terminated

J1370249571020__4545    GROMACS               03.06.13 10:52   03.06.13 10:52   03.06.13 11:08   FINISHED          hydra     S UP                         
J1370249913635__6557    DALTON                03.06.13 10:58   03.06.13 10:58   03.06.13 10:58   FINISHED          hydra     S P                          
J1370249914196__7961    DALTON                03.06.13 10:58   03.06.13 10:58   03.06.13 10:58   FINISHED          hydra     S P   

 

or show detailed information about a given job:

$qcg-info J1370249571020__4545
...
Status: FINISHED

StatusDescription: 
SubmissionTime: Mon Jun 03 10:52:51 CEST 2013
FinishTime: Mon Jun 03 11:08:54 CEST 2013
...
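
When only the final state matters, the Status field can be picked out of the qcg-info output. The sketch assumes "Key: value" lines as shown above; job_status is a hypothetical name. The exact match on the key avoids picking up the StatusDescription line.

```shell
# Hypothetical helper (not a QCG command): read qcg-info output on stdin and
# print the value of the "Status" field.
job_status() {
    awk -F': *' '$1 == "Status" { print $2 }'
}

# Intended use on the UI machine:
#   qcg-info J1370249571020__4545 | job_status
```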
 
The outputs can be found in the directory where you submitted your job, e.g.:
 
$tail outputs/J1370250220729__5859.output
Dalton finished
@ Excitation energy :  0.13670205    au
@ Excitation energy :  0.14338753    au
@ Excitation energy :  0.15690400    au
@ Excitation energy :  0.17246312    au
@ Excitation energy :  0.18941452    au
@ Excitation energy :  0.19843150    au
 >>>> Total CPU  time used in DALTON:   4 hours  8 minutes 29 seconds
 >>>> Total wall time used in DALTON:   4 hours  9 minutes 40 seconds
Simulation converged. Sending signal to stop
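
The excitation energies above are printed in atomic units. As a rough sketch, assuming "au" here means hartree, they can be converted to electronvolts with the rounded factor 1 hartree = 27.211386 eV. The function name excitations_ev is our own.

```shell
# Hypothetical helper (not part of Dalton or QCG): print every
# "@ Excitation energy" line of a Dalton output in au and in eV.
# Reads the files given as arguments, or stdin if none are given.
# Assumes the energy is the fifth whitespace-separated field and is in hartree.
excitations_ev() {
    awk '/^@ Excitation energy/ {
             printf "%.4f au = %.4f eV\n", $5, $5 * 27.211386
         }' "$@"
}

# Intended use in the submission directory:
#   excitations_ev outputs/J1370250220729__5859.output
```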

 

Other tools

One can also submit simple batch jobs using a desktop application called QCG-Icon.