All example files are located in the folder /mnt/big/examples.
This example requests one task (one core, via `--ntasks-per-node=1`) on one node of the batch partition. The command it runs simply appends the hostname of the compute node to the file `hostname.txt` in the user's home directory. To run the script, copy it to your home directory and submit it from there. Opening `hostname.txt` reveals the node on which the Slurm manager allocated the job. If only `hostname` is executed (that is, `>>~/hostname.txt` is omitted), stdout is written to the `example1.output` file instead.

    #!/bin/bash
    #SBATCH --partition=batch
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --mem-per-cpu=4000
    #SBATCH --job-name="example1"
    #SBATCH --output=example1.output
    #SBATCH --mail-user=firstname.lastname@example.org
    #SBATCH --mail-type=ALL

    # Append the name of the compute node to hostname.txt in the home directory
    hostname >> ~/hostname.txt

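The copy-and-submit steps described above could look roughly like the sketch below. The file name `example1.sh` and its location under `/mnt/big/examples/example1/` are assumptions (only the top-level folder is given here), so adjust `src` to the real path. The guard lets the snippet run harmlessly on a machine without Slurm.

```shell
#!/bin/bash
# Assumed location of the example script; change it to the real path under /mnt/big/examples.
src="${EXAMPLE_SRC:-/mnt/big/examples/example1/example1.sh}"
submitted=no

if [ -f "$src" ] && command -v sbatch >/dev/null 2>&1; then
    cp "$src" ~/                          # copy the script to the home directory
    cd ~ && sbatch "$(basename "$src")"   # submit the job from there
    submitted=yes
    echo "When the job has finished, inspect the result with: cat ~/hostname.txt"
else
    echo "sbatch or $src is not available on this machine"
fi
```

Because the script appends with `>>`, each run adds one more line to `hostname.txt` rather than overwriting it.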
In this example, one node in the fast partition is allocated and the `sleep` command is executed. `sleep` pauses for 100 seconds, leaving the core idle, and the output of the `time` command is written to `example2.output`. Copy `example2.sh` to your home directory so that you can run it. While the job runs, you can monitor it in the Slurm job queue with the `squeue` command.

    #!/bin/bash
    #SBATCH --partition=fast
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --mem-per-cpu=4000
    #SBATCH --job-name="example2"
    #SBATCH --output=example2.output
    #SBATCH --mail-user=email@example.com
    #SBATCH --mail-type=ALL

    # Measure how long the 100-second sleep takes
    time sleep 100

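Monitoring the queue, as mentioned above, can be as simple as the following sketch. It restricts `squeue` to the current user's jobs with the `-u` option; the `status` variable is only an illustrative helper so that the snippet also runs on a machine where Slurm is not installed.

```shell
#!/bin/bash
# Show only the jobs that belong to the current user.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$(id -un)"
    status="queried"
else
    status="squeue not found"
    echo "$status: run this on the cluster login node"
fi
```

In the `ST` column of the listing, `PD` means the job is still pending and `R` means it is running.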
This example is the same as example2, but it runs a few more commands and prints some Slurm environment variables.

    #!/bin/bash
    #SBATCH --partition=fast
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --mem-per-cpu=4000
    #SBATCH --job-name="example3"
    #SBATCH --output=example3.output
    #SBATCH --mail-user=firstname.lastname@example.org
    #SBATCH --mail-type=ALL

    # Print some of the variables that Slurm sets for the job
    echo $SLURM_JOB_NAME
    echo $SLURM_JOB_PARTITION
    echo $SLURMD_NODENAME
    echo $SLURM_JOB_ID
    echo $SLURM_JOB_CPUS_PER_NODE

    echo "Job Start: $(date)"
    time sleep 50
    echo "Job End: $(date)"

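The variables printed above are only set inside a Slurm job, so running the script directly prints empty lines. A small helper such as the hypothetical `show_var` below (not part of the example files) labels each variable and makes the unset case visible. It is a sketch that relies on bash's indirect expansion.

```shell
#!/bin/bash
# Print NAME=value, or a placeholder when the variable is not set
# (for example when the script is run directly instead of through sbatch).
show_var() {
    local name="$1"
    echo "$name=${!name:-<not set>}"
}

for v in SLURM_JOB_NAME SLURM_JOB_PARTITION SLURMD_NODENAME SLURM_JOB_ID SLURM_JOB_CPUS_PER_NODE; do
    show_var "$v"
done
```

Outside a job, each line reads like `SLURM_JOB_NAME=<not set>`; inside a job, the placeholder is replaced by the value Slurm assigned.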
This example illustrates mafft, a multiple alignment program for amino acid or nucleotide sequences. Copy the script to your home directory along with the input data file `dataset_2.fa`, and adjust the input and output paths accordingly.

    #!/bin/bash
    # this is an example of Haris Zafiropoulos
    #SBATCH --partition=batch
    #SBATCH --nodes=1
    #SBATCH --mem=20000
    #SBATCH --job-name="align_batch_threadn"
    #SBATCH --output=align_batch.output
    #SBATCH --mail-user=
    #SBATCH --mail-type=ALL

    start=$SECONDS
    /usr/bin/mafft --thread 4 --globalpair --maxiterate 16 --reorder /mnt/big/examples/example4/dataset_2.fa > ~/student2_align.aln
    duration=$(( SECONDS - start ))
    echo "duration is : $duration "

The bash variable `$SECONDS` is used to measure the total execution time: the script records its value before the alignment and subtracts it afterwards. The resulting `$duration`, in seconds, is printed to the `align_batch.output` file. The program mafft is called with several options; it reads the `dataset_2.fa` file and writes its output to the file `student2_align.aln`. For details about the options `--globalpair`, `--maxiterate` and `--reorder`, please refer to the manual page of the command. A key option for minimizing the script's total execution time is `--thread`. The following table shows the effect of parallelization.
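One way to keep the mafft thread count in step with the cores Slurm actually allocates is sketched below. The `--cpus-per-task=4` directive is an addition that the original script does not contain; when it is given, Slurm sets `SLURM_CPUS_PER_TASK`, which the script then passes to `--thread`. Whether a job is confined to its allocated cores depends on how the cluster is configured, so treat this as an illustration rather than the site's recommended setup.

```shell
#!/bin/bash
#SBATCH --partition=batch
#SBATCH --nodes=1
# Assumption: request as many cores as mafft should use.
#SBATCH --cpus-per-task=4
#SBATCH --mem=20000

# Fall back to a single thread when the script runs outside Slurm.
threads="${SLURM_CPUS_PER_TASK:-1}"
input=/mnt/big/examples/example4/dataset_2.fa

start=$SECONDS
if [ -x /usr/bin/mafft ] && [ -r "$input" ]; then
    /usr/bin/mafft --thread "$threads" --globalpair --maxiterate 16 --reorder "$input" > ~/student2_align.aln
else
    echo "mafft or the input file was not found; skipping the alignment"
fi
duration=$(( SECONDS - start ))
echo "threads: $threads, duration: $duration s"
```

With this layout, changing the thread count means editing a single `#SBATCH` line instead of keeping two numbers in sync.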
| Scenario | Num of Threads | Total Time |
|---|---|---|
* According to the mafft manual, when `--thread` is set to -1, the number of cores on the node is detected automatically.
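Timings like those in the table could be collected with a loop along the following lines. The thread counts `1 2 4 -1` are illustrative values, not the scenarios that were actually measured, and `time_run` is a hypothetical helper built on bash's `SECONDS`. The alignment output is discarded because only the elapsed time matters here.

```shell
#!/bin/bash
# time_run: run a command, discard its output, and print the elapsed whole seconds.
time_run() {
    local start=$SECONDS
    "$@" >/dev/null 2>&1
    echo $(( SECONDS - start ))
}

input=/mnt/big/examples/example4/dataset_2.fa

echo "| Num of Threads | Total Time (s) |"
for t in 1 2 4 -1; do
    # Only run the alignment where mafft and the dataset are available.
    if [ -x /usr/bin/mafft ] && [ -r "$input" ]; then
        echo "| $t | $(time_run /usr/bin/mafft --thread "$t" --globalpair --maxiterate 16 --reorder "$input") |"
    fi
done
```

When this runs inside a Slurm job, the job should request at least as many cores as the largest thread count, otherwise the higher settings may show no speed-up.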