Job Files, Quotas and Working Directories in Fox
A job typically uses several types of files, including:
- the job script itself
- the Slurm output file (default: slurm-<job-ID>.out)
- input files
- temporary files
- output files
There are multiple choices for where to keep files.
Name | Path | Size | Description |
---|---|---|---|
Project area | /fp/projects01/<project-name> | quota per project | main project area, for permanent files |
Project work area | /cluster/work/projects/<project-name> | no quota | for temporary project files on Fox |
Job scratch area ($SCRATCH) | /localscratch/<job-ID> | 3.5 TiB per node | a fast local disk on the node where the job runs |
Each location has its advantages and disadvantages, depending on usage. The parallel file system (project area and project work area) is by nature slow for random read and write operations and for metadata operations (handling of large numbers of files). The local file system ($SCRATCH) is far better suited for this. In addition, the parallel file system needs to serve all users, so placing a very high metadata load on it makes the file system slow for everyone. On the other hand, the local file system is local to each compute node and cannot easily be shared between nodes (but see below).
Checking Quotas
Project quotas start at 1 TiB, while storage in $HOME is capped at 50 GiB.
To check your quota usage, you can run df -h . (the trailing dot means the current directory) in your home and project directories.
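For example (ec<N> is a placeholder for your own project):

```bash
# Show size, usage and free space of the file system backing the current directory:
cd /fp/projects01/ec<N>
df -h .

# The same check for your home directory:
df -h ~
```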
Recommendations
We recommend keeping the job script itself and the Slurm output file (slurm-<job-ID>.out) in the project area. The default location for the Slurm output file is the directory where you run sbatch. You can also keep both of these files in your home directory, but be aware that the disk quota for home directories is quite small.
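For example, you can simply submit from a directory in the project area, or point the Slurm output file there explicitly with the standard --output option (the directory name here is illustrative):

```bash
# Submit from the project area so the Slurm output file is written there:
cd /fp/projects01/ec<N>/myjob
sbatch job.sh

# Or direct the Slurm output file there explicitly; %j is replaced by the job ID:
sbatch --output=/fp/projects01/ec<N>/myjob/slurm-%j.out job.sh
```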
Input files
Where to keep input files depends on how they are used.
If an input file is read sequentially (i.e., from start to end), it is best to keep it in the project area.
If there is a lot of random reading of an input file, it is best to let the job script copy the file to $SCRATCH, as sketched below.
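A minimal sketch of this pattern in a job script (the input file name, program name and project path are illustrative):

```bash
# Copy the randomly-accessed input file to the fast local scratch disk:
cp /fp/projects01/ec<N>/data/input.dat ${SCRATCH}/

# Run the program against the local copy instead of the copy on the parallel file system:
MyProgram ${SCRATCH}/input.dat
```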
Temporary files
By temporary files we mean files created by the job that are not needed after the job has finished.
Temporary files should normally be created in $SCRATCH, since this is the fastest disk. This is especially important if there is a lot of random reading and/or writing of the files.
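As a sketch, one way to steer temporary files to the local disk in the job script (MyProgram is illustrative; many programs honour the TMPDIR environment variable, others have their own option for choosing a temporary directory):

```bash
# Point programs that honour TMPDIR at the job's private scratch area:
export TMPDIR=${SCRATCH}

# Run the computation; its temporary files now land on the fast local disk:
MyProgram
```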
If other users need access to files while a job runs, or if you would like to keep the files after the job has finished, you should create files in the project work area. Files here can be made available to users in the same project.
Note! Files in the project work area are deleted after 30 days.
Output files
By output files we mean files created by the job that are needed after the job has finished.
As with input files, if an output file is written sequentially (i.e., from start to end), it is best to create it in the project area.
If there is a lot of random writing (or reading) of an output file, it is best to create it in $SCRATCH, and let the job script copy the file to the project area when the job finishes. This can be done with the savefile command (see below).
Files in $SCRATCH
The $SCRATCH area (/localscratch/<job-ID>) for each job is created automatically when the job starts, and deleted afterwards. It is located on solid state storage (NVMe using PCIe) on the compute nodes. Such storage is orders of magnitude faster than ordinary disk storage for random access operations. For streaming operations, like reading or writing large sequential amounts of data, the parallel file system is comparable (even tape drives are comparable for sequential access).
A potential limitation of the scratch area is its limited size. As solid state storage has a higher cost than spinning disks, the scratch area is limited to 3.5 TiB on the batch compute nodes and 7 TiB on the interactive nodes. This space is shared between all jobs running on the node.
Files placed in $SCRATCH are automatically deleted after the job finishes.
Output files
Output files can also be placed in $SCRATCH for increased speed (see above). To ensure that they are saved when the job finishes, you can use the command savefile filename in the job script, where filename is the name of the file, relative to the $SCRATCH area. The command should be placed before the main computational commands in the script, e.g.:
savefile MyOutputFile
MyProgram > MyOutputFile
This ensures that the file /localscratch/<job-ID>/MyOutputFile is copied back to the submit directory (the directory you were in when you ran the sbatch command). The file will be copied back even if the job crashes (however, if the compute node itself crashes, the file will not be copied back).
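Put together, a minimal single-node job script using savefile might look like this (account, resources and program name are illustrative; the cd is there in case the job does not already start with $SCRATCH as its working directory):

```bash
#!/bin/bash
#SBATCH --account=ec<N>
#SBATCH --mem-per-cpu=1G
#SBATCH --time=01:00:00

# Register the output file for copying back to the submit directory when the job ends:
savefile MyOutputFile

# Write the output file in the fast local scratch area:
cd ${SCRATCH}
MyProgram > MyOutputFile
```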
If you want more flexibility, it is possible to register a command to be run to copy the file where you want it, by using cleanup <commandline> instead of the savefile command. It should also be placed before the main computational commands, e.g.:
cleanup cp MyOutputFile /fp/projects01/ec<N>/mydir
MyProgram > MyOutputFile
Both commands should be used in the job script before starting the main computation. Also, if the command line contains any special characters like *, these should be quoted.
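For instance, to copy every file matching a pattern back to the project area, the wildcard has to be quoted so it is expanded when the registered command eventually runs, not when it is registered (the exact quoting requirements of cleanup are an assumption here, and the target directory is illustrative):

```bash
# Quote the "*" so it survives until the cleanup command is actually run:
cleanup cp '*.out' /fp/projects01/ec<N>/results/
MyProgram
```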
Jobs using more than one node
As the $SCRATCH area is local to each node, files cannot be shared between nodes using $SCRATCH. A job running on several nodes will get one $SCRATCH area on each node.
Slurm provides utilities for distributing files to the local scratch areas on several nodes and for gathering files back again. Here is an example to illustrate how this might look:
#!/bin/bash
#SBATCH --account=ec11
#SBATCH --ntasks-per-node=2
#SBATCH --nodes=2
#SBATCH --mem-per-cpu=500M
#SBATCH --time=00:02:00
## Print the hostnames where each task is running:
srun hostname
## This copies "hello.c" from your submit dir to $SCRATCH on each node:
srun --ntasks-per-node=1 --ntasks=$SLURM_NNODES cp hello.c ${SCRATCH}/hello.c
## Simulate output files created on the $SCRATCH areas on each node
## by copying $SCRATCH/hello.c to $SCRATCH/bye.c once on each node:
srun --ntasks-per-node=1 --ntasks=$SLURM_NNODES cp ${SCRATCH}/hello.c ${SCRATCH}/bye.c
## This copies the "bye.c" files back to the submit dir:
sgather ${SCRATCH}/bye.c bye.c
Slurm's sgather will append $HOSTNAME to each of the gathered files to avoid overwriting anything. Note that you have to set up ssh keys with an empty passphrase on Fox for sgather to work, because under the hood it uses scp to transfer the files.
(There is a Slurm command sbcast that can be used instead of srun --ntasks-per-node=1 --ntasks=$SLURM_NNODES cp to copy files to $SCRATCH on each node, but it is much slower, and not suited to large files.)
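For reference, the sbcast variant of the copy step in the script above would look like this (with the speed caveat just mentioned):

```bash
## Broadcast "hello.c" from the submit directory to $SCRATCH on every node in the job:
sbcast hello.c ${SCRATCH}/hello.c
```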
Files in project work directory
Each project has a project work directory, where one can store files temporarily. The directory is /cluster/work/projects/ec<N>. The area is open to all users in the project, so it is possible to share files within the project here.
It is recommended to create a subdirectory with your username in the work area and keep your files there. That reduces the likelihood of file name conflicts. It is also a good idea to set the permissions of this subdirectory to something restrictive unless you want to share your files with the rest of the project.
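For example (ec<N> is a placeholder for your project; mode 700 keeps the directory private to you):

```bash
# Create a personal subdirectory in the project work area and restrict access to it:
mkdir -p /cluster/work/projects/ec<N>/$USER
chmod 700 /cluster/work/projects/ec<N>/$USER
```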
Old files are automatically deleted in the work area, so this is not a place for permanent storage. Use the ordinary project area (/fp/projects01/ec<N>) for that.
CC Attribution: This page is maintained by the University of Oslo IT FFU-BT group. It has either been modified from, or is a derivative of, "Job work directory" by NRIS under CC-BY-4.0. Changes: Major rewording and additions to all sections.