HPC Vega uses and integrates storage technologies with different characteristics, offering a variety of file systems. These provide SLING and EuroHPC users with storage capacity and predefined directories.
| Name | Storage type | File system |
|------|--------------|-------------|
| exa5 | High Performance Storage | Lustre |
| ceph | Large Capacity Storage | Ceph |
High Performance Storage - Lustre
HPC Vega uses Lustre as its parallel distributed file system. Lustre is designed for efficient parallel I/O on large files. We therefore recommend the following to users:
- Avoid saving a large number of files in a single directory; split them across multiple directories instead.
- Avoid accessing a large number of small files on Lustre.
- Make sure that the stripe count for small files is 1.
- Avoid using standard Linux commands such as find; use the Lustre-aware lfs find instead. Below are examples of use.
Search the Directory Tree
The lfs find command searches the directory tree rooted at the given directory or filename for files that match the specified parameters. To review a list of all the options you may use with lfs find, execute lfs find --help or consult the man page. Note that it is usually more efficient to use lfs find rather than plain find when searching for files on Lustre.

Some of the most commonly used lfs find options are:
| Option | Description |
|--------|-------------|
| -atime | File was last accessed N*24 hours ago. (There is no guarantee that atime is kept coherent across the cluster.) |
| -mtime | File data was last modified N*24 hours ago. |
| -ctime | File status was last changed N*24 hours ago. |
| -maxdepth | Limits find to descend at most N levels of the directory tree. |
| -print / -print0 | Prints the full filename, followed by a newline or a NULL character respectively. |
| -size | File has a size in bytes, or kilo-, Mega-, Giga-, Tera-, Peta-, or Exabytes if a suffix is given. |
| -type | File has the given type (block, character, directory, pipe, file, symlink, socket, or Door [Solaris]). |
| -gid | File has a specific group ID. |
| -group | File belongs to a specific group (numeric group ID allowed). |
| -uid | File has a specific numeric user ID. |
| -user | File is owned by a specific user (numeric user ID allowed). |
For example, to identify regular files more than 20 days old:

$ lfs find /exa5/scratch/user/ -mtime +20 -type f -print
For more options, please review the man page for the lfs command.
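To follow the stripe-count recommendation above, stripe settings can be inspected and set with `lfs getstripe` and `lfs setstripe`. A short sketch (the paths are illustrative; these commands only work on a Lustre client):

```shell
# Show the stripe count of an existing file
lfs getstripe -c /exa5/scratch/user/$USER/myfile.dat

# Set a default stripe count of 1 on a directory, so that new small
# files created inside it are each stored on a single OST
lfs setstripe -c 1 /exa5/scratch/user/$USER/small_files
```

Setting the default on a directory is usually more convenient than striping individual files, since every file created there inherits it.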
Check Disk Space Usage
The lfs df command displays file system disk space usage. Additional parameters can be specified to display the inode usage of each MDT/OST, or a subset of OSTs. The usage of the lfs df command is:

lfs df [-i] [-h] [path]

-i: Lists inode usage per OST and MDT.
-h: Prints output in human-readable format, using base-2 suffixes for Mega-, Giga-, Tera-, Peta-, or Exabytes.

By default, the usage of all mounted Lustre file systems is displayed. Otherwise, if a path is specified, only the usage of that file system is displayed.
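For example (the mount path is illustrative; `lfs` is only available on Lustre clients):

```shell
# Human-readable disk usage of the Lustre file system mounted at /exa5
lfs df -h /exa5

# Inode usage per MDT and OST
lfs df -i /exa5
```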
Instead of removing files with standard commands such as rm, use munlink, a Lustre-specific command that simply deletes a file without overloading the metadata server. Below is an example of our recommended approach, which consists of two steps.

The first step deletes all the files and soft links within a directory (and its subdirectories) with the use of munlink:
find -P ./mydir \( -type f -o -type l \) -print0 | xargs -0 munlink
Here is an overview of each part of that command:

- find is a command that searches the indicated directory (and its subdirectories). The syntax defines a search for files and soft links.
- -P restricts the search to the indicated directory tree and forces NO dereferencing of symbolic links. This guarantees that find will not look for files within the targets of links.
- ./mydir is the directory from which the search starts.
- \( -type f -o -type l \) indicates that find should match anything that is a regular file (-type f) or (-o) a soft link (-type l, with a lowercase letter l). The escaped parentheses group the two tests so that -print0 applies to both.
- -print0 sets the format of find's output. This format copes with strange file names and ensures they are readable by the following command (xargs), which is connected with the pipe.
- The pipe (|) connects two commands, so that the output of the preceding command (find) serves as input to the following command (xargs in this case).
- xargs -0 converts the received list of files into arguments for whatever command is specified at the end (in this case: munlink). The -0 flag matches the output format of find: if you use -print0 in the find command, you must use -0 in the xargs command.
- munlink deletes each file and soft link in the list, which in this case is the one received from xargs, without overloading the metadata server.
The second step is to remove the empty directories and subdirectories in the tree. Once all of the files and soft links have been deleted, you can remove the empty directories with a similar command:
find -P ./mydir -type d -empty -delete
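The grouping of the two -type tests matters: in find, -o binds more loosely than the implicit -a, so placing -print0 before the tests would print every name rather than only files and links. The two steps can be tried on any system by substituting plain `rm` for the Lustre-specific `munlink` (which exists only on Lustre clients); `mydir` and its contents below are a hypothetical layout built just for the demonstration:

```shell
#!/bin/sh
# Build a small example tree (hypothetical layout)
mkdir -p mydir/sub
touch mydir/a.txt mydir/sub/b.txt
ln -s a.txt mydir/link_to_a

# Step 1: delete regular files and symlinks.
# Note the \( ... \) grouping, so -print0 applies to both -type tests.
# On Lustre, replace "rm --" with "munlink".
find -P ./mydir \( -type f -o -type l \) -print0 | xargs -0 rm --

# Step 2: remove the now-empty directories.
# -delete implies depth-first traversal, so mydir itself is removed last.
find -P ./mydir -type d -empty -delete
```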
The following file systems are available for users:

| Name | Path | Description |
|------|------|-------------|
| tmp | /tmp | Local temporary directory on compute nodes for all users |
| home | /ceph/hpc/home/$USER | Home directory for users on a shared file system mounted on compute and login nodes |
| scratch | /exa5/scratch/user/$USER | Scratch directory on the High Performance Storage (Lustre), mounted on compute and login nodes on request |
| scratch | /ceph/hpc/scratch/user/$USER | Scratch directory on the Large Capacity Storage (Ceph), mounted on compute and login nodes on request |
| Local scratch | /scratch/slurm/jobid | Temporary node-local directory for jobs (created automatically by the Slurm prolog script and deleted after the job is finished) |
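A minimal batch-script sketch using the node-local scratch directory; it assumes the job can reach its directory via $SLURM_JOB_ID, and the result file name and destination path are hypothetical:

```shell
#!/bin/bash
#SBATCH --job-name=local-scratch-demo

# The Slurm prolog creates this node-local directory for the job
cd /scratch/slurm/$SLURM_JOB_ID

# ... run the job here, writing temporary files locally (e.g. results.dat) ...

# Copy anything worth keeping before the job ends;
# the directory is deleted automatically afterwards
cp results.dat /ceph/hpc/home/$USER/
```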
The home directory is the user's home directory with a default quota of 100 GB. All home directories are on a shared file system and provide internal snapshots (backups), which are stored for 30 days.
Two scratch directories are available:

- on the High Performance Storage (Lustre): /exa5/scratch/user/
- on the Large Capacity Storage (Ceph): /ceph/hpc/scratch/user/

These directories are created upon user request and are recommended for storing large amounts of non-persistent data. Files in these file systems are removed after one month, so please copy any data you want to keep to your home directory. Quotas are set on the scratch directories; detailed quota information is given below.
Users can check their disk space quota using the following commands:
For the usage on Lustre file system:
lfs quota -h -u $USER /exa5/scratch/user/$USER/
For the usage on Ceph file system:
getfattr -n ceph.quota.max_bytes /ceph/hpc/home/$USER
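The Ceph attribute is reported in bytes. A quick conversion to GiB (the value below is a hypothetical example of what getfattr might print for a 100 GiB quota):

```shell
# getfattr output looks like: ceph.quota.max_bytes="107374182400"
bytes=107374182400
echo "$((bytes / 1024 / 1024 / 1024)) GiB"
# prints "100 GiB"
```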
If you need permanent storage of large amounts of data, please contact the support team: email@example.com
| Directory | Default quota | Description |
|-----------|---------------|-------------|
| home | 100 GB | Size of home directory for each user |
| scratch | 20 GB | Size of scratch directory for each user |
For additional space, please contact the support team: firstname.lastname@example.org