
# 14. Using HPCs

High Performance Computing (HPC) facilities enable you to perform resource-intensive processing and calculations that would be impossible on your local computing resources (laptops, desktops, etc.). If you are analysing large datasets or performing tasks that require long periods of intensive calculation, an HPC is likely to be useful to you.

HPCs are generally a shared resource that allows you to pick how much compute resource you want to use (e.g. CPU cores, RAM, GPUs) and for how long. Your job then enters a queue to wait for access before your program is run. As a result, it is worth making sure your program works at smaller scales before you waste time and compute resources making it work at a larger scale!

With few exceptions, you will be interacting with the HPC exclusively via the command line. A tool known as ssh (Secure Shell) enables you to log in to a remote computer from your command line - you can think of this as a CLI version of a Remote Desktop. By default, you will be ssh-ing into a ‘Login Node’, which is a shared resource designed for low-intensity tasks like writing code or moving files about. The goal is then to set your code up such that you can submit a program to be run on the HPC proper. This is typically done via a ‘workload manager’ such as SLURM: you write a shell script of the commands you want to run, headed by some instructions telling the manager how much resource you want and for how long.
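As a rough sketch, logging in looks something like `ssh username@hpc.example.ac.uk` (the hostname here is a placeholder - your institution will provide the real one), and a minimal SLURM submission script might look like the following. The job name, resource requests, module, and script names are all illustrative assumptions rather than values that will work on any particular cluster:

```bash
#!/bin/bash
# Minimal SLURM job script sketch - all names and values below are
# placeholders to adjust for your own cluster and workload.
#SBATCH --job-name=my_analysis   # name shown in the queue
#SBATCH --time=01:00:00          # wall-clock limit (HH:MM:SS)
#SBATCH --ntasks=1               # one task (i.e. one copy of the program)
#SBATCH --cpus-per-task=4        # CPU cores for that task
#SBATCH --mem=8G                 # total memory for the job

# Many clusters provide software via environment modules; the module
# name here is an assumption - check what is available with `module avail`.
module load python

python my_script.py
```

You would then submit this with `sbatch job.sh` and monitor it with `squeue` - both standard SLURM commands - though any partition or account you need to specify will vary by institution.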

If you are using large datasets, it is worth considering how you are going to move files to and from the HPC as well. Typically, you will have some sort of home directory with a limited amount of space (on the order of a few tens of gigabytes) that will be backed up regularly. It is advised that what goes here is largely code rather than large datasets, and it can be kept consistent with your local code using GitHub etc. You will then also generally have access to a much larger amount of ‘scratch space’. This will not be backed up and may even be cleaned regularly, so nothing should live here that isn’t backed up elsewhere or that can’t be reproduced from code elsewhere. For instance, I might have a large amount of reference data in this space (velocity fields, BedMachine data, etc.) - but I have an associated shell script I use to download the data via wget/curl, so that if it is ever deleted I can reproduce the directory without any hassle!
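Such a re-download script might look something like the sketch below - the URLs, filenames, and scratch path are placeholder assumptions, not real datasets:

```bash
#!/bin/bash
# Sketch of a script to rebuild a scratch-space data directory.
# All URLs and paths are placeholders - substitute your own.
set -euo pipefail

DATA_DIR="/scratch/${USER}/reference_data"   # hypothetical scratch location
mkdir -p "${DATA_DIR}"

# -c resumes partial downloads if the script is interrupted
wget -c -P "${DATA_DIR}" "https://example.org/data/velocity_fields.nc"
wget -c -P "${DATA_DIR}" "https://example.org/data/bedmachine_subset.nc"
```

For moving your own files on and off the HPC, `scp` and `rsync` are the usual tools - e.g. `rsync -avz results/ username@hpc.example.ac.uk:/scratch/username/results/`, where the hostname and paths are again placeholders.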

Different universities will have slightly different approaches to getting set up. This guide is currently aimed at users in Durham, but for my own sanity I will keep notes about other institutions that I have used / am using on separate pages.

## Section Contents