Department of Computing GPU Cluster Guide

Getting Started

Please note: this service is intended primarily for supporting coursework and individual projects for taught programmes in the Department of Computing. Researchers and members of other departments may want to consult the Research Computing Services (RCS) for college-provided compute resources.

Update 27/9/2024
The Ubuntu 24.04 upgrades are complete. Please use a lab PC to create new Python virtual environments that are compatible with Ubuntu 24.04 (links are in the following steps).

What is Slurm and the GPU Cluster?

Slurm is an open-source job scheduling system for Linux that manages compute resources. Here, it manages the department's GPU resources.

Using Slurm commands such as 'sbatch' and 'salloc', you run your scripts on our pool of NVIDIA GPU Linux servers. Typical workloads are CUDA-based parallel computing jobs such as deep learning, machine learning and large language models (LLMs), using frameworks such as PyTorch, TensorFlow and JAX, among others.
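As a quick illustration of the two commands named above, the sketch below checks whether 'sbatch' and 'salloc' are on your PATH, which is a simple way to tell whether you are logged in to a machine with the Slurm client tools. The function name `slurm_cmd_status` is made up for this example, and the usage lines in the comments show common Slurm invocations rather than settings confirmed for this cluster.

```shell
# Report whether the Slurm client commands are available on this machine.
# slurm_cmd_status is a hypothetical helper name used only in this sketch.
slurm_cmd_status() {
    for cmd in sbatch salloc; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: available"
        else
            echo "$cmd: not found"
        fi
    done
}

slurm_cmd_status

# Typical usage once the commands are available (not run here):
#   sbatch my_job.sh       # queue a batch script; my_job.sh is a placeholder
#   salloc --gres=gpu:1    # request an interactive allocation with one GPU
```

If both commands are reported as not found, you are most likely not on the submission host yet.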

Read this guide to learn how to:

connect to the submission host server and submit a test script
start an interactive job (exclusive, direct access to a GPU for a limited time)
compose a shell script that uses shared storage, a Python environment, CUDA and your Python scripts
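To give a feel for the last item, here is a minimal sketch that writes a batch script to a file and submits it only if 'sbatch' is present. The `#SBATCH` options shown (`--gres=gpu:1` and `--output`) are standard Slurm options, but the file name `example_job.sh`, the virtual environment path and `train.py` are hypothetical placeholders, not values taken from this guide. Check the rest of the guide for the settings that apply to this cluster.

```shell
# Write a minimal Slurm batch script. All paths and names inside it are
# hypothetical placeholders; replace them with your own.
cat > example_job.sh <<'EOF'
#!/bin/bash
#SBATCH --gres=gpu:1            # request one GPU
#SBATCH --output=slurm-%j.out   # %j expands to the Slurm job ID

# Activate a Python virtual environment (hypothetical path).
source "$HOME/venvs/myenv/bin/activate"

# Show the allocated GPU, then run your own script (hypothetical name).
nvidia-smi
python3 train.py
EOF

# Submit the script, but only where the Slurm client tools are installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch example_job.sh
else
    echo "sbatch not found: run this on the cluster's submission host"
fi
```

Keeping the resource requests in `#SBATCH` comment lines at the top of the script means the same file documents what the job needs and how it runs.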

Before you start

Some familiarity with Department of Computing systems is desirable before using the GPU cluster:

logging in to DoC Lab PCs, especially NVIDIA GPU-equipped PCs (DoC Lab PCs)
remotely connecting to lab PCs and DoC Shell servers from a Linux/Mac/Windows terminal (Shell server guide)
composing bash scripts (examples are provided below; explaining shell scripting is beyond the scope of this guide)
Python environments (Python environments guide)
the Linux command line interface (Terminal, CLI)

Tip: test your Python scripts on your own device or on a DoC Lab PC with a GPU before using the GPU cluster. Testing first helps you catch errors in your scripts before you submit them with sbatch.
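One way to follow this tip is to confirm that the machine you are testing on actually exposes an NVIDIA GPU before you run your script. The sketch below is an assumption about how you might do that, not a step prescribed by this guide: `check_gpu` is a made-up helper name, and it relies on `nvidia-smi -L`, which lists the GPUs the driver can see.

```shell
# Check whether an NVIDIA GPU is visible on the current machine.
# check_gpu is a hypothetical helper name used only in this sketch.
check_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        nvidia-smi -L    # print one line per visible GPU
        return 0
    fi
    echo "no NVIDIA GPU visible on this machine"
    return 1
}

if check_gpu; then
    echo "GPU found: test your Python script here before using sbatch"
else
    echo "try a GPU-equipped lab PC instead"
fi
```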

Follow Nuri's guide for an introduction to using Linux in the Department of Computing
