GPU Instances

Red Cloud supports GPU computing with Nvidia Tesla T4 and Nvidia Tesla V100 GPUs. To use a GPU, launch an instance with one of the following two flavors (instance types):

Flavor       CPUs   RAM     GPUs
c8.t1.m90    8      90 GB   1 Nvidia Tesla T4
c14.g1.m60   14     60 GB   1 Nvidia Tesla V100
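
For example, a minimal sketch of launching a GPU instance with the OpenStack CLI, assuming your Red Cloud credentials are loaded; the image, network, key pair, and instance names below are illustrative:

# Launch a T4 GPU instance using the c8.t1.m90 flavor (names other than the flavor are illustrative)
openstack server create \
    --flavor c8.t1.m90 \
    --image rocky-9.4-cuda-12.6 \
    --network my-network \
    --key-name my-key \
    my-gpu-instance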

If you are new to Red Cloud, please review this documentation before launching an instance.

Availability

Red Cloud resources (CPU cores, RAM, GPUs) are not oversubscribed. When you create a GPU instance, you reserve the physical hardware for the life of the instance (and your subscription is charged accordingly) until the instance is deleted or shelved, which frees the resources.
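
For example, a minimal sketch of freeing and later reclaiming the hardware with the OpenStack CLI (the instance name is illustrative):

# Shelve the instance to release its reserved CPU/RAM/GPU resources
openstack server shelve my-gpu-instance

# Unshelve it later to reserve hardware and resume work
openstack server unshelve my-gpu-instance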

Launching a GPU Instance (with a Prebuilt Image)

When launching a GPU instance, you can use a base Linux or Windows image and install a suitable NVIDIA driver along with your own GPU software or libraries. To speed up time to science, CAC also provides a Linux GPU image with GPU software preinstalled. We highly recommend following the Linux or Windows instructions to get started.

GPU Image: rocky-9.4-cuda-12.6

This image includes the following software:

  1. CUDA 12.6 (/usr/local/cuda-12.6/)
  2. Python 3.9.18 (/usr/bin/python or /usr/bin/python3)

Note: this Python does not come with pip; you will have to run the command python -m ensurepip once before you can run pip with python -m pip ....
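
For example, a minimal sketch of bootstrapping pip for your user and then installing a package (numpy is only an illustrative package):

# Bootstrap pip for your user account
python3 -m ensurepip --user
# Install packages through the pip module
python3 -m pip install --user numpy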

Installed Software: CUDA

  • Check the CUDA version currently in use with nvcc --version (use the full path if nvcc is not on your PATH)
    [rocky@gpu-instance ~]$ /usr/local/cuda-12.6/bin/nvcc --version
        nvcc: NVIDIA (R) Cuda compiler driver
        Copyright (c) 2005-2024 NVIDIA Corporation
        Built on Thu_Sep_12_02:18:05_PDT_2024
        Cuda compilation tools, release 12.6, V12.6.77
        Build cuda_12.6.r12.6/compiler.34841621_0

Installed Software: NVIDIA Driver

  • Check the installed driver version with dkms status
  • Check that the GPU devices are detected by the NVIDIA driver: nvidia-smi

    [rocky@gpu-instance ~]$ nvidia-smi
        +-----------------------------------------------------------------------------------------+
        | NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
        |-----------------------------------------+------------------------+----------------------+
        | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
        |                                         |                        |               MIG M. |
        |=========================================+========================+======================|
        |   0  Tesla T4                       Off |   00000000:06:00.0 Off |                    0 |
        | N/A   28C    P8              9W /   70W |       1MiB /  15360MiB |      0%      Default |
        |                                         |                        |                  N/A |
        +-----------------------------------------+------------------------+----------------------+

        +-----------------------------------------------------------------------------------------+
        | Processes:                                                                              |
        |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
        |        ID   ID                                                               Usage      |
        |=========================================================================================|
        |  No running processes found                                                             |
        +-----------------------------------------------------------------------------------------+

Note: the CUDA version displayed in the top right of this output is not necessarily the CUDA version currently in use; it is the highest CUDA version supported by the installed driver.

Using GPUs on Any Instance

To use GPUs on your Windows or Linux instance, you need to resize it to a flavor with a GPU and install the NVIDIA drivers. Note that resizing restarts an instance, so be sure to save your work first.
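
A rough sketch of resizing with the OpenStack CLI follows (the instance name is illustrative; depending on your client version, the confirmation step may instead be openstack server resize --confirm):

# Resize the instance to a GPU flavor
openstack server resize --flavor c14.g1.m60 my-instance
# Once the instance is back up and verified, confirm the resize
openstack server resize confirm my-instance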

Installing NVIDIA Drivers

On Ubuntu, to install NVIDIA Drivers with CUDA 12.2:

sudo apt install libnvidia-common-535 libnvidia-gl-535 nvidia-driver-535
sudo reboot now

Run nvidia-smi to check that the NVIDIA driver is installed correctly and detects the right GPU. You should see a table like the one above.

Best Practices

Python Virtual Environment

Python provides a built-in way to create virtual environments, which are useful for managing your Python packages across different projects. The instructions to set up virtual environments are located here.
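
As a minimal sketch, creating and using a virtual environment might look like this (the environment path and package are illustrative):

# Create a virtual environment in your home directory
python3 -m venv ~/venvs/my-project
# Activate it for the current shell session
source ~/venvs/my-project/bin/activate
# Install packages inside the environment
python -m pip install numpy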

Miniforge

While Python virtual environments are great for managing Python packages, you might find yourself needing packages that are not available from PyPI. For such system packages, we recommend using Miniforge. Miniforge is similar to Miniconda, except that Miniforge installs packages only from conda-forge, whereas Miniconda installs packages from the default Anaconda channel.

Both the Linux and Windows installation methods below are derived from the Miniforge GitHub repository.

Install on Windows Instance

Download and execute the Windows installer.

Install on Linux Instance

Run the following:

wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
./miniforge3/bin/conda init

Exit and re-enter your shell, and you should now have access to the conda and mamba commands.
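
For example, a minimal sketch of creating and activating an environment (the environment name and packages are illustrative):

# Create an environment with Python and packages from conda-forge
mamba create -n gpu-env python=3.11 numpy
# Activate it
conda activate gpu-env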