3. Troubleshooting

3.1. CUDA Errors

The most common error encountered when running BRAILS is a CUDA runtime error. If you see an error message similar to the one below when BRAILS attempts to run a deep learning model, PyTorch has detected an incompatibility between the PyTorch and CUDA versions installed on your system.

    RuntimeError: CUDA error: no kernel image is available for execution on the device
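To see which versions are involved before reinstalling anything, a short check along the following lines may help. This is only a sketch: the helper name describe_torch_cuda is made up for illustration and is not part of BRAILS. It reports the CUDA version the installed PyTorch build was compiled against and, when a GPU is visible, the GPU architecture next to the architectures the build contains kernels for.

```python
def describe_torch_cuda():
    """Summarise the local PyTorch/CUDA setup (hypothetical helper, not a BRAILS API).

    Returns None when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return None

    info = {
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build was compiled with; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        major, minor = torch.cuda.get_device_capability(0)
        info["device_name"] = torch.cuda.get_device_name(0)
        info["device_arch"] = f"sm_{major}{minor}"
        # Architectures for which this build ships compiled kernels.
        info["compiled_archs"] = torch.cuda.get_arch_list()
    return info


if __name__ == "__main__":
    summary = describe_torch_cuda()
    if summary is None:
        print("PyTorch is not installed in this environment.")
    else:
        for key, value in summary.items():
            print(f"{key}: {value}")
```

If the reported device architecture does not appear among the compiled architectures, the installed build most likely does not match the GPU, and a reinstall is a reasonable next step.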

The installation instructions install the latest PyTorch version together with its default CUDA version. If your computer is equipped with a graphics card (GPU) that does not support this default CUDA version, you will need to reinstall PyTorch by following the steps below.

  1. Uninstall PyTorch and Torchvision:

    pip uninstall torch torchvision
    
  2. Identify the highest CUDA version compatible with your GPU driver using:

    nvidia-smi
    

    For example, in the sample output shown below, the GPU driver installed on this system supports CUDA versions up to 11.6 (highlighted in the red box).

    Figure: A sample output obtained after running the nvidia-smi command

  3. Determine the PyTorch installation command compatible with your CUDA version on the PyTorch website by selecting your operating system (OS) and the CUDA version determined in Step 2. Make sure to set the Package type to pip and the Language to Python.

    The following example shows the command for installing PyTorch on a Linux system supporting CUDA versions up to 11.6.

    Figure: Determining the pip install command for a Linux system supporting CUDA versions up to 11.6 using the PyTorch website

  4. Run the pip install command determined in Step 3. torchaudio is not required for running BRAILS, so it can be removed from this command.
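Step 2 can also be scripted rather than read off the screen. The sketch below assumes that the nvidia-smi header contains a field of the form "CUDA Version: X.Y", as recent driver releases print. The function names are hypothetical and not part of BRAILS.

```python
import re
import shutil
import subprocess


def parse_cuda_version(smi_output):
    """Extract (major, minor) from a 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_cuda_version():
    """Run nvidia-smi and return the highest CUDA version the driver reports."""
    if shutil.which("nvidia-smi") is None:
        return None  # nvidia-smi is not on PATH
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    version = driver_cuda_version()
    if version is None:
        print("Could not determine the CUDA version (is nvidia-smi installed?).")
    else:
        print(f"Driver supports CUDA versions up to {version[0]}.{version[1]}")
```

The version found this way is the one to select on the PyTorch website in Step 3. If the script cannot find the field, fall back to reading the nvidia-smi output manually.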

3.2. API Key Errors

The trained models and accompanying datasets must be downloaded from the internet the first time they are called. Images are also downloaded while BRAILS is running. Therefore, please make sure you are connected to the internet.
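A quick connectivity check before a long run can save time. The following is a minimal sketch using only the Python standard library. The function name can_reach is hypothetical, and the URL to test is left to the caller, because the hosts BRAILS downloads from are not listed here.

```python
import urllib.error
import urllib.request


def can_reach(url, timeout=5.0):
    """Return True if an HTTP(S) request to `url` gets any response from a server."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so the network itself is up.
        return True
    except (OSError, ValueError):
        # Covers DNS failures, refused connections, timeouts and malformed URLs.
        return False
```

For example, can_reach("https://pytorch.org") should return True on a machine with working internet access. The URL is arbitrary and can be replaced with any site you expect to be reachable.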