🚧 [Environment Setup Problem] Installing TorchSparse with GPU Support Was Exhausting — Sharing My Full Debug Journey and Asking for Help 🙏 #11

@hippoley

Description
Hi everyone,

I recently tried to set up torchsparse following the repo's instructions, and I want to share my experience — in case it helps others — and also to ask the community if anyone has a better way to handle this.

🧠 The Goal

I wanted to build and use torchsparse with GPU acceleration, because the project heavily relies on point cloud processing and performance is critical. However, I faced a long and frustrating journey of dependency hell, compilation errors, and CUDA version mismatches.


1️⃣ Initial Setup (That Failed)

My initial environment looked like this:

- OS: Ubuntu 22.04
- Python: 3.11
- PyTorch: 2.4.0
- CUDA: 12.1 (System-installed)

I tried installing torchsparse using:

pip install git+https://github.com/mit-han-lab/torchsparse.git

This failed with the following errors:

<command-line>: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.

Digging into it further, I realized:

  • torchsparse officially supports only Python ≤ 3.10, PyTorch ≤ 2.0.x, and CUDA 11.x
  • My setup had newer versions across the board, leading to a chain of problems
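In hindsight, these constraints can be expressed as a small pre-flight check, run before attempting the build. A sketch — the bound tuples and helper names are my own, with version ranges taken from the compatibility notes above (re-check against the torchsparse README):

```python
import sys

# Version bounds from the compatibility notes above -- illustrative,
# not authoritative; re-check against the torchsparse README.
MAX_PYTHON = (3, 10)
TORCH_LOW, TORCH_HIGH = (1, 13, 1), (2, 0)   # 1.13.1 .. 2.0.x
CUDA_LOW, CUDA_HIGH = (11, 1), (11, 8)       # 11.1 .. 11.8

def parse_version(s):
    """Turn '2.4.0' or '1.13.1+cu117' into a tuple of ints."""
    return tuple(int(p) for p in s.split("+")[0].split(".") if p.isdigit())

def in_range(version, low, high):
    """True if version is in [low, high.x], inclusive on both ends."""
    v = parse_version(version)
    return low <= v and v[:len(high)] <= high

print(sys.version_info[:2] <= MAX_PYTHON)          # Python check
print(in_range("2.4.0", TORCH_LOW, TORCH_HIGH))    # False: PyTorch too new
print(in_range("12.1", CUDA_LOW, CUDA_HIGH))       # False: CUDA too new
```

Running this against my original environment flags all three library checks, which would have saved me the first few hours of compile errors.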

2️⃣ Major Issues Encountered

🔥 2.1 Compatibility Matrix Explosion

| Component | My Version | Required by TorchSparse |
|-----------|------------|-------------------------|
| Python    | 3.11       | ≤ 3.10                  |
| PyTorch   | 2.4.0      | 1.13.1 – 2.0.x          |
| CUDA      | 12.1       | 11.1 – 11.8             |
| GCC       | 13.3.0     | ≤ 11 (CUDA 11.8 limit)  |

So, out of the box, nothing aligned with what torchsparse expected.


🧱 2.2 Missing CUDA Headers

The compiler failed to locate:

  • cuda_runtime.h
  • cuda_fp8.hpp

Despite having nvcc in the path, it turned out that:

  • My system CUDA was incorrectly linked
  • Conda environment CUDA paths weren’t being picked up by the build system

I later found that /usr/local/cuda/bin/nvcc was actually a symbolic link pointing into my conda environment, which further confused everything.
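Resolving the symlink chain is what made this visible. A small stdlib-only sketch (the paths are just the ones from my machine):

```python
import os
import shutil

def resolve(path):
    """Follow any chain of symlinks to the real target; a path that is
    not a link (or does not exist) comes back essentially unchanged."""
    return os.path.realpath(path)

# Which nvcc is first on PATH, and where does it really live?
nvcc = shutil.which("nvcc")
print(f"{nvcc} -> {resolve(nvcc)}" if nvcc else "nvcc not found on PATH")

# The "system" location can silently point into a conda env:
print(resolve("/usr/local/cuda/bin/nvcc"))
```

If the second line prints something under your miniconda prefix, the "system" CUDA you think you are building against is not the one being used.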


⚠️ 2.3 CUDA + GCC Incompatibility

Even after manually setting CUDA_HOME and CPATH, I was still getting macro-related compile errors like:

missing binary operator before token "("
(__CUDA_ARCH_HAS_FEATURE__(SM100_ALL)) || ...

These errors came from cuda_fp8.hpp, and were related to the __CUDA_ARCH_HAS_FEATURE__ macro not being defined.

Turns out: GCC 13 is not supported by CUDA 11.8, which torchsparse depends on.
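nvcc enforces a maximum host-compiler version per toolkit release. The ceilings below are approximate values from NVIDIA's Linux installation guides (verify against the guide for your exact toolkit); the lookup itself is just my own sketch:

```python
# Approximate max host GCC major version per CUDA toolkit release,
# from NVIDIA's Linux installation guides -- verify for your exact
# toolkit version before relying on these numbers.
MAX_GCC_FOR_CUDA = {
    (11, 1): 10,
    (11, 8): 11,
    (12, 1): 12,
}

def host_compiler_ok(cuda_version, gcc_major):
    """True if nvcc for cuda_version accepts a host GCC of gcc_major."""
    limit = MAX_GCC_FOR_CUDA.get(cuda_version)
    if limit is None:
        raise KeyError(f"no entry for CUDA {cuda_version}")
    return gcc_major <= limit

# My case: GCC 13 against CUDA 11.8 -> rejected.
print(host_compiler_ok((11, 8), 13))  # False
```

This is why `--allow-unsupported-compiler` only papers over the problem: it silences the version gate but does nothing about headers like cuda_fp8.hpp that assume an older compiler.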


πŸ› οΈ 3️⃣ What I Tried (and What Didn't Work)

βœ”οΈ I tried to patch the setup:

  • Added --allow-unsupported-compiler to setup.py
  • Manually created cuda_patch.h with missing macro definitions:

    #ifndef __CUDA_ARCH_HAS_FEATURE__
    #define __CUDA_ARCH_HAS_FEATURE__(x) 0
    #endif

βœ”οΈ Installed GCC 11 using conda:

conda install -c conda-forge gcc_linux-64=11 gxx_linux-64=11

βœ”οΈ Set all CUDA environment variables:

export CUDA_HOME=/home/user/miniconda3/envs/scenescript
export CUDACXX=$CUDA_HOME/bin/nvcc
export CPATH=$CUDA_HOME/include:$CUDA_HOME/targets/x86_64-linux/include
export TORCH_CUDA_ARCH_LIST="7.5"
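One check I would recommend after exporting these: PyTorch's extension builder resolves its own `CUDA_HOME` when `torch.utils.cpp_extension` is imported, so you can confirm whether the build will actually see the path you set. A sketch that degrades gracefully if torch isn't installed:

```python
import os

def cuda_home_report():
    """Compare the CUDA_HOME env var with the root torch builds against."""
    report = {"env": os.environ.get("CUDA_HOME")}
    try:
        # CUDA_HOME here is a real torch attribute, resolved from the
        # environment / nvcc on PATH at import time.
        from torch.utils.cpp_extension import CUDA_HOME
        report["torch"] = CUDA_HOME
    except ImportError:
        report["torch"] = None  # torch not installed in this interpreter
    return report

print(cuda_home_report())
```

If the two values disagree, the exports never reached the compiler, which was exactly my symptom.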

βœ”οΈ Tried building directly from source:

pip install -v -e .

Still failed. I tried switching PyTorch versions, adding missing headers, and using legacy CUDA toolkits — nothing worked cleanly.


✅ 4️⃣ Final Working Solution: CPU-Only Mode

In the end, I gave up on GPU support and switched to CPU-only mode.

Here's how:

# Create compatible environment
conda create -n scenescript python=3.10 pytorch=2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia -y
conda activate scenescript

# Set CPU mode
export FORCE_CPU=1

# Install TorchSparse (CPU-only)
pip install git+https://github.com/mit-han-lab/torchsparse.git

This finally worked. It compiled the CPU backend only and skipped all CUDA files.
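To sanity-check the CPU-only build, I run a small smoke test. The `SparseTensor(coords=..., feats=...)` call matches the torchsparse API as I understand it for the 1.x/2.0 releases — double-check against the version you installed; the shapes are arbitrary:

```python
def smoke_test():
    """Build one SparseTensor on CPU; returns a short status string."""
    try:
        import torch
        from torchsparse import SparseTensor
    except ImportError as exc:
        return f"skipped: {exc}"
    # 32 points, 4 coord columns (x, y, z, batch index), 16 features each.
    coords = torch.randint(0, 8, (32, 4), dtype=torch.int32)
    feats = torch.randn(32, 16)
    x = SparseTensor(coords=coords, feats=feats)
    return f"ok: feats shape {tuple(x.feats.shape)}"

print(smoke_test())
```

If this prints an "ok" line without touching CUDA, the CPU backend compiled and imports correctly.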


🤕 5️⃣ Summary of Pain Points

| Area            | Problem                                                      |
|-----------------|--------------------------------------------------------------|
| Python version  | Too new                                                      |
| PyTorch version | Incompatible with TorchSparse                                |
| CUDA version    | Too new, unsupported by TorchSparse                          |
| GCC version     | Too new for CUDA 11.8                                        |
| Header files    | Missing CUDA headers, macro errors                           |
| Symbolic links  | /usr/local/cuda/bin/nvcc pointing to a conda binary          |
| Setup script    | Needed multiple modifications + patch headers                |
| Compilation     | Dozens of failed attempts, long logs                         |

💬 6️⃣ Question to the Community

I’m now running TorchSparse in CPU mode just to get things going, but obviously I’d prefer GPU acceleration.

Has anyone found:

  • A clean environment configuration that works out-of-the-box for GPU-enabled TorchSparse?
  • A Dockerfile or prebuilt conda env that sets this up correctly?
  • A way to use TorchSparse with newer CUDA (12.x) or PyTorch 2.1+?

Even partial success stories are welcome 🙏


📎 Notes & Tips for Others

  • Try to match PyTorch and CUDA versions exactly as expected by the library
  • Check GCC version compatibility if building from source
  • Avoid mixing system and conda-level CUDA paths
  • If all else fails, CPU mode does work for now

Thank you in advance to anyone who reads this or shares tips. I hope this post can also help others who are banging their head against this build setup like I was!
