Hi everyone,
I recently tried to set up torchsparse according to the repo's instructions. I want to share my experience, in case it helps others, and also to ask the community if anyone has a better way to handle this.
The Goal
I wanted to build and use torchsparse with GPU acceleration, because the project heavily relies on point cloud processing and performance is critical. However, I faced a long and frustrating journey of dependency hell, compilation errors, and CUDA version mismatches.
1. Initial Setup (That Failed)
My initial environment looked like this:
- OS: Ubuntu 22.04
- Python: 3.11
- PyTorch: 2.4.0
- CUDA: 12.1 (system-installed)

I tried installing torchsparse using:

```shell
pip install git+https://github.com/mit-han-lab/torchsparse.git
```

This failed with the following errors:

```
<command-line>: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.
```

Digging into it further, I realized:
- torchsparse officially supports only Python ≤ 3.10, PyTorch ≤ 2.0.x, and CUDA 11.x
- My setup had newer versions across the board, leading to a chain of problems
2. Major Issues Encountered
2.1 Compatibility Matrix Explosion
| Component | My Version | Required by TorchSparse |
|---|---|---|
| Python | 3.11 | ≤ 3.10 |
| PyTorch | 2.4.0 | 1.13.1 – 2.0.x |
| CUDA | 12.1 | 11.1 – 11.8 |
| GCC | 13.3.0 | ≤ 11 (CUDA 11.8 only supports up to GCC 11) |
So, out of the box, nothing aligned with what torchsparse expected.
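To make the mismatch concrete, here is a small helper of my own (not part of torchsparse; the ranges are just my reading of the matrix above) that checks a version string against the supported ranges:

```python
# Hypothetical helper: compare installed versions against the support
# matrix above before attempting a build. The ranges encode my reading
# of the torchsparse README, not any official API.
SUPPORTED = {
    "python":  ((3, 8),  (3, 10)),   # <= 3.10
    "pytorch": ((1, 13), (2, 0)),    # 1.13.1 - 2.0.x
    "cuda":    ((11, 1), (11, 8)),   # 11.1 - 11.8
}

def is_supported(component: str, version: str) -> bool:
    """True if version's (major, minor) falls inside the supported range."""
    lo, hi = SUPPORTED[component]
    major, minor = (int(p) for p in version.split(".")[:2])
    return lo <= (major, minor) <= hi

# My environment, from the table above -- every row fails:
for component, version in [("python", "3.11"), ("pytorch", "2.4.0"), ("cuda", "12.1")]:
    status = "ok" if is_supported(component, version) else "UNSUPPORTED"
    print(f"{component} {version}: {status}")
```

Running something like this up front would have saved me the first few hours of failed compiles.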
2.2 Missing CUDA Headers
The compiler failed to locate:
- `cuda_runtime.h`
- `cuda_fp8.hpp`
Despite having nvcc in the path, it turned out that:
- My system CUDA was incorrectly linked
- Conda environment CUDA paths weren't being picked up by the build system
I later found that /usr/local/cuda/bin/nvcc was actually a symbolic link pointing into my conda environment, which further confused everything.
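The way I eventually caught this was by resolving the symlink chain. A minimal, self-contained sketch of the same diagnosis (the temp-dir layout is a stand-in for my real `/usr/local/cuda/bin/nvcc`):

```python
import os
import tempfile

# Demo of diagnosing a misleading nvcc symlink. The directory layout is a
# stand-in for /usr/local/cuda/bin/nvcc secretly pointing into a conda env.
tmp = tempfile.mkdtemp()
conda_nvcc = os.path.join(tmp, "miniconda3", "envs", "scenescript", "bin", "nvcc")
os.makedirs(os.path.dirname(conda_nvcc))
open(conda_nvcc, "w").close()

# Stand-in for the "system" nvcc that is really a symlink:
system_nvcc = os.path.join(tmp, "usr-local-cuda-nvcc")
os.symlink(conda_nvcc, system_nvcc)

# realpath follows the symlink chain to the binary that actually runs:
print(os.path.realpath(system_nvcc))  # ends in .../envs/scenescript/bin/nvcc
```

On a real system, `readlink -f /usr/local/cuda/bin/nvcc` (or `os.path.realpath` on that path) answers the same question.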
2.3 CUDA + GCC Incompatibility
Even after manually setting CUDA_HOME and CPATH, I was still getting macro-related compile errors like:

```
missing binary operator before token "("
(__CUDA_ARCH_HAS_FEATURE__(SM100_ALL)) || ...
```

These errors came from `cuda_fp8.hpp` and were related to the `__CUDA_ARCH_HAS_FEATURE__` macro not being defined.
It turns out GCC 13 is not supported by CUDA 11.8, which torchsparse depends on.
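The lesson generalizes: each CUDA toolkit pins a maximum supported host GCC. A tiny lookup sketch of my own, with values taken from my reading of NVIDIA's Linux installation guides (please double-check them for your exact toolkit release):

```python
# Maximum host GCC major version per CUDA toolkit, as I understand
# NVIDIA's Linux installation guides. These values are my notes, not an
# authoritative table -- verify against the guide for your release.
MAX_GCC = {
    "11.8": 11,
    "12.1": 12,
}

def gcc_ok(cuda_version: str, gcc_major: int) -> bool:
    """True if this host GCC major is within the toolkit's supported range."""
    return gcc_major <= MAX_GCC[cuda_version]

print(gcc_ok("11.8", 13))  # False: my GCC 13 vs. the CUDA 11.8 ceiling
print(gcc_ok("11.8", 11))  # True: why downgrading to GCC 11 was the right move
```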
3. What I Tried (and What Didn't Work)
I tried to patch the setup:

- Added `--allow-unsupported-compiler` to `setup.py`
- Manually created `cuda_patch.h` with missing macro definitions:

```c
#ifndef __CUDA_ARCH_HAS_FEATURE__
#define __CUDA_ARCH_HAS_FEATURE__(x) 0
#endif
```

I installed GCC 11 using conda:

```shell
conda install -c conda-forge gcc_linux-64=11 gxx_linux-64=11
```

I set all the CUDA environment variables:

```shell
export CUDA_HOME=/home/user/miniconda3/envs/scenescript
export CUDACXX=$CUDA_HOME/bin/nvcc
export CPATH=$CUDA_HOME/include:$CUDA_HOME/targets/x86_64-linux/include
export TORCH_CUDA_ARCH_LIST="7.5"
```

And I tried building directly from source:

```shell
pip install -v -e .
```

Still failed. I tried switching PyTorch versions, adding missing headers, using legacy CUDA... nothing worked cleanly.
4. Final Working Solution: CPU-Only Mode
In the end, I gave up on GPU support and switched to CPU-only mode.
Here's how:
```shell
# Create a compatible environment
conda create -n scenescript python=3.10 pytorch=2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia -y
conda activate scenescript

# Set CPU mode
export FORCE_CPU=1

# Install TorchSparse (CPU-only)
pip install git+https://github.com/mit-han-lab/torchsparse.git
```

This finally worked. It compiled only the CPU backend and skipped all the CUDA files.
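I didn't dig into exactly how the build script consumes FORCE_CPU, but conceptually an env flag like this gates which sources get compiled. A sketch of that pattern (file names and function are hypothetical, not torchsparse's actual `setup.py`):

```python
import os

# Sketch of the env-flag pattern (hypothetical -- not torchsparse's real
# setup.py): the build script reads FORCE_CPU and drops the CUDA sources.
def select_sources(force_cpu_env):
    """Return the extension sources to compile; file names are made up."""
    sources = ["backend/ops_cpu.cpp"]
    if force_cpu_env != "1":
        sources.append("backend/ops_cuda.cu")  # skipped in CPU-only mode
    return sources

print(select_sources(os.environ.get("FORCE_CPU")))
print(select_sources("1"))  # CPU-only: no .cu files reach the compiler
```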
5. Summary of Pain Points
| Area | Problem |
|---|---|
| Python version | Too new |
| PyTorch version | Incompatible with TorchSparse |
| CUDA version | Too new, unsupported by TorchSparse |
| GCC version | Too new for CUDA 11.8 |
| Header files | Missing CUDA headers, macro errors |
| Symbolic links | Confusion caused by /usr/local/cuda/bin/nvcc pointing to conda binary |
| Setup script | Needed multiple modifications + patch headers |
| Compilation | Dozens of failed attempts, long logs |
6. Question to the Community
I'm now running TorchSparse in CPU mode just to get things going, but obviously I'd prefer GPU acceleration.
Has anyone found:
- A clean environment configuration that works out-of-the-box for GPU-enabled TorchSparse?
- A Dockerfile or prebuilt conda env that sets this up correctly?
- A way to use TorchSparse with newer CUDA (12.x) or PyTorch 2.1+?
Even partial success stories are welcome!
Notes & Tips for Others
- Try to match PyTorch and CUDA versions exactly as expected by the library
- Check GCC version compatibility if building from source
- Avoid mixing system and conda-level CUDA paths
- If all else fails, CPU mode does work for now
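The tips above can be folded into a small preflight script (my own sketch; the thresholds follow the support matrix earlier in this post) that you run before ever invoking pip:

```python
import os
import shutil
import sys

def preflight():
    """Collect warnings before attempting a torchsparse source build.

    Thresholds follow the support matrix earlier in this post; adjust
    them if the upstream README changes. This is a personal checklist,
    not an official tool.
    """
    warnings = []
    if sys.version_info[:2] > (3, 10):
        warnings.append(
            f"Python {sys.version_info[0]}.{sys.version_info[1]} is newer than 3.10"
        )
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        warnings.append("nvcc not found on PATH")
    else:
        real = os.path.realpath(nvcc)
        if real != nvcc:
            warnings.append(f"nvcc resolves elsewhere: {real} (check for conda symlinks)")
    if "CUDA_HOME" not in os.environ:
        warnings.append("CUDA_HOME is not set")
    return warnings

for w in preflight():
    print("WARNING:", w)
```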
Thank you in advance to anyone who reads this or shares tips. I hope this post can also help others who are banging their head against this build setup like I was!