Field Report: MuseTalk V1.5 working on RTX 5060 Ti (Blackwell sm_120) with Python 3.12 + mediapipe patch #409

@chefboyrdave21

Summary

Got MuseTalk V1.5 running end-to-end on an RTX 5060 Ti (Blackwell, sm_120, 16GB VRAM) with Python 3.12 on Ubuntu 24.04. Sharing findings for the community since Blackwell GPU + Python 3.12 is a common setup that currently doesn't work out of the box.

Two Issues & Solutions

1. PyTorch + Blackwell (sm_120)

MuseTalk recommends PyTorch 2.0.1+cu118, but Blackwell GPUs need cu128+ for native sm_120 kernel support.

PyTorch        CUDA   sm_120?
2.6.0+cu124    12.4   ❌ "no kernel image" errors
2.10.0+cu126   12.6   ❌ Same errors
2.10.0+cu128   12.8   ✅ Works!

pip install torch==2.10.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
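To confirm that an installed build actually ships sm_120 kernels, a quick check along these lines may help. This is a minimal sketch, not part of MuseTalk: `has_native_kernels` is a hypothetical helper name, and it only looks for an exact `sm_XY` entry, so it deliberately ignores any PTX (`compute_XY`) fallback a build might carry.

```python
def has_native_kernels(arch_list, capability):
    """Return True if the build's arch list has an exact sm_XY entry for this GPU.

    arch_list:  e.g. the result of torch.cuda.get_arch_list()
    capability: (major, minor), e.g. torch.cuda.get_device_capability(0)
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        if torch.cuda.is_available():
            archs = torch.cuda.get_arch_list()
            cap = torch.cuda.get_device_capability(0)
            print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
            print("compiled archs:", archs)
            print("native kernels for this GPU:", has_native_kernels(archs, cap))
        else:
            print("CUDA is not available to this torch build")
```

On a Blackwell card the capability is (12, 0), so the helper looks for `sm_120` in the list.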

2. mmpose/mmcv on Python 3.12

mmcv has no pre-built wheels for Python 3.12 on any CUDA index. Building from source fails on the pkg_resources import: pkg_resources comes from setuptools, which Python 3.12 virtual environments no longer install by default.

Workaround: Replace mmpose with mediapipe (Tasks API) + face_alignment in musetalk/utils/preprocessing.py:

  • pip install mediapipe face-alignment
  • Use mediapipe's 478-point face mesh instead of mmpose's wholebody model
  • Map mediapipe landmarks → MuseTalk's nose bridge indices (28-30):
    • face_lm[28] = pts_478[6] (nose bridge top)
    • face_lm[29] = pts_478[197] (nose bridge mid)
    • face_lm[30] = pts_478[195] (nose bridge lower)
  • Use fa.get_landmarks_from_image() instead of deprecated get_detections_for_batch()
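The index mapping above can be sketched as a small helper. This is an illustration, not the actual patch: `patch_nose_bridge` and `NOSE_BRIDGE_MAP` are hypothetical names. It assumes `face_lm` is a 68-point landmark list (such as one face from face_alignment) and that `pts_478` holds the mediapipe mesh points already converted to pixel coordinates.

```python
# MuseTalk landmark index -> mediapipe 478-point mesh index (from the list above)
NOSE_BRIDGE_MAP = {28: 6, 29: 197, 30: 195}


def patch_nose_bridge(face_lm, pts_478):
    """Return a copy of face_lm with indices 28-30 taken from the mediapipe mesh.

    face_lm: sequence of (x, y) points, assumed to have at least 31 entries
    pts_478: sequence of 478 (x, y) points in the same pixel coordinate system
    """
    if len(pts_478) != 478:
        raise ValueError(f"expected 478 mesh points, got {len(pts_478)}")
    if len(face_lm) <= max(NOSE_BRIDGE_MAP):
        raise ValueError("face_lm is too short to hold indices 28-30")

    # Copy so the caller's landmarks are left untouched.
    patched = [tuple(p) for p in face_lm]
    for dst, src in NOSE_BRIDGE_MAP.items():
        patched[dst] = tuple(pts_478[src])
    return patched
```

Mediapipe returns normalized coordinates, so they would need scaling by image width and height before being passed in as `pts_478`.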

Performance

  • 7 s of audio takes about 30 s of inference to produce the MP4 output
  • ~3.6GB VRAM for MuseTalk models
  • Reference image face detection cached after first call

Suggestion

Consider adding mediapipe as an alternative preprocessing backend for users on Python 3.12+ where mmpose isn't installable. Happy to contribute a PR if there's interest.
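One way an optional backend could be selected is by probing which package is importable. This is only a sketch of the idea: `pick_landmark_backend` is a hypothetical name, nothing like it exists in MuseTalk today, and the preference order shown is an assumption.

```python
import importlib.util


def pick_landmark_backend(candidates=("mmpose", "mediapipe")):
    """Return the first candidate package that is importable in this environment."""
    for name in candidates:
        # find_spec returns None when a top-level package is not installed
        if importlib.util.find_spec(name) is not None:
            return name
    raise RuntimeError("no landmark backend installed; tried: " + ", ".join(candidates))
```

Keeping mmpose first would leave existing setups unchanged, with mediapipe used only when mmpose is not installable.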

Setup: RTX 5060 Ti 16GB | Ubuntu 24.04 | Python 3.12.3 | PyTorch 2.10.0+cu128 | MuseTalk V1.5
