Make device interface generic #606

dvrogozh · 2025-03-28T23:27:28Z

Fixes: #605

Changes:

* Made the device interface device-agnostic by introducing class DeviceInterface, from which specific backends derive their device-specific implementations
* Implemented CudaDevice, derived from DeviceInterface
* Created a device interface registration mechanism (registerDeviceInterface)
* Created a device interface creation mechanism (createDeviceInterface)

These changes allow the CUDA-specific code in VideoDecoder.cpp and VideoDecoderOps.cpp to be replaced with device-agnostic code.
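A minimal sketch of how a registration and creation pair like this could fit together. The names `registerDeviceInterface`, `createDeviceInterface`, `DeviceInterface` and `CudaDevice` come from the description above; the string key, the `CreateDeviceInterfaceFn` alias, the `name()` method and the `registry()` helper are assumptions made for illustration, not the PR's actual signatures.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Simplified stand-in for the base class; name() is hypothetical.
class DeviceInterface {
 public:
  virtual ~DeviceInterface() = default;
  virtual std::string name() const = 0;
};

// Assumed factory type: builds a fresh interface on demand.
using CreateDeviceInterfaceFn =
    std::function<std::unique_ptr<DeviceInterface>()>;

// Function-local static so the map exists before any registration runs.
std::map<std::string, CreateDeviceInterfaceFn>& registry() {
  static std::map<std::string, CreateDeviceInterfaceFn> instance;
  return instance;
}

// Returns false if the device type was already registered.
bool registerDeviceInterface(
    const std::string& deviceType,
    CreateDeviceInterfaceFn createFn) {
  return registry().emplace(deviceType, std::move(createFn)).second;
}

// Looks up the factory for the device type and invokes it.
std::unique_ptr<DeviceInterface> createDeviceInterface(
    const std::string& deviceType) {
  auto it = registry().find(deviceType);
  if (it == registry().end()) {
    throw std::runtime_error("Unsupported device: " + deviceType);
  }
  return it->second();
}

class CudaDevice : public DeviceInterface {
 public:
  std::string name() const override { return "cuda"; }
};

// The backend registers itself during static initialization.
static const bool cudaRegistered = registerDeviceInterface(
    "cuda", [] { return std::make_unique<CudaDevice>(); });
```

With this shape, the decoder only ever calls `createDeviceInterface` and never names a concrete backend, so a new backend can be added without touching the decoder sources.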

CC: @scotts, @NicolasHug

dvrogozh · 2025-04-01T15:04:41Z

Caught by macOS CI:

In file included from /Users/ec2-user/runner/_work/torchcodec/torchcodec/pytorch/torchcodec/src/torchcodec/decoders/_core/VideoDecoder.cpp:16:
/Users/ec2-user/runner/_work/torchcodec/torchcodec/pytorch/torchcodec/src/torchcodec/decoders/_core/./../../../../src/torchcodec/decoders/_core/DeviceInterface.h:27:1: error: 'DeviceInterface' defined as a struct here but previously declared as a class; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
struct DeviceInterface {
^
/Users/ec2-user/runner/_work/torchcodec/torchcodec/pytorch/torchcodec/src/torchcodec/decoders/_core/./../../../../src/torchcodec/decoders/_core/VideoDecoder.h:20:1: note: did you mean struct here?
class DeviceInterface;
^~~~~
struct

Plus there is a conflict with recent commits in VideoDecoderOps.cpp to resolve. Will do today. @scotts : let me know if there are other changes you want me to make.
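For reference, a small sketch of what clang's `-Wmismatched-tags` wants: the forward declaration and the definition should use the same class-key. Either side can be changed; this example uses `struct` in both places. The `deviceIndex` member and the `queryDeviceIndex` function are hypothetical and exist only to make the snippet self-contained.

```cpp
#include <cassert>

// Forward declaration, as another header might carry it.
// It uses the same class-key (`struct`) as the definition below.
struct DeviceInterface;

// Pointers to an incomplete type are fine in declarations.
int queryDeviceIndex(const DeviceInterface* device);

// Definition with the matching class-key, so -Wmismatched-tags stays quiet.
struct DeviceInterface {
  int deviceIndex = 0;  // hypothetical member for illustration
};

int queryDeviceIndex(const DeviceInterface* device) {
  return device->deviceIndex;
}
```

Standard C++ treats the two keywords as interchangeable in declarations of the same type, which is why the compiler calls the mismatch valid; the warning exists because the Microsoft C++ ABI can encode the keyword in mangled names.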

src/torchcodec/decoders/_core/DeviceInterface.h

scotts · 2025-04-01T17:55:24Z

src/torchcodec/decoders/_core/VideoDecoder.h

@@ -138,7 +140,7 @@ class VideoDecoder {
    std::optional<int> height;
    std::optional<ColorConversionLibrary> colorConversionLibrary;
    // By default we use CPU for decoding for both C++ and python users.
-    torch::Device device = torch::kCPU;
+    std::shared_ptr<DeviceInterface> device;


The DeviceInterface object should be a required member of VideoDecoder. It should also be a unique_ptr, as the VideoDecoder should be the only owner. At VideoDecoder construction, it will be empty, and it will only take on a value after we've added a stream.

Some video clips contain multiple video streams (as well as multiple audio streams, subtitles, etc.). As of now, DeviceInterface is associated with a single stream, which potentially allows different streams to be processed on different accelerators. Did you consider supporting such a scenario in the future? (We could still handle that after moving DeviceInterface to VideoDecoder, by storing it in a vector or map indexed by stream. That is a design decision to make.)

The current C++ decoder was originally implemented to support decoding from multiple streams at once, which is why we have methods for adding a stream and tracking info per stream. But that functionality never actually worked correctly. Getting it to work correctly would have required a lot more changes and testing, and we only expose single-stream decoding in our public API. So we instead decided to make it explicit: the C++ decoder only works on one stream at a time. The new name, SingleStreamDecoder, reflects that.

We actually have a TODO to move some of the current fields of StreamInfo into SingleStreamDecoder itself for this reason. The device interface is another field that belongs in SingleStreamDecoder itself rather than in StreamInfo. In general, we're trying to keep the code as simple as possible, which means not keeping things around that we "might need" in the future. If we need it in the future, we'll implement it then. And if we want to implement multistream decoding, we may end up implementing an entirely new C++ class.
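The ownership model requested in this thread could look roughly like the sketch below: the decoder holds the interface in a `std::unique_ptr` that is empty at construction and only filled once a stream is added. `SingleStreamDecoderSketch`, `FakeDeviceInterface`, `hasDeviceInterface()` and the parameterless `addVideoStream()` are hypothetical simplifications, not the real class or its signatures.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

class DeviceInterface {
 public:
  virtual ~DeviceInterface() = default;
};

// Hypothetical backend used only for this illustration.
class FakeDeviceInterface : public DeviceInterface {};

class SingleStreamDecoderSketch {
 public:
  // The pointer is empty here: no stream has been added yet.
  SingleStreamDecoderSketch() = default;

  // Simplified: the real method's parameters are omitted.
  void addVideoStream() {
    deviceInterface_ = std::make_unique<FakeDeviceInterface>();
  }

  bool hasDeviceInterface() const { return deviceInterface_ != nullptr; }

 private:
  // Sole owner of the interface, hence unique_ptr rather than shared_ptr.
  std::unique_ptr<DeviceInterface> deviceInterface_;
};

// A unique_ptr member makes the decoder non-copyable, which matches
// the "only owner" intent.
static_assert(
    !std::is_copy_constructible<SingleStreamDecoderSketch>::value,
    "decoder should not be copyable");
```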

src/torchcodec/decoders/_core/DeviceInterface.cpp

src/torchcodec/decoders/_core/VideoDecoder.cpp

scotts

@dvrogozh, this is great progress! I'd also like to see all of the VideoDecoder::*OnCPU() functions moved into a CpuDevice class. A bunch of the initialize and release member functions will be no-ops, and that's fine. That way it's clear what is device-agnostic code and what is device-specific code. I think that will also clean up the dispatch logic inside of VideoDecoder.
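A rough sketch of the idea, assuming a much smaller interface than the real one. The method names `initializeContext`, `releaseContext` and `convertFrame`, the string-based frame type, and the `decodeOneFrame` helper are all hypothetical; the point is only that a CPU backend can implement the lifecycle hooks as no-ops so the caller needs no per-device branching.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced interface for illustration only.
class DeviceInterface {
 public:
  virtual ~DeviceInterface() = default;
  virtual void initializeContext() = 0;
  virtual void releaseContext() = 0;
  virtual std::string convertFrame(const std::string& rawFrame) = 0;
};

class CpuDevice : public DeviceInterface {
 public:
  // Nothing to set up or tear down on CPU: both hooks are no-ops.
  void initializeContext() override {}
  void releaseContext() override {}

  // Stand-in for the CPU-specific conversion work.
  std::string convertFrame(const std::string& rawFrame) override {
    return "cpu:" + rawFrame;
  }
};

// Device-agnostic caller: no `if (device is CPU)` dispatch is needed.
std::string decodeOneFrame(
    DeviceInterface& device,
    const std::string& rawFrame) {
  device.initializeContext();
  std::string output = device.convertFrame(rawFrame);
  device.releaseContext();
  return output;
}
```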

dvrogozh · 2025-04-01T20:44:05Z

I'd also like to see all of the VideoDecoder::*OnCPU() functions moved into a CpuDevice class.

@scotts : I was planning to do that in a follow-up PR. Do you want me to do it in this PR instead?

dvrogozh · 2025-04-02T16:48:33Z

Rebased on top of merged #609 and #611.

dvrogozh · 2025-04-03T18:16:18Z

@scotts : can you please review again?

src/torchcodec/_core/CudaDevice.cpp

src/torchcodec/_core/DeviceInterface.cpp

src/torchcodec/_core/SingleStreamDecoder.h

scotts · 2025-04-04T16:18:59Z

@dvrogozh, this is looking great! Let's make that one change to where the device lives, and also figure out why the C++ tests are failing.

scotts · 2025-04-04T16:25:04Z

Also: making a proper CPU device is fine for a follow-up PR, there's a bunch of changes already in this one, and it's self-consistent.

dvrogozh · 2025-04-04T17:10:06Z

Let's make that one change to where the device lives

@scotts : done

and also figure out why the C++ tests are failing.

/usr/include/c++/13/bits/unique_ptr.h:97:23: error: invalid application of ‘sizeof’ to incomplete type ‘facebook::torchcodec::DeviceInterface’

@scotts : C++ tests were failing in quite a cryptic manner; see the error above. I've fixed it by including the DeviceInterface.h header file in the test's .cpp file. Let me draw your attention to one thing here. There is a circular dependency at the moment: DeviceInterface.h needs some definitions from SingleStreamDecoder.h, namely SingleStreamDecoder::VideoStreamOptions and SingleStreamDecoder::FrameOutput, and SingleStreamDecoder.h needs DeviceInterface. Currently I resolve it by having DeviceInterface.h explicitly include SingleStreamDecoder.h, while the latter only has a forward declaration of class DeviceInterface. This apparently was not enough for the test: it wanted to see the full definition of DeviceInterface (I think due to implementation details of std::unique_ptr; I did not see this issue with std::shared_ptr, and I believe I tried the C++ test with it).

I think we can consider extracting SingleStreamDecoder::VideoStreamOptions and SingleStreamDecoder::FrameOutput into a separate header file (or moving them to one of the existing common files) to break the above circular dependency. I would leave that for a follow-up PR though.
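Some background on the `sizeof` error, with a sketch of one common alternative to including the full header in every consumer. The default deleter of `std::unique_ptr` needs a complete type wherever the owning class's destructor is instantiated, while `std::shared_ptr` captures its deleter when the pointer is created, which would explain the difference observed above. Declaring the destructor in the header and defining it where `DeviceInterface` is complete avoids the problem. This is an illustration under assumptions (`SingleStreamDecoderSketch`, `hasDevice()` and `deviceIndex` are hypothetical), not what the PR does.

```cpp
#include <cassert>
#include <memory>

// --- What a header could contain ---
class DeviceInterface;  // forward declaration only: incomplete type here

class SingleStreamDecoderSketch {
 public:
  SingleStreamDecoderSketch();
  // Declared here, defined later where DeviceInterface is complete.
  // An implicit (inline) destructor would instead be instantiated in
  // every translation unit that destroys the decoder.
  ~SingleStreamDecoderSketch();
  bool hasDevice() const;

 private:
  std::unique_ptr<DeviceInterface> deviceInterface_;
};

// --- What the matching .cpp could contain ---
class DeviceInterface {
 public:
  int deviceIndex = 0;  // hypothetical member for illustration
};

SingleStreamDecoderSketch::SingleStreamDecoderSketch()
    : deviceInterface_(std::make_unique<DeviceInterface>()) {}

// The type is complete here, so unique_ptr's deleter can be instantiated.
SingleStreamDecoderSketch::~SingleStreamDecoderSketch() = default;

bool SingleStreamDecoderSketch::hasDevice() const {
  return deviceInterface_ != nullptr;
}
```

Splitting the shared structs into their own header, as proposed above, removes the circular include itself; the out-of-line destructor only removes the need for consumers to see the full `DeviceInterface` definition.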

scotts · 2025-04-04T19:18:48Z

src/torchcodec/_core/SingleStreamDecoder.h

@@ -493,6 +492,7 @@ class SingleStreamDecoder {
  SeekMode seekMode_;
  ContainerMetadata containerMetadata_;
  UniqueDecodingAVFormatContext formatContext_;
+  std::unique_ptr<DeviceInterface> deviceInterface;


Class variable members end with an underscore (foo_).

Ah, I overlooked that when moving it out of the struct. Will fix in a follow-up PR.

scotts · 2025-04-04T19:25:26Z

Agreed on moving SingleStreamDecoder::VideoStreamOptions and SingleStreamDecoder::FrameOutput into separate, new headers in a follow-up PR. Let's put VideoStreamOptions in its own header, and then the entire family of FrameOutput, FrameBatchOutput and AudioFramesOutput into one new header.

This looks great, thanks for making TorchCodec more general for devices!

scotts · 2025-04-04T19:26:19Z

@dvrogozh, I'm merging as-is, you can address the class member name in the follow-up PRs.

dvrogozh · 2025-04-04T20:07:16Z

Agreed on moving SingleStreamDecoder::VideoStreamOptions <...>

Filed to track:

Move stream options and frame output structs to dedicated header(s) #618

Making a proper CPU device is fine for a follow-up PR

Filed to track:

Define CpuDevice class and move CPU specific methods (color conversion) to it #619

NicolasHug · 2025-04-07T09:47:15Z

src/torchcodec/_core/CudaDevice.h

+
+namespace facebook::torchcodec {
+
+class CudaDevice : public DeviceInterface {


The term "device" should be reserved for the concept of the PyTorch device object. Here, the CudaDevice class name is misleading because it doesn't refer to a device, it refers to a device interface. @dvrogozh , do you mind submitting a PR to rename CudaDevice to CudaDeviceInterface (both the file names and the class name)?

@NicolasHug : no problem, will do.

@NicolasHug : filed #626

Renaming as requested in: pytorch#606 (comment)

…ec][diff_train] Make device interface generic (pytorch#606)" for one test failure

Summary: This diff reverts D72722867

D72475332: [torchcodec][diff_train] Make device interface generic (pytorch#606) by generatedunixname499836121 causes the following test failure:

Tests affected:
- [cogwheel:cogwheel_fblearner_inferno_hello_world#main](https://www.internalfb.com/intern/test/844425115815788/)

Here's the Multisect link: https://www.internalfb.com/multisect/25810176

Here are the tasks that are relevant to this breakage:
T191383700: 100+ tests, 10+ build rules, some CI signals unhealthy for offline_inference

The backout may land if someone accepts it.

If this diff has been generated in error, you can Commandeer and Abandon it.

Reviewed By: scotts

Differential Revision: D72775487

dvrogozh mentioned this pull request Mar 28, 2025

[RFC] Enable Intel GPU support in torchcodec (pytorch xpu backend device) #559

Closed

NicolasHug mentioned this pull request Apr 1, 2025

Move torchcodec/decoders/_core into torchcodec/_core #609

Merged

dvrogozh force-pushed the device-interface branch from 58b55e3 to 89df863 Compare April 1, 2025 17:37