Add TransNetV2ClipExtractionStage to video splitting pipeline #809

suiyoubi · 2025-07-16T15:50:25Z

Replaced the FixedStrideExtractorStage with TransNetV2ClipExtractionStage in the video split clip example to enhance clip extraction capabilities using the TransNetV2 model.
Updated command-line arguments to include parameters specific to the TransNetV2 stage, such as thresholds and clip length settings.
Introduced a new models package and added the TransNetV2 model implementation for shot transition detection.
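
The thread does not show how the stage applies its thresholds and clip length settings. As a rough illustration only, here is a minimal sketch of one way per-frame transition probabilities could be turned into clip spans. The function name, parameter names, and defaults are all hypothetical and not taken from this PR.

```python
from typing import List, Optional, Tuple


def predictions_to_clips(
    transition_probs: List[float],
    threshold: float = 0.5,
    min_length_frames: int = 2,
    max_length_frames: int = 8,
) -> List[Tuple[int, int]]:
    """Turn per-frame transition probabilities into half-open (start, end) spans.

    Hypothetical helper: a frame with probability >= threshold is treated as a
    shot transition, and runs of non-transition frames between them become clips.
    """
    shots: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for idx, prob in enumerate(transition_probs):
        is_transition = prob >= threshold
        if not is_transition and start is None:
            start = idx  # a new shot begins here
        elif is_transition and start is not None:
            shots.append((start, idx))  # the shot ends just before the transition
            start = None
    if start is not None:
        shots.append((start, len(transition_probs)))

    clips: List[Tuple[int, int]] = []
    for shot_start, shot_end in shots:
        # Split over-long shots into chunks of at most max_length_frames,
        # and drop chunks shorter than min_length_frames.
        for chunk_start in range(shot_start, shot_end, max_length_frames):
            chunk_end = min(chunk_start + max_length_frames, shot_end)
            if chunk_end - chunk_start >= min_length_frames:
                clips.append((chunk_start, chunk_end))
    return clips
```

In this sketch, raising `threshold` merges more frames into a single shot, while the length bounds decide which spans survive as clips.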

Description

Usage

# Add snippet demonstrating usage

Checklist

I am familiar with the Contributing Guide.
New or Existing tests cover these changes.
The documentation is up to date with these changes.

…ding stages
- Introduced `video_split_clip_example.py` to demonstrate video splitting functionality.
- Added `ClipTranscodingStage` and `FixedStrideExtractorStage` for processing video clips.
- Implemented command-line arguments for configuring video processing parameters.
- Created utility functions for grouping iterables in `grouping.py`.
- Added unit tests for the new stages in `test_clip_transcoding_stage.py` and `test_fixed_stride_extractor_stage.py`.
Signed-off-by: Ao Tang <[email protected]>

- Introduced a new test package for tasks with an initial test suite for the video tasks module, including tests for the Clip, ClipStats, Video, VideoMetadata, and VideoTask classes.
- Implemented various test cases to validate initialization, property calculations, metadata extraction, and size calculations.
This enhances the testing coverage for video-related functionalities in the ray-curator project.
Signed-off-by: Ao Tang <[email protected]>

- Expanded the test suite for the video tasks module by adding new test cases for the Clip, ClipStats, Video, VideoMetadata, and VideoTask classes.
- Improved coverage for various functionalities including initialization, property calculations, and metadata extraction.
This update strengthens the reliability of video-related features in the ray-curator project.
Signed-off-by: Ao Tang <[email protected]>

- Introduced `ClipWriterStage` for writing clips and metadata during video processing.
- Updated `video_split_clip_example.py` to include the new stage, allowing for clip writing functionality.
- Enhanced command-line argument parsing for output clip path.
- Added utility functions for managing storage paths and writing data in various formats.
- Implemented unit tests for `ClipWriterStage` to ensure functionality and reliability.
Signed-off-by: Ao Tang <[email protected]>

- Improved `ClipWriterStage` to support writing additional metadata during video processing.
- Updated related utility functions to accommodate new metadata fields.
- Refined unit tests to cover the new functionality and ensure reliability.
Signed-off-by: Ao Tang <[email protected]>

- Integrated `MotionVectorDecodeStage` and `MotionFilterStage` into the video splitting pipeline for enhanced motion analysis.
- Updated command-line arguments to configure motion filtering options, including GPU memory allocation and thresholds.
- Modified `Clip` class to include a type for decoded motion data.
Signed-off-by: Ao Tang <[email protected]>

- Integrated `ClipFrameExtractionStage` into the video splitting pipeline to support frame extraction at specified rates based on user-defined aesthetics and embeddings.
- Updated command-line argument handling to accommodate new frame extraction configurations.
- Added logic to determine target frame rates based on the presence of aesthetics and embeddings.
Signed-off-by: Ao Tang <[email protected]>

- Introduced `VideoFrameExtractionStage` to enhance the video splitting pipeline with additional frame extraction capabilities.
- Updated the `create_video_splitting_pipeline` function to accommodate the new stage based on user-defined splitting algorithms.
- Added logic for handling the "transnetv2" splitting algorithm.
Signed-off-by: Ao Tang <[email protected]>

…ution timing
- Added command-line arguments for TransNetV2 parameters, including thresholds, minimum and maximum clip lengths, cropping size, and GPU memory allocation.
- Implemented execution time tracking in the video splitting pipeline to provide performance insights.
- Updated `VideoFrameExtractionStage` to conditionally import `PyNvcFrameExtractor` based on GPU availability.
Signed-off-by: Ao Tang <[email protected]>
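
The two patterns named in this commit message, a conditional import and execution time tracking, can be sketched generically as below. This is not the PR's code: the helper names are made up for illustration, and import success is used here only as a stand-in for whatever GPU availability check the stage actually performs.

```python
import importlib
import time
from typing import Any, Callable, Optional, Tuple


def optional_import(module_name: str) -> Optional[Any]:
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# "hypothetical_gpu_decoder" is a placeholder module name, not a real package.
gpu_backend = optional_import("hypothetical_gpu_decoder")
decoder_mode = "gpu" if gpu_backend is not None else "cpu"
```

A pipeline run could then be wrapped as `result, seconds = timed(run_pipeline, args)`, where `run_pipeline` is whatever entry point the example script exposes.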

- Replaced the `FixedStrideExtractorStage` with `TransNetV2ClipExtractionStage` in the video split clip example to enhance clip extraction capabilities using the TransNetV2 model.
- Updated command-line arguments to include parameters specific to the TransNetV2 stage, such as thresholds and clip length settings.
- Introduced a new models package and added the TransNetV2 model implementation for shot transition detection.
Signed-off-by: Ao Tang <[email protected]>

copy-pr-bot · 2025-07-16T15:50:28Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

- Updated the video splitting pipeline to use `video_dir` instead of `video_folder` for input video paths, improving clarity in command-line arguments.
- Added `model_dir` argument to support model directory specification for the TransNetV2 model.
- Modified `TransNetV2ClipExtractionStage` to accept `model_dir` during initialization, ensuring proper model loading.
- Introduced unit tests for the TransNetV2 model classes to validate functionality and integration.
Signed-off-by: Ao Tang <[email protected]>

github-actions · 2025-08-14T02:16:06Z

This PR was closed because it has been inactive for 7 days since being marked as stale.

abhinavg4 · 2025-08-20T02:20:12Z

ray-curator/ray_curator/examples/video/video_split_clip_example.py

@@ -151,7 +156,8 @@ def main(args: argparse.Namespace) -> None:
 if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # General arguments
-    parser.add_argument("--video-folder", type=str, default="/home/aot/Videos")
+    parser.add_argument("--video-dir", type=str, required=True, help="Path to input video directory")


Can this be an S3 path as well?

abhinavg4

Looks good, except that I don't understand why we have the TransNetV2 code inside our repo.

abhinavg4 · 2025-08-20T02:20:44Z

ray-curator/ray_curator/examples/video/video_split_clip_example.py

@@ -151,7 +156,8 @@ def main(args: argparse.Namespace) -> None:
 if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # General arguments
-    parser.add_argument("--video-folder", type=str, default="/home/aot/Videos")
+    parser.add_argument("--video-dir", type=str, required=True, help="Path to input video directory")
+    parser.add_argument("--model-dir", type=str, required=True, help="Path to model directory")


Why is this required? Wouldn't we download the model if they don't provide this? Or do we have a similar interface to cosmos curate, where we first download in a separate step?

For now we require the model to be pre-downloaded so that all the directory mappings work. We can refactor this later to use setup_on_node.
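
Since the model must currently be pre-downloaded, a script could fail fast when `--model-dir` does not point at an existing directory. The sketch below reuses the two flags from the diff above. The `parse_and_validate` helper and its directory check are assumptions for illustration, not something this PR adds.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # These two flags are the ones shown in the reviewed diff.
    parser.add_argument("--video-dir", type=str, required=True, help="Path to input video directory")
    parser.add_argument("--model-dir", type=str, required=True, help="Path to model directory")
    return parser


def parse_and_validate(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Hypothetical helper: parse arguments and check the model directory early."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Only --model-dir is checked locally; --video-dir is left alone because
    # the thread leaves open whether it may be a remote (e.g. S3) path.
    if not Path(args.model_dir).is_dir():
        parser.error(f"--model-dir is not an existing directory: {args.model_dir}")
    return args
```

`parser.error` prints the usage message and exits with status 2, so a missing model directory is reported before any pipeline stage starts.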

abhinavg4 · 2025-08-20T02:23:41Z

ray-curator/ray_curator/models/transnetv2.py

+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+r"""Model for fast shot transition detection.


Is there any reason we don't use this:

https://github.com/soCzech/TransNetV2/tree/master/inference-pytorch ?

… the codebase. This includes the deletion of `VideoDownloadStage`, `VideoReaderDownloadStage`, `GenericClipWriterStage`, and associated test files, enhancing maintainability and clarity in the video processing pipeline. Signed-off-by: Ao Tang <[email protected]>

…e codebase. Introduce `storage_utils.py` with utility functions for path handling, enhancing maintainability. This update improves clarity and reduces redundancy in the project structure. Signed-off-by: Ao Tang <[email protected]>

- Introduced `VideoFrameExtractionStage` to enhance video frame extraction capabilities based on the specified `transnetv2` algorithm.
- Updated `create_video_splitting_pipeline` in `video_split_clip_example.py` to include the new stage, allowing for improved video processing.
- Refactored resource management in `VideoFrameExtractionStage` to dynamically allocate resources based on the decoder mode.
- Cleaned up imports in `video_frame_extraction.py` for better organization.
This update enhances the flexibility and performance of the video processing pipeline, enabling more sophisticated video analysis and manipulation capabilities.
Signed-off-by: Ao Tang <[email protected]>

- Improved the `video_frame_extraction.py` and `nvcodec_utils.py` files by adding missing imports and cleaning up the code structure for better readability.
- Updated the `test_video_frame_extraction.py` to enhance logging assertions and ensure consistent formatting in test cases.
- Refactored test methods to improve clarity and maintainability, including adjustments to mock logger calls and metadata handling.
These changes enhance the overall quality and maintainability of the video processing pipeline, ensuring better logging and testing practices.
Signed-off-by: Ao Tang <[email protected]>

- Updated `video_split_clip_example.py` to replace `video_folder` with `video_dir` for improved clarity.
- Added new command-line arguments for transnetv2 clip extraction, including thresholds, minimum and maximum clip lengths, cropping options, and GPU memory allocation.
- Refactored `TransNetV2ClipExtractionStage` to streamline resource management by initializing resources in the `__post_init__` method.
These changes improve the configurability and usability of the video processing pipeline, enabling more precise control over transnetv2 clip extraction settings.
Signed-off-by: Ao Tang <[email protected]>

- Updated the `_TransNetV2` class in `transnetv2.py` to enhance the clarity of the `SDDCNN` initialization by removing unnecessary line breaks.
- Cleaned up the `TransNetV2ClipExtractionStage` class in `transnetv2_extraction.py` by standardizing spacing and formatting for better readability.
- Consolidated import statements in the test file `test_transnetv2_extraction.py` to streamline the code structure.
These changes improve the maintainability and readability of the codebase, facilitating easier future modifications and enhancements.
Signed-off-by: Ao Tang <[email protected]>

suiyoubi · 2025-08-20T20:31:15Z

/ok to test 9294748

suiyoubi · 2025-08-20T20:33:42Z

/ok to test 62eb1dc

suiyoubi added 25 commits July 9, 2025 07:40

Add video io reader

a0f3143

Add test

8261ccb

remove debug test

5108d2c

Signed-off-by: Ao Tang <[email protected]>

Add VideoReaderStage to video reading pipeline and update VideoDownlo…

0fde549

…adStage to accept VideoTask. Enhance video reading capabilities with new tests for VideoReaderStage. Signed-off-by: Ao Tang <[email protected]>

Update VideoDownloadStage to support verbose logging and modify video…

440992d

…_read_example to include verbose argument. Signed-off-by: Ao Tang <[email protected]>

Update outputs for VideoDownloadStage and VideoReaderStage to include…

6b69764

… additional metadata fields. Signed-off-by: Ao Tang <[email protected]>

Update CI workflow to include video dependencies for testing

4f85180

Signed-off-by: Ao Tang <[email protected]>

Merge remote-tracking branch 'origin/aot/ray-video-reader' into aot/r…

6452e7d

…ay-video-clip-extraction Signed-off-by: Ao Tang <[email protected]>

Add unit tests for grouping utilities

95c519a

Signed-off-by: Ao Tang <[email protected]>

Refactor video splitting pipeline to remove debug mode and enhance st…

fa8915b

…age integration Signed-off-by: Ao Tang <[email protected]>

Add video limit argument to video split clip example

602e27e

Signed-off-by: Ao Tang <[email protected]>

Add unit tests for video filtering stages

0cc8a8a

Signed-off-by: Ao Tang <[email protected]>

Enhance ClipFrameExtractionStage with improved target FPS handling an…

42df496

…d add unit tests Signed-off-by: Ao Tang <[email protected]>

Add unit tests for video frame extraction and nvcodec utilities

e7e5413

Signed-off-by: Ao Tang <[email protected]>

Remove TransNetV2ClipExtractionStage import from video split clip exa…

420b31e

…mple Signed-off-by: Ao Tang <[email protected]>

suiyoubi added 4 commits July 16, 2025 10:12

fix test

632bcc7

Signed-off-by: Ao Tang <[email protected]>

Merge remote-tracking branch 'origin/aot/ray-video-video-frame-extrac…

c7e4990

…tion' into aot/ray-video-transnet-clip-extraction Signed-off-by: Ao Tang <[email protected]>

Refactor motion filter batch size parameter in video processing

2d17101

Signed-off-by: Ao Tang <[email protected]>

github-actions bot added the Stale label Aug 6, 2025

github-actions bot closed this Aug 14, 2025

suiyoubi removed the Stale label Aug 14, 2025

suiyoubi reopened this Aug 14, 2025

abhinavg4 reviewed Aug 20, 2025

abhinavg4 approved these changes Aug 20, 2025

suiyoubi added 11 commits August 20, 2025 12:54

Merge branch 'ray-api' of github.com:NVIDIA-NeMo/Curator into aot/ray…

e1350ee

…-video-video-frame-extraction Signed-off-by: Ao Tang <[email protected]>

Remove unused video dependencies from pyproject.toml to streamline …

ad63332

…the project configuration and enhance maintainability. Signed-off-by: Ao Tang <[email protected]>

Merge branch 'aot/ray-video-video-frame-extraction' of github.com:NVI…

6b85021

…DIA-NeMo/Curator into aot/ray-video-transnet-clip-extraction Signed-off-by: Ao Tang <[email protected]>

Remove unused conda_env_name property from ModelInterface in `bas…

e8d2651

…e.py` to streamline the code and improve clarity. This change enhances maintainability by eliminating unnecessary abstractions in the model interface. Signed-off-by: Ao Tang <[email protected]>

Remove conda_env_name property and related tests from TransNetV2 …

9294748

…to streamline the model interface and improve code clarity. This change enhances maintainability by eliminating unnecessary properties that are no longer used. Signed-off-by: Ao Tang <[email protected]>

copy-pr-bot bot temporarily deployed to test August 20, 2025 20:31 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci August 20, 2025 20:31 Error

Merge branch 'ray-api' of github.com:NVIDIA-NeMo/Curator into aot/ray…

62eb1dc

…-video-transnet-clip-extraction Signed-off-by: Ao Tang <[email protected]>

suiyoubi changed the base branch from aot/ray-video-video-frame-extraction to ray-api August 20, 2025 20:33

copy-pr-bot bot temporarily deployed to test August 20, 2025 20:34 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci August 20, 2025 20:34 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci August 20, 2025 20:43 Inactive

suiyoubi merged commit abaf08b into ray-api Aug 20, 2025
12 checks passed
