suiyoubi
Contributor

  • Replaced the FixedStrideExtractorStage with TransNetV2ClipExtractionStage in the video split clip example to enhance clip extraction capabilities using the TransNetV2 model.
  • Updated command-line arguments to include parameters specific to the TransNetV2 stage, such as thresholds and clip length settings.
  • Introduced a new models package and added the TransNetV2 model implementation for shot transition detection.
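As a rough illustration of the second bullet, TransNetV2-specific options could be exposed through `argparse` along the following lines. Only `--video-dir` and `--model-dir` appear in this PR's diff; the `--transnetv2-*` flag names and all default values below are hypothetical placeholders, not the ones used in the example script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser sketch; the --transnetv2-* flags and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Video split clip example (sketch)")
    # General arguments (these two flags appear in the PR diff).
    parser.add_argument("--video-dir", type=str, required=True, help="Path to input video directory")
    parser.add_argument("--model-dir", type=str, required=True, help="Path to model directory")
    # Hypothetical TransNetV2 arguments: names and defaults are illustrative only.
    parser.add_argument("--transnetv2-threshold", type=float, default=0.5,
                        help="Per-frame probability above which a frame counts as a transition")
    parser.add_argument("--transnetv2-min-length-s", type=float, default=2.0,
                        help="Discard clips shorter than this many seconds")
    parser.add_argument("--transnetv2-max-length-s", type=float, default=60.0,
                        help="Split clips longer than this many seconds")
    return parser
```

Because argparse converts dashes to underscores, the parsed values would be read as `args.transnetv2_threshold`, `args.video_dir`, and so on.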

Description

Usage

# Add snippet demonstrating usage

Checklist

  • I am familiar with the Contributing Guide.
  • New or Existing tests cover these changes.
  • The documentation is up to date with these changes.

suiyoubi added 25 commits July 9, 2025 07:40
…ding stages

- Introduced `video_split_clip_example.py` to demonstrate video splitting functionality.
- Added `ClipTranscodingStage` and `FixedStrideExtractorStage` for processing video clips.
- Implemented command-line arguments for configuring video processing parameters.
- Created utility functions for grouping iterables in `grouping.py`.
- Added unit tests for the new stages in `test_clip_transcoding_stage.py` and `test_fixed_stride_extractor_stage.py`.
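
For readers unfamiliar with fixed-stride extraction, here is a minimal sketch of the idea: walk the video timeline in equal steps and emit one clip span per step. The function name, parameters, and the handling of the trailing partial clip are assumptions for illustration, not the behavior of `FixedStrideExtractorStage` itself.

```python
def fixed_stride_spans(
    duration_s: float,
    clip_len_s: float,
    stride_s: float,
    min_clip_len_s: float = 0.0,
) -> list[tuple[float, float]]:
    """Return (start, end) spans in seconds, stepping through the video by stride_s."""
    if clip_len_s <= 0 or stride_s <= 0:
        raise ValueError("clip_len_s and stride_s must be positive")
    spans = []
    start = 0.0
    while start < duration_s:
        # Clamp the final span to the end of the video.
        end = min(start + clip_len_s, duration_s)
        # Drop spans that fall below the minimum length (typically the tail).
        if end - start >= min_clip_len_s:
            spans.append((start, end))
        start += stride_s
    return spans
```

With a 25-second video, 10-second clips, and a 10-second stride, this yields three spans, the last of which is a 5-second tail; raising `min_clip_len_s` above 5 drops that tail.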

Signed-off-by: Ao Tang <[email protected]>
Signed-off-by: Ao Tang <[email protected]>
…adStage to accept VideoTask. Enhance video reading capabilities with new tests for VideoReaderStage.

Signed-off-by: Ao Tang <[email protected]>
…_read_example to include verbose argument.

Signed-off-by: Ao Tang <[email protected]>
- Introduced a new test package for tasks with an initial test suite for the video tasks module, including tests for the Clip, ClipStats, Video, VideoMetadata, and VideoTask classes.
- Implemented various test cases to validate initialization, property calculations, metadata extraction, and size calculations.

This enhances the testing coverage for video-related functionalities in the ray-curator project.

Signed-off-by: Ao Tang <[email protected]>
- Expanded the test suite for the video tasks module by adding new test cases for the Clip, ClipStats, Video, VideoMetadata, and VideoTask classes.
- Improved coverage for various functionalities including initialization, property calculations, and metadata extraction.

This update strengthens the reliability of video-related features in the ray-curator project.

Signed-off-by: Ao Tang <[email protected]>
- Introduced `ClipWriterStage` for writing clips and metadata during video processing.
- Updated `video_split_clip_example.py` to include the new stage, allowing for clip writing functionality.
- Enhanced command-line argument parsing for output clip path.
- Added utility functions for managing storage paths and writing data in various formats.
- Implemented unit tests for `ClipWriterStage` to ensure functionality and reliability.
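
A minimal sketch of what writing a clip together with its metadata can look like on a local filesystem. The `clips/` and `metas/` layout, the `.mp4` and `.json` extensions, and the function name are assumptions for illustration; the actual `ClipWriterStage` and its storage utilities may differ.

```python
import json
from pathlib import Path


def write_clip(output_dir: str, clip_id: str, clip_bytes: bytes, metadata: dict) -> tuple[Path, Path]:
    """Write encoded clip bytes and a JSON metadata file under output_dir."""
    root = Path(output_dir)
    clips_dir = root / "clips"  # hypothetical layout
    metas_dir = root / "metas"  # hypothetical layout
    clips_dir.mkdir(parents=True, exist_ok=True)
    metas_dir.mkdir(parents=True, exist_ok=True)

    clip_path = clips_dir / f"{clip_id}.mp4"
    meta_path = metas_dir / f"{clip_id}.json"
    clip_path.write_bytes(clip_bytes)
    meta_path.write_text(json.dumps(metadata, indent=2))
    return clip_path, meta_path
```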

Signed-off-by: Ao Tang <[email protected]>
- Improved `ClipWriterStage` to support writing additional metadata during video processing.
- Updated related utility functions to accommodate new metadata fields.
- Refined unit tests to cover the new functionality and ensure reliability.

Signed-off-by: Ao Tang <[email protected]>
- Integrated `MotionVectorDecodeStage` and `MotionFilterStage` into the video splitting pipeline for enhanced motion analysis.
- Updated command-line arguments to configure motion filtering options, including GPU memory allocation and thresholds.
- Modified `Clip` class to include a type for decoded motion data.

Signed-off-by: Ao Tang <[email protected]>
- Integrated `ClipFrameExtractionStage` into the video splitting pipeline to support frame extraction at specified rates based on user-defined aesthetics and embeddings.
- Updated command-line argument handling to accommodate new frame extraction configurations.
- Added logic to determine target frame rates based on the presence of aesthetics and embeddings.
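
One way the frame-rate selection described above might look, as a sketch only: each enabled downstream consumer contributes the frame rate it needs, and duplicates are collapsed so frames are extracted once per distinct rate. The function name, flag names, and fps values are assumptions, not values taken from the example script.

```python
def target_frame_rates(
    generate_embeddings: bool,
    generate_aesthetics: bool,
    embedding_fps: float = 2.0,   # assumed value, for illustration
    aesthetics_fps: float = 1.0,  # assumed value, for illustration
) -> list[float]:
    """Collect the distinct frame rates needed by the enabled downstream stages."""
    rates = set()
    if generate_embeddings:
        rates.add(embedding_fps)
    if generate_aesthetics:
        rates.add(aesthetics_fps)
    # Sorted for deterministic ordering; empty when no consumer is enabled.
    return sorted(rates)
```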

Signed-off-by: Ao Tang <[email protected]>
- Introduced `VideoFrameExtractionStage` to enhance the video splitting pipeline with additional frame extraction capabilities.
- Updated the `create_video_splitting_pipeline` function to accommodate the new stage based on user-defined splitting algorithms.
- Added logic for handling the "transnetv2" splitting algorithm.

Signed-off-by: Ao Tang <[email protected]>
…ution timing

- Added command-line arguments for TransNetV2 parameters, including thresholds, minimum and maximum clip lengths, cropping size, and GPU memory allocation.
- Implemented execution time tracking in the video splitting pipeline to provide performance insights.
- Updated `VideoFrameExtractionStage` to conditionally import `PyNvcFrameExtractor` based on GPU availability.
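
The conditional import mentioned in the last bullet follows a common pattern: attempt to load a GPU-only dependency only when a GPU path is requested, and fall back gracefully otherwise. The helper below is a generic sketch of that pattern; the helper name and the module path in the usage comment are hypothetical, and only the class name `PyNvcFrameExtractor` comes from the commit message.

```python
import importlib
from typing import Any, Optional


def load_optional_class(module_name: str, class_name: str, *, use_gpu: bool) -> Optional[Any]:
    """Return the named class if the GPU path is requested and importable, else None."""
    if not use_gpu:
        return None
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The GPU-only dependency is not installed; the caller should fall back to CPU.
        return None
    return getattr(module, class_name, None)


# Hypothetical usage (the module path is an assumption):
# extractor_cls = load_optional_class("my_pkg.nvcodec_utils", "PyNvcFrameExtractor", use_gpu=True)
```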

Signed-off-by: Ao Tang <[email protected]>
- Replaced the `FixedStrideExtractorStage` with `TransNetV2ClipExtractionStage` in the video split clip example to enhance clip extraction capabilities using the TransNetV2 model.
- Updated command-line arguments to include parameters specific to the TransNetV2 stage, such as thresholds and clip length settings.
- Introduced a new models package and added the TransNetV2 model implementation for shot transition detection.
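
To make the role of the thresholds and clip-length settings concrete, here is a small sketch of how per-frame shot-transition probabilities can be turned into clip spans. It assumes the model emits one probability per frame; the function names and the splitting policy for over-long shots are illustrative assumptions, not the logic of `TransNetV2ClipExtractionStage`.

```python
def predictions_to_spans(probs: list[float], threshold: float = 0.5) -> list[tuple[int, int]]:
    """Group consecutive non-transition frames into (start, end) spans, end exclusive."""
    spans = []
    start = None
    for i, p in enumerate(probs):
        is_transition = p > threshold
        if not is_transition and start is None:
            start = i  # a new shot begins
        elif is_transition and start is not None:
            spans.append((start, i))  # the shot ends at the transition frame
            start = None
    if start is not None:
        spans.append((start, len(probs)))
    return spans


def filter_by_length(spans: list[tuple[int, int]], min_frames: int, max_frames: int) -> list[tuple[int, int]]:
    """Split spans longer than max_frames and drop pieces shorter than min_frames."""
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    out = []
    for start, end in spans:
        for s in range(start, end, max_frames):
            e = min(s + max_frames, end)
            if e - s >= min_frames:
                out.append((s, e))
    return out
```

A lower threshold marks more frames as transitions and therefore produces more, shorter shots; the length filter then decides which of them survive as clips.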

Signed-off-by: Ao Tang <[email protected]>

copy-pr-bot bot commented Jul 16, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

suiyoubi added 4 commits July 16, 2025 10:12
- Updated the video splitting pipeline to use `video_dir` instead of `video_folder` for input video paths, improving clarity in command-line arguments.
- Added `model_dir` argument to support model directory specification for the TransNetV2 model.
- Modified `TransNetV2ClipExtractionStage` to accept `model_dir` during initialization, ensuring proper model loading.
- Introduced unit tests for the TransNetV2 model classes to validate functionality and integration.

Signed-off-by: Ao Tang <[email protected]>
Signed-off-by: Ao Tang <[email protected]>
…tion' into aot/ray-video-transnet-clip-extraction

Signed-off-by: Ao Tang <[email protected]>
@github-actions github-actions bot added the Stale label Aug 6, 2025
Contributor

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Aug 14, 2025
@suiyoubi suiyoubi removed the Stale label Aug 14, 2025
@suiyoubi suiyoubi reopened this Aug 14, 2025
@@ -151,7 +156,8 @@ def main(args: argparse.Namespace) -> None:
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
     # General arguments
-    parser.add_argument("--video-folder", type=str, default="/home/aot/Videos")
+    parser.add_argument("--video-dir", type=str, required=True, help="Path to input video directory")
Contributor

Can this be an S3 path as well?

Contributor Author

Yes

Contributor

@abhinavg4 abhinavg4 left a comment

Looks good, except that I don't understand why we have the TransNetV2 code inside our repo.

@@ -151,7 +156,8 @@ def main(args: argparse.Namespace) -> None:
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
     # General arguments
-    parser.add_argument("--video-folder", type=str, default="/home/aot/Videos")
+    parser.add_argument("--video-dir", type=str, required=True, help="Path to input video directory")
+    parser.add_argument("--model-dir", type=str, required=True, help="Path to model directory")
Contributor

Why is this required? Wouldn't we download the model if they don't provide this? Or do we have a similar interface to Cosmos Curate, where we first download in a separate step?

Contributor Author

For now we require the model to be pre-downloaded so that all the directory mappings work. We can refactor this later to use setup_on_node.
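
Since the model must already be on disk, one option is to fail fast at argument-parsing time rather than deep inside the pipeline. The helper below is a hypothetical sketch of such a check, not code from this PR; it only verifies that the directory exists and is non-empty, because the expected weight file names are not stated here.

```python
from pathlib import Path


def require_model_dir(model_dir: str) -> Path:
    """Return model_dir as a Path, raising early if it is missing or empty."""
    path = Path(model_dir).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"Model directory does not exist: {path}")
    if not any(path.iterdir()):
        raise FileNotFoundError(f"Model directory is empty; download the model first: {path}")
    return path
```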

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
r"""Model for fast shot transition detection.
… the codebase. This includes the deletion of `VideoDownloadStage`, `VideoReaderDownloadStage`, `GenericClipWriterStage`, and associated test files, enhancing maintainability and clarity in the video processing pipeline.

Signed-off-by: Ao Tang <[email protected]>
…e codebase. Introduce `storage_utils.py` with utility functions for path handling, enhancing maintainability. This update improves clarity and reduces redundancy in the project structure.

Signed-off-by: Ao Tang <[email protected]>
…the project configuration and enhance maintainability.

Signed-off-by: Ao Tang <[email protected]>
- Introduced `VideoFrameExtractionStage` to enhance video frame extraction capabilities based on the specified `transnetv2` algorithm.
- Updated `create_video_splitting_pipeline` in `video_split_clip_example.py` to include the new stage, allowing for improved video processing.
- Refactored resource management in `VideoFrameExtractionStage` to dynamically allocate resources based on the decoder mode.
- Cleaned up imports in `video_frame_extraction.py` for better organization.

This update enhances the flexibility and performance of the video processing pipeline, enabling more sophisticated video analysis and manipulation capabilities.

Signed-off-by: Ao Tang <[email protected]>
- Improved the `video_frame_extraction.py` and `nvcodec_utils.py` files by adding missing imports and cleaning up the code structure for better readability.
- Updated the `test_video_frame_extraction.py` to enhance logging assertions and ensure consistent formatting in test cases.
- Refactored test methods to improve clarity and maintainability, including adjustments to mock logger calls and metadata handling.

These changes enhance the overall quality and maintainability of the video processing pipeline, ensuring better logging and testing practices.

Signed-off-by: Ao Tang <[email protected]>
…DIA-NeMo/Curator into aot/ray-video-transnet-clip-extraction

Signed-off-by: Ao Tang <[email protected]>
- Updated `video_split_clip_example.py` to replace `video_folder` with `video_dir` for improved clarity.
- Added new command-line arguments for transnetv2 clip extraction, including thresholds, minimum and maximum clip lengths, cropping options, and GPU memory allocation.
- Refactored `TransNetV2ClipExtractionStage` to streamline resource management by initializing resources in the `__post_init__` method.
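
The `__post_init__` pattern mentioned in the last bullet can be sketched as follows: the stage is a dataclass whose resource requirements are derived from its own fields once, right after construction. The `Resources` type, field names, and defaults here are stand-ins chosen for illustration and do not reflect the project's actual classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resources:
    """Stand-in resource spec (hypothetical; not the project's class)."""
    cpus: float = 1.0
    gpu_memory_gb: float = 0.0


@dataclass
class ClipExtractionStageSketch:
    """Illustrative stage whose resources are derived in __post_init__."""
    model_dir: str
    threshold: float = 0.4        # assumed default
    gpu_memory_gb: float = 10.0   # assumed default
    resources: Resources = field(init=False)

    def __post_init__(self) -> None:
        # Derive the resource request from the user-facing configuration.
        self.resources = Resources(cpus=1.0, gpu_memory_gb=self.gpu_memory_gb)
```

Deriving `resources` in `__post_init__` keeps the constructor signature limited to user-facing options while ensuring the resource request always matches them.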

These changes improve the configurability and usability of the video processing pipeline, enabling more precise control over transnetv2 clip extraction settings.

Signed-off-by: Ao Tang <[email protected]>
- Updated the `_TransNetV2` class in `transnetv2.py` to enhance the clarity of the `SDDCNN` initialization by removing unnecessary line breaks.
- Cleaned up the `TransNetV2ClipExtractionStage` class in `transnetv2_extraction.py` by standardizing spacing and formatting for better readability.
- Consolidated import statements in the test file `test_transnetv2_extraction.py` to streamline the code structure.

These changes improve the maintainability and readability of the codebase, facilitating easier future modifications and enhancements.

Signed-off-by: Ao Tang <[email protected]>
…e.py` to streamline the code and improve clarity. This change enhances maintainability by eliminating unnecessary abstractions in the model interface.

Signed-off-by: Ao Tang <[email protected]>
…to streamline the model interface and improve code clarity. This change enhances maintainability by eliminating unnecessary properties that are no longer used.

Signed-off-by: Ao Tang <[email protected]>
@suiyoubi
Contributor Author

/ok to test 9294748

…-video-transnet-clip-extraction

Signed-off-by: Ao Tang <[email protected]>
@suiyoubi suiyoubi changed the base branch from aot/ray-video-video-frame-extraction to ray-api August 20, 2025 20:33
@suiyoubi
Contributor Author

/ok to test 62eb1dc

@suiyoubi suiyoubi merged commit abaf08b into ray-api Aug 20, 2025
12 checks passed