[FEATURES] Implement Self-Reporting Progress Feature to Enhance Fluid Data Operation Monitoring and Management #4192

Open
cheyang opened this issue Jul 3, 2024 · 0 comments
Labels: features

Background

In the Fluid project, data operations (DataProcess, DataLoad, DataMigrate) are core functionality, covering data processing, preheating, and migration respectively. To make these operations easier to monitor and manage, Fluid needs a self-reporting progress feature that exposes the work progress of each data operation in its status.

Objectives

  1. Design a general mechanism to present progress uniformly across Fluid's data operation CRDs. The mechanism should resemble Argo's Self-Reporting Progress: the task writes progress updates to a file inside the container, and Fluid's controller propagates them to the status in the CRD.
  2. Implement a proof-of-concept solution using DataProcess.

Feature Requirements

  1. Progress Reporting Mechanism:

    • Each data operation task should be capable of generating progress reports during execution.
    • The progress reports should follow an N/M format, where N is the completed amount of work and M is the total amount of work.
  2. Environment Variable Configuration:

    • Define an environment variable FLUID_PROGRESS_FILE, which specifies the location of the progress report file.
  3. Progress Report File:

    • The data operation task must periodically update the file named by FLUID_PROGRESS_FILE during execution to report its current progress.
  4. Executor Reading Mechanism:

    • The executor should periodically (e.g., every 3 seconds) read the file named by FLUID_PROGRESS_FILE to get the latest progress information.
  5. Progress Annotation:

    • When a task starts, its metadata should carry an initial progress annotation, such as fluid.io/data-progress: 0/100.
  6. Progress Update:

    • If the contents of the progress file change, the executor should update the task's annotation to reflect the latest progress.
  7. Progress Display:

    • The monitoring system should be able to read the task's annotations and display the real-time progress of each data operation task on the user interface.
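
The executor-side requirements above (parse an N/M report, poll the file, update the annotation only when the value changes) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Fluid's implementation: `parse_progress` and `ProgressWatcher` are hypothetical names, rejecting reports where N exceeds M or M is zero is an assumed validation rule, and the actual Kubernetes annotation update is left out.

```python
import re
from typing import Optional, Tuple

# Annotation key taken from the requirements above.
PROGRESS_ANNOTATION = "fluid.io/data-progress"

_PROGRESS_RE = re.compile(r"^(\d+)/(\d+)$")


def parse_progress(text: str) -> Optional[Tuple[int, int]]:
    """Parse an 'N/M' report; return None if it is malformed.

    Assumption (not stated in the issue): reports with M == 0 or N > M
    are treated as invalid and ignored.
    """
    match = _PROGRESS_RE.match(text.strip())
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    if total == 0 or done > total:
        return None
    return done, total


class ProgressWatcher:
    """Hypothetical executor-side helper that remembers the last valid report."""

    def __init__(self, path: str, initial: str = "0/100") -> None:
        self.path = path
        self.current = initial  # mirrors the initial annotation value

    def poll(self) -> Optional[str]:
        """Return the new annotation value if progress changed, else None."""
        try:
            with open(self.path, encoding="utf-8") as handle:
                text = handle.read()
        except FileNotFoundError:
            return None  # the task has not reported anything yet
        parsed = parse_progress(text)
        if parsed is None:
            return None  # ignore empty, partial, or malformed content
        value = f"{parsed[0]}/{parsed[1]}"
        if value == self.current:
            return None  # unchanged, so no annotation update is needed
        self.current = value
        return value
```

An executor loop would call `poll()` on its chosen interval (the issue suggests every 3 seconds) and, whenever it returns a value, write that value to the `fluid.io/data-progress` annotation. Keeping the last valid value means a malformed or half-written report never overwrites good progress.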

Example Code

apiVersion: data.fluid.io/v1alpha1
kind: DataProcess
metadata:
  name: train-flow-step1
spec:
  dataset:
    name: jfsdemo
    namespace: default
  processor:
    metadata:
      annotations:
        # Initial progress; updated as the task reports.
        fluid.io/data-progress: "0/100"
    script:
      image: nginx
      imageTag: latest
      command: ["bash"]
      script: |
        # Report 10/100, 20/100, ... 100/100, one step every 10 seconds.
        for i in $(seq 1 10); do
          sleep 10
          echo "$((i * 10))/100" > "$FLUID_PROGRESS_FILE"
        done
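
For tasks written in Python rather than shell, the same reporting step might look like the sketch below. `report_progress` is a hypothetical helper, not part of Fluid. Writing to a temporary file and renaming it over the target is an assumption added here so that a polling reader never sees a half-written value; it requires the directory containing the progress file to be writable.

```python
import os
import tempfile
from typing import Optional


def report_progress(done: int, total: int, path: Optional[str] = None) -> None:
    """Write 'done/total' to the progress file in one atomic step.

    If no path is given, the location is read from FLUID_PROGRESS_FILE.
    """
    target = path or os.environ["FLUID_PROGRESS_FILE"]
    directory = os.path.dirname(os.path.abspath(target))
    # Write to a temporary file in the same directory, then rename it over
    # the target so a reader never observes a partially written report.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".progress-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(f"{done}/{total}\n")
        os.replace(tmp_path, target)
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on failure
        raise
```

A task processing ten batches would call `report_progress(i, 10)` after finishing batch `i`, which mirrors the shell loop in the manifest.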

This example provides a basic framework. The work breaks down into the following tasks:

  • Design a self-reporting progress feature for Fluid and provide a design document following the Fluid Design Workflow.
  • Implement a proof-of-concept solution using DataProcess.
  • Add documentation and a demo to support the usage of self-reporting progress for DataProcess in Fluid.
cheyang added the features label on Jul 3, 2024