Background
In the Fluid project, data operations (dataprocess, dataload, datamigrate) are core functionality, covering data processing, preheating, and migration. To make these operations easier to monitor and manage, a self-reporting progress feature needs to be implemented. The feature will surface the work progress of each data operation in its status.
Objectives
Design a general mechanism to uniformly present progress in Fluid's data operation CRDs. This mechanism should be similar to Argo's Self-Reporting Progress, where users specify the progress updates in a file within the container, and Fluid's controller updates the status in the CRD.
Implement a proof-of-concept solution using DataProcess.
Feature Requirements
Progress Reporting Mechanism:
Each data operation task should be capable of generating progress reports during execution.
The progress reports should follow an N/M format, where N is the completed amount of work and M is the total amount of work.
Environment Variable Configuration:
Define an environment variable FLUID_PROGRESS_FILE, which specifies the location of the progress report file.
Progress Report File:
The data operation task must periodically update the file named by FLUID_PROGRESS_FILE during execution to report its current progress.
Executor Reading Mechanism:
The executor should periodically (e.g., every 3 seconds) read the file named by FLUID_PROGRESS_FILE to get the latest progress information.
Progress Annotation:
When a task starts, its metadata should carry an initial progress annotation, such as fluid.io/data-progress: 0/100.
Progress Update:
If the content of the file named by FLUID_PROGRESS_FILE has changed, the executor should update the task's annotation to reflect the latest progress.
Progress Display:
The monitoring system should be able to read the task's annotations and display the real-time progress of each data operation task on the user interface.
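The read side of these requirements could look roughly like the sketch below, written in Python for illustration only. It parses an N/M report, reads the latest line of the progress file, and calls a callback whenever the value changes. The names parse_progress, read_progress, and watch_progress are hypothetical, the on_change callback stands in for whatever patches the annotation on the Kubernetes object, and rejecting reports where M is 0 or N exceeds M is an assumption rather than part of the proposal.

```python
import re
import time
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple

# Annotation key taken from the requirements above.
PROGRESS_ANNOTATION = "fluid.io/data-progress"

_PROGRESS_RE = re.compile(r"^\s*([0-9]+)\s*/\s*([0-9]+)\s*$")


def parse_progress(text: str) -> Optional[Tuple[int, int]]:
    """Parse 'N/M' into (N, M); return None if the text is malformed."""
    match = _PROGRESS_RE.match(text)
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    # Assumption: a zero total or N > M is treated as an invalid report.
    if total == 0 or done > total:
        return None
    return done, total


def read_progress(path: Path) -> Optional[Tuple[int, int]]:
    """Return the progress on the last non-empty line of the file, if any."""
    try:
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
    except FileNotFoundError:
        return None  # the task has not reported anything yet
    return parse_progress(lines[-1]) if lines else None


def watch_progress(
    path: Path,
    on_change: Callable[[Dict[str, str]], None],
    interval: float = 3.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll the file and call on_change with the annotation when it changes."""
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = read_progress(path)
        if current is not None and current != last:
            on_change({PROGRESS_ANNOTATION: f"{current[0]}/{current[1]}"})
            last = current
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

Calling on_change only when the value differs from the last one seen keeps the number of annotation updates proportional to actual progress changes rather than to the polling rate.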
Example Code
apiVersion: fluid.io/v1alpha1
kind: DataProcess
metadata:
  name: train-flow-step1
spec:
  dataset:
    name: jfsdemo
    namespace: default
  processor:
    metadata:
      annotations:
        fluid.io/data-progress: 0/100
    script:
      image: nginx
      imageTag: latest
      command: ["bash"]
      script: |
        for i in $(seq 1 10); do
          sleep 10
          echo "$(($i*10))/100" > $FLUID_PROGRESS_FILE
        done
This example provides a basic framework.
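For tasks that are not shell scripts, the write side might be sketched as follows, again in Python for illustration. The helper report_progress is hypothetical. It assumes the directory containing the progress file is writable, so that the report can be written to a temporary file and renamed over the target. That way a reader polling the file sees either the previous report or the new one, never a partial write.

```python
import os
import tempfile


def report_progress(done: int, total: int,
                    env_var: str = "FLUID_PROGRESS_FILE") -> bool:
    """Write 'done/total' to the file named by env_var.

    Returns False when the variable is unset, so the same task code can
    run outside an environment that collects progress.
    """
    path = os.environ.get(env_var)
    if not path:
        return False
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename it over the
    # target; the rename replaces the old report in a single step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".progress-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f"{done}/{total}\n")
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return True
```

A task would call, for example, report_progress(30, 100) after finishing each unit of work.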
Design a self-reporting progress feature for Fluid and provide a design document following the Fluid Design Workflow
Implement a proof-of-concept solution using DataProcess
Add documentation and a demo to support the usage of self-reporting progress for DataProcess in Fluid