[Data] Simplify and consolidate progress bar outputs #47692

scottjlee · 2024-09-16T19:57:51Z

Why are these changes needed?

Currently, the progress bar is pretty verbose because it is very information dense. This PR:

Reduces the output clutter by using emojis to represent some concepts
Standardizes common text used in multiple progress bar outputs
Adds labels within each progress bar to clarify meaning

Progress bar before this PR:

Progress bar after this PR:

Will follow up with a docs PR once we merge this change, so that I don't need to continuously modify the docs.

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: Scott Lee <[email protected]>

raulchen

Can we print a message in the beginning explaining all the legends?
The resource usage section is still very lengthy, do you also plan to simplify it?

raulchen · 2024-09-16T20:31:41Z

python/ray/data/_internal/execution/streaming_executor_state.py

@@ -259,16 +259,16 @@ def refresh_progress_bar(self, resource_manager: ResourceManager) -> None:
    def summary_str(self, resource_manager: ResourceManager) -> str:
        queued = self.num_queued() + self.op.internal_queue_size()
        active = self.op.num_active_tasks()
-        desc = f"- {self.op.name}: {active} active, {queued} queued"
+        desc = f"- {self.op.name}. Tasks: {active} 🟢, {queued} 🟡"


"N queued" actually means N blocks in the input buffer, not number of tasks.
(the previous was already a bit confusing)

also since we have a "Tasks: " section here. I'm wondering maybe we can also move the actor info after this. Instead of at the very end.

maybe format it as "Tasks ..., Actors ..., Input blocks ..."

also, the blocked sign can be part of "tasks"

> maybe format it as "Tasks ..., Actors ..., Input blocks ..."

+1 something like this seems reasonable to me
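
The grouping suggested in this thread could look roughly like the sketch below. It is a standalone illustration, not the code in `streaming_executor_state.py`: the function name `format_summary`, all of its parameters, and the `[blocked]` marker are hypothetical stand-ins for whatever the operator state actually exposes.

```python
from typing import Optional


def format_summary(
    op_name: str,
    active_tasks: int,
    queued_blocks: int,
    num_actors: Optional[int] = None,
    blocked: bool = False,
) -> str:
    """Build a one-line summary grouped as Tasks, Actors, Input blocks.

    Every parameter here is a hypothetical input, not a real OpState field.
    """
    tasks = f"Tasks: {active_tasks}"
    if blocked:
        # Fold the "blocked" marker into the tasks group, as suggested above.
        tasks += " [blocked]"
    parts = [tasks]
    if num_actors is not None:
        # Only operators backed by actors report an actor count.
        parts.append(f"Actors: {num_actors}")
    # "Queued" counts blocks in the input buffer, so label it as such.
    parts.append(f"Input blocks: {queued_blocks}")
    return f"- {op_name}: " + ", ".join(parts)
```

Labeling the queued count as input blocks avoids implying that it counts tasks, which was the confusion raised in the first comment of this thread.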

raulchen · 2024-09-16T20:35:03Z

python/ray/data/_internal/execution/streaming_executor.py

-            f"{limits.object_store_memory_str()} object_store_memory "
-            "(pending: "
+            f"{limits.object_store_memory_str()} object store "
+            "(⏳: "


can we hide the pending section if all are 0? (forgot to mention in the previous PR)
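
One way to do this, sketched under assumptions: the helper `format_with_pending` and the shape of its `pending` argument are hypothetical and do not reflect how `streaming_executor.py` stores these values. The idea is simply to append the pending section only when at least one value is nonzero.

```python
from typing import Mapping


def format_with_pending(base: str, pending: Mapping[str, float]) -> str:
    """Append a "(pending: ...)" section only if some pending value is nonzero.

    `base` is the already-formatted usage string; `pending` maps a
    hypothetical label to its pending amount.
    """
    if not any(pending.values()):
        # Everything is zero (or there is nothing pending): omit the section.
        return base
    details = ", ".join(f"{name} {value}" for name, value in pending.items())
    return f"{base} (pending: {details})"
```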

bveeramani · 2024-09-18T08:25:53Z

python/ray/data/_internal/execution/streaming_executor_state.py

@@ -259,16 +259,16 @@ def refresh_progress_bar(self, resource_manager: ResourceManager) -> None:
    def summary_str(self, resource_manager: ResourceManager) -> str:
        queued = self.num_queued() + self.op.internal_queue_size()
        active = self.op.num_active_tasks()
-        desc = f"- {self.op.name}: {active} active, {queued} queued"
+        desc = f"- {self.op.name}. Tasks: {active} 🟢, {queued} 🟡"


What do the green and yellow circles represent?

green = active, yellow = queued

although after reworking the progress bar as suggested above, i have removed the green/yellow emoji

bveeramani · 2024-09-18T08:28:51Z

python/ray/data/_internal/execution/streaming_executor_state.py

@@ -259,16 +259,16 @@ def refresh_progress_bar(self, resource_manager: ResourceManager) -> None:
    def summary_str(self, resource_manager: ResourceManager) -> str:
        queued = self.num_queued() + self.op.internal_queue_size()
        active = self.op.num_active_tasks()
-        desc = f"- {self.op.name}: {active} active, {queued} queued"
+        desc = f"- {self.op.name}. Tasks: {active} 🟢, {queued} 🟡"


> maybe format it as "Tasks ..., Actors ..., Input blocks ..."

+1 something like this seems reasonable to me

Signed-off-by: Scott Lee <[email protected]>

scottjlee · 2024-09-18T17:31:44Z

Here is the updated progress bar after addressing initial comments:

The information is better grouped now, but unfortunately it's not any less verbose. I couldn't come up with any intuitive emojis to represent the concepts: CPU, GPU, object store, tasks, actors, and input blocks. Any suggestions here?
@raulchen @bveeramani

These are the best that I/ChatGPT could come up with:

CPU: 🖥️
GPU: 🖼️
object store: 📦
tasks: 📋
actors: 🎭 or 👤
(Queued) input blocks: ⏳

raulchen · 2024-09-18T17:42:44Z

> I couldn't come up with any intuitive emojis to represent the concepts: CPU, GPU, object store, tasks, actors, and input blocks. Any suggestions here?

Maybe just use the initial letters? Seems fine as long as we print a message explaining them.

bveeramani · 2024-09-18T18:47:01Z

> Any suggestions here?

IMO we should err on the side of clarity over conciseness.

> Maybe just use the initial letters? Seems fine as long as we print a message explaining them.

Something like this? C, G, O.S, T, A, I.B?

As a general design principle, I'd be cautious to rely on tooltips or extra descriptions to make something understandable.

scottjlee · 2024-09-18T19:02:39Z

> Maybe just use the initial letters? Seems fine as long as we print a message explaining them.

> Something like this? C, G, O.S, T, A, I.B?

Yes, that's what I was thinking as well.

> As a general design principle, I'd be cautious to rely on tooltips or extra descriptions to make something understandable.

When I was discussing with @omatthew98 offline, our thought was that the progress bar should be as concise as possible, but intuitive enough that the user should only need to look at docs once to understand/remember how to use it. We could print a message linking to docs at the beginning of dataset execution once. Does that sound reasonable?
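
A minimal sketch of the "print once" idea, assuming a module-level flag is an acceptable way to track whether the hint was already shown. The function name `maybe_log_progress_bar_hint` and the `docs_url` parameter are hypothetical; no actual docs URL is implied.

```python
import logging

logger = logging.getLogger(__name__)

# Module-level flag so the hint is emitted at most once per process.
_hint_shown = False


def maybe_log_progress_bar_hint(docs_url: str) -> bool:
    """Log a one-time pointer to the progress bar docs.

    Returns True if the hint was logged by this call, False if it had
    already been shown earlier.
    """
    global _hint_shown
    if _hint_shown:
        return False
    _hint_shown = True
    logger.info("See %s for how to read the progress bar.", docs_url)
    return True
```

Calling this at the start of each dataset execution would log the message only on the first call; later executions in the same process stay quiet.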

bveeramani · 2024-09-18T19:15:40Z

> When I was discussing with @omatthew98 offline, our thought was that the progress bar should be as concise as possible, but intuitive enough that the user should only need to look at docs once to understand/remember how to use it. We could print a message linking to docs at the beginning of dataset execution once. Does that sound reasonable?

My preference is still to err on the side of clarity. IMO we shouldn't make reading documentation a prerequisite to interpreting a user interface, even if you'd only need to read the documentation once. Also, I think single letters would be confusing (e.g., does "G" represent "GPU" or "Giga"?)

rework outputs

4db8321

Signed-off-by: Scott Lee <[email protected]>

scottjlee assigned raulchen and bveeramani Sep 16, 2024

raulchen reviewed Sep 16, 2024

bveeramani reviewed Sep 18, 2024

scottjlee added 2 commits September 18, 2024 10:18

Merge branch 'master' into 0916-progbar-emoji

bc918f3

Signed-off-by: Scott Lee <[email protected]>

comments

3e3b70d

Signed-off-by: Scott Lee <[email protected]>

scottjlee marked this pull request as ready for review September 18, 2024 17:30

scottjlee requested review from ericl, scv119, c21, amogkam, stephanie-wang and omatthew98 as code owners September 18, 2024 17:30
