[Data] Simplify and consolidate progress bar outputs #47692

Open
wants to merge 3 commits into
base: master

Contributor

@scottjlee scottjlee commented Sep 16, 2024

Why are these changes needed?

Currently, the progress bar output is verbose because each line packs in a lot of information. This PR:

  • Reduces the output clutter by using emojis to represent some concepts
  • Standardizes common text used in multiple progress bar outputs
  • Adds labels within each progress bar to clarify meaning

Progress bar before this PR:
Screenshot at Sep 16 13-00-17

Progress bar after this PR:
Screenshot at Sep 18 10-28-01

I will follow up with a docs PR once this change is merged, so that I don't need to repeatedly modify the docs.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Scott Lee <[email protected]>
Contributor

@raulchen raulchen left a comment

  • Can we print a message in the beginning explaining all the legends?
  • The resource usage section is still very lengthy. Do you also plan to simplify it?

@@ -259,16 +259,16 @@ def refresh_progress_bar(self, resource_manager: ResourceManager) -> None:
     def summary_str(self, resource_manager: ResourceManager) -> str:
         queued = self.num_queued() + self.op.internal_queue_size()
         active = self.op.num_active_tasks()
-        desc = f"- {self.op.name}: {active} active, {queued} queued"
+        desc = f"- {self.op.name}. Tasks: {active} 🟢, {queued} 🟡"
Contributor

"N queued" actually means N blocks in the input buffer, not the number of tasks.
(The previous wording was already a bit confusing.)

Contributor

Also, since we have a "Tasks: " section here, maybe we can move the actor info right after it instead of leaving it at the very end.

Contributor

maybe format it as "Tasks ..., Actors ..., Input blocks ..."

Contributor

Also, the blocked sign can be part of "Tasks".

Member

maybe format it as "Tasks ..., Actors ..., Input blocks ..."

+1 something like this seems reasonable to me
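
The "Tasks ..., Actors ..., Input blocks ..." grouping suggested in this thread could look roughly like the sketch below. It is only an illustration: `format_op_summary` and its parameters are hypothetical names, not part of Ray Data. In the PR, the corresponding values come from `self.op.num_active_tasks()` and the queue sizes used in `summary_str`.

```python
from typing import Optional


def format_op_summary(
    op_name: str,
    active_tasks: int,
    queued_blocks: int,
    num_actors: Optional[int] = None,
) -> str:
    """Build one progress bar line grouped as Tasks, Actors, Input blocks.

    Hypothetical helper for illustration only; not Ray Data's API.
    """
    parts = [f"Tasks: {active_tasks}"]
    # Actor info is only shown when the caller supplies an actor count.
    if num_actors is not None:
        parts.append(f"Actors: {num_actors}")
    # "Queued" counts blocks in the input buffer, not tasks.
    parts.append(f"Input blocks: {queued_blocks}")
    return f"- {op_name}. " + ", ".join(parts)


print(format_op_summary("Map(fn)", active_tasks=4, queued_blocks=12))
print(format_op_summary("MapBatches(Model)", 2, 7, num_actors=2))
```

Labeling the queued count as "Input blocks" keeps it from being read as a number of tasks, which was the source of confusion noted above.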

-    f"{limits.object_store_memory_str()} object_store_memory "
-    "(pending: "
+    f"{limits.object_store_memory_str()} object store "
+    "(: "
Contributor

Can we hide the pending section if all values are 0? (I forgot to mention this in the previous PR.)
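
One way to hide the pending section when everything is zero is sketched below. The helper `pending_suffix` and the shape of its input (a mapping from resource label to pending amount) are assumptions made for illustration; the actual resource-usage code in the PR builds the string from `limits` and may be structured differently.

```python
from typing import Mapping


def pending_suffix(pending: Mapping[str, float]) -> str:
    """Return a " (pending: ...)" suffix, or "" when nothing is pending.

    Hypothetical helper for illustration only; not Ray Data's API.
    """
    # Hide the whole section when every pending amount is zero.
    if all(amount == 0 for amount in pending.values()):
        return ""
    details = ", ".join(f"{amount:g} {label}" for label, amount in pending.items())
    return f" (pending: {details})"


print("4/8 CPU" + pending_suffix({"CPU": 0, "GPU": 0}))
print("4/8 CPU" + pending_suffix({"CPU": 2, "GPU": 0}))
```

The section is shown in full as soon as any value is nonzero, so a reader never has to guess whether a missing entry means zero.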

@@ -259,16 +259,16 @@ def refresh_progress_bar(self, resource_manager: ResourceManager) -> None:
     def summary_str(self, resource_manager: ResourceManager) -> str:
         queued = self.num_queued() + self.op.internal_queue_size()
         active = self.op.num_active_tasks()
-        desc = f"- {self.op.name}: {active} active, {queued} queued"
+        desc = f"- {self.op.name}. Tasks: {active} 🟢, {queued} 🟡"
Member

What do the green and yellow circles represent?

Contributor Author

green = active, yellow = queued

Contributor Author

Although after reworking the progress bar as suggested above, I have removed the green/yellow emoji.

@scottjlee scottjlee marked this pull request as ready for review September 18, 2024 17:30
@scottjlee
Contributor Author

scottjlee commented Sep 18, 2024

Here is the updated progress bar after addressing initial comments:
Screenshot at Sep 18 10-28-01

The information is better grouped now, but unfortunately it's not any less verbose. I couldn't come up with any intuitive emojis to represent the concepts: CPU, GPU, object store, tasks, actors, and input blocks. Any suggestions here?
@raulchen @bveeramani

These are the best that I/ChatGPT could come up with:

  • CPU: 🖥️
  • GPU: 🖼️
  • object store: 📦
  • tasks: 📋
  • actors: 🎭 or 👤
  • (Queued) input blocks: ⏳

@raulchen
Contributor

I couldn't come up with any intuitive emojis to represent the concepts: CPU, GPU, object store, tasks, actors, and input blocks. Any suggestions here?

Maybe just use the initial letters? Seems fine as long as we print a message explaining them.

@bveeramani
Member

Any suggestions here?

IMO we should err on the side of clarity over conciseness.

Maybe just use the initial letters? Seems fine as long as we print a message explaining them.

Something like this? C, G, O.S, T, A, I.B?

As a general design principle, I'd be cautious to rely on tooltips or extra descriptions to make something understandable.
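
For reference, the "initial letters plus an explanatory message" option under discussion could be sketched as follows. The abbreviations are the ones listed above; the `LEGEND` mapping, the `legend_message` helper, and the message wording are hypothetical and not part of Ray Data.

```python
from typing import Mapping

# Abbreviations as proposed in this thread; the mapping itself is illustrative.
LEGEND = {
    "C": "CPU",
    "G": "GPU",
    "O.S": "object store",
    "T": "tasks",
    "A": "actors",
    "I.B": "input blocks",
}


def legend_message(legend: Mapping[str, str]) -> str:
    """Render a one-line legend explaining each abbreviation."""
    entries = ", ".join(f"{abbr} = {meaning}" for abbr, meaning in legend.items())
    return f"Progress bar legend: {entries}"


print(legend_message(LEGEND))
```

The sketch also shows the cost of this option: the progress bar lines are only readable alongside the legend line.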

@scottjlee
Contributor Author

scottjlee commented Sep 18, 2024

Maybe just use the initial letters? Seems fine as long as we print a message explaining them.

Something like this? C, G, O.S, T, A, I.B?

Yes, that's what I was thinking as well.

As a general design principle, I'd be cautious to rely on tooltips or extra descriptions to make something understandable.

When I was discussing with @omatthew98 offline, our thought was that the progress bar should be as concise as possible, but intuitive enough that the user should only need to look at docs once to understand/remember how to use it. We could print a message linking to docs at the beginning of dataset execution once. Does that sound reasonable?
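
A minimal sketch of the "print a docs link once" idea, assuming a module-level flag is an acceptable way to track whether the hint was already shown. The function name, the flag, and the message text are hypothetical, and the URL is just the docs root from the PR checklist, standing in for whichever page ends up documenting the progress bar.

```python
import logging

logger = logging.getLogger(__name__)

# Placeholder link: the docs root from the PR checklist, not a specific page.
DOCS_URL = "https://docs.ray.io/en/master/"

# Tracks whether the hint has been emitted in this process (hypothetical flag).
_hint_shown = False


def maybe_log_progress_bar_hint() -> bool:
    """Log the docs hint the first time this is called; do nothing afterwards.

    Returns True if the hint was logged by this call, False otherwise.
    """
    global _hint_shown
    if _hint_shown:
        return False
    _hint_shown = True
    logger.info("See %s for how to read the progress bar.", DOCS_URL)
    return True
```

Calling `maybe_log_progress_bar_hint()` at the start of each dataset execution would then log the message only for the first execution in a process.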

@bveeramani
Member

When I was discussing with @omatthew98 offline, our thought was that the progress bar should be as concise as possible, but intuitive enough that the user should only need to look at docs once to understand/remember how to use it. We could print a message linking to docs at the beginning of dataset execution once. Does that sound reasonable?

My preference is still to err on the side of clarity. IMO we shouldn't make reading documentation a prerequisite to interpreting a user interface, even if you'd only need to read the documentation once. Also, I think single letters would be confusing (e.g., does "G" represent "GPU" or "Giga"?)
