-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add progress bar on evaluate #1871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add progress bar on evaluate #1871
Conversation
| eval_span.set_attribute('averages', report.averages()) | ||
|
|
||
| if progress: # pragma: no branch | ||
| progress_bar.close() # pyright: ignore[reportPossiblyUnboundVariable] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the need for these pyright ignores by checking for progress_bar instead? You'll want to set it to None above if progress: up top, so it always has a value.
| limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() | ||
| if progress: # pragma: no branch | ||
| progress_bar = tqdm(total=total_cases, desc=f'Evaluating {name}') | ||
| lock = asyncio.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call this progress_lock
| return await _run_task_and_evaluators(task, case, report_case_name, self.evaluators) | ||
| result = await _run_task_and_evaluators(task, case, report_case_name, self.evaluators) | ||
| if progress: # pragma: no branch | ||
| async with lock: # pyright: ignore[reportPossiblyUnboundVariable, reportGeneralTypeIssues] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to see if we can get rid of these pyright ignores.
What do you think about creating a new update_progress_bar variable up above, and setting it to a function that does this lock and update when progress is enabled, or None if it's not, and only calling it when it's set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this could work!
To keep it simpler I followed your other suggestion: I set progress_bar and progress_lock only if progress is true, and None otherwise.
This way I can check directly if they're set with if progress_bar and progress_lock: removing the need for the ignores.
Let me know if you'd still prefer to have a function perform the update.
| from pydantic._internal import _typing_extra | ||
| from pydantic_core import to_json | ||
| from pydantic_core.core_schema import SerializationInfo, SerializerFunctionWrapHandler | ||
| from tqdm import tqdm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update this to use the Rich progress bar from https://rich.readthedocs.io/en/stable/progress.html instead? That should already be a dependency as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll work on this!
|
@davide-andreoli Are you still interested in helping here? |
|
Hi @Kludex, yes I'm still interested. |
|
Hi @Kludex, everything should have been resolved in the latest commit! Let me know if there's anything else that could be improved. |
|
There's already a lock inside the update, so in terms of PydanticAI those operations are atomic. |
Added progress bar on evaluate
Hi, following #1657 I've added a progress bar on the evaluate process.
Since
tqdmis already a dependency no dependency needs to be added.I went for manual updates inside the
_handle_casefunction for better compatibility with the rest of the code.No styling has been applied to the default tqdm bar, let me know if you'd prefer it to be styled in any way.