Skip to content

Feature/scorep feedback #42

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jun 5, 2025
Merged

Feature/scorep feedback #42

merged 13 commits into from
Jun 5, 2025

Conversation

OutlyingWest
Copy link
Collaborator

Description of Changes

  1. Added and documented a logging system.
  2. Implemented animation for long-running processes in Score-P mode (both for data transfer to the subprocess and for displaying progress animation within the subprocess itself).
  3. Added an example of execution of the long running task - Large array processing with Score-P in examples/ExampleBasic.ipynb
  4. Changed the way subprocess output streams are read — moved this logic into a separate function read_scorep_process_pipe().
  5. Brought tests into a working state:
    • Created a new function to clean up garbage from standard output.
    • Added a context manager with self.subTest() to better identify which cell failed during execution.
    • Temporarily updated expected test outputs to account for a new line introduced for proper animation handling (clear_line string in read_scorep_process_pipe()).

…mprovements

1. Added and documented a logging system.
2. Implemented animation for long-running processes in Score-P mode (both for data transfer to the subprocess and for displaying progress animation within the subprocess itself).
3. Added an example of execution of the long running task – Large array processing with Score-P in `examples/ExampleBasic.ipynb`.
4. Changed the way subprocess output streams are read — moved this logic into a separate function `read_scorep_process_pipe()`.
5. Brought tests into a working state:
   - Created a new function to clean up garbage from standard output.
   - Added a context manager `with self.subTest()` to better identify which cell failed during execution.
   - Temporarily updated expected test outputs to account for a new line introduced for proper animation handling (`clear_line` string in `read_scorep_process_pipe()`).
@OutlyingWest OutlyingWest force-pushed the feature/scorep_feedback branch from 03754b2 to 1e56026 Compare May 7, 2025 11:27
Copy link
Collaborator

@elwer elwer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For logging, I have the following recommendations:

  • Check whether the logging levels are appropriate. For example, on line 923 of kernel.py, the message '{os.environ["SCOREP_EXPERIMENT_DIRECTORY"]=}' should not be logged as a warning.
  • Check the consistency of the logging messages. For example, we have variations of the same message:
    'Cell execution failed, cell persistence was not recorded', 'Failed to load cell's persistence to the notebook', 'Failed to pickle notebook's persistence'. These sound relatively similar, although there are differences. Perhaps you could make them more meaningful by adding the context of the persistence step (e.g. Jupyter parent -> Scorep child). You could also add the type of serialisation (memory/disk, serialiser, etc.).
    What could help here is an enum to define the potential error messages and, for the future, hints on how to resolve them.

Comment on lines 969 to 1016

def read_scorep_process_pipe(self, proc: subprocess.Popen[bytes], stdout_lock: threading.Lock) -> list:
"""
Reads and processes the output of a subprocess running with Score-P instrumentation.
Args:
proc (subprocess.Popen[bytes]): The subprocess whose output is being read.
stdout_lock (threading.Lock): Lock to avoid output overlapping

Returns:
list: A list of decoded strings containing "MCM_TS" timestamps.
"""
multicellmode_timestamps = []
sel = selectors.DefaultSelector()

sel.register(proc.stdout, selectors.EVENT_READ)
sel.register(proc.stderr, selectors.EVENT_READ)

line_width = 50
clear_line = "\r" + " " * line_width + "\r"

while True:
# Select between stdout and stderr
for key, val in sel.select():
line = key.fileobj.readline()
if not line:
sel.unregister(key.fileobj)
continue

decoded_line = line.decode(sys.getdefaultencoding(), errors='ignore')

if key.fileobj is proc.stderr:
with stdout_lock:
self.log.warning(f'{decoded_line.strip()}')
elif 'MCM_TS' in decoded_line:
multicellmode_timestamps.append(decoded_line)
else:
with stdout_lock:
sys.stdout.write(clear_line)
sys.stdout.flush()
self.cell_output(decoded_line)

# If both stdout and stderr empty -> out of loop
if not sel.get_map():
break

return multicellmode_timestamps


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a bit of documentation here

Copy link
Collaborator Author

@OutlyingWest OutlyingWest Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation extended in 26c9b81

@OutlyingWest
Copy link
Collaborator Author

OutlyingWest commented Jun 2, 2025

For logging, I have the following recommendations:

  • Check whether the logging levels are appropriate. For example, on line 923 of kernel.py, the message '{os.environ["SCOREP_EXPERIMENT_DIRECTORY"]=}' should not be logged as a warning.
  • Check the consistency of the logging messages. For example, we have variations of the same message:
    'Cell execution failed, cell persistence was not recorded', 'Failed to load cell's persistence to the notebook', 'Failed to pickle notebook's persistence'. These sound relatively similar, although there are differences. Perhaps you could make them more meaningful by adding the context of the persistence step (e.g. Jupyter parent -> Scorep child). You could also add the type of serialisation (memory/disk, serialiser, etc.).
    What could help here is an enum to define the potential error messages and, for the future, hints on how to resolve them.
  • Adjusted logging level in 913df54
  • Unified error message formats in commit 32d2811. Included enum for error types and improved message context as suggested.

@OutlyingWest OutlyingWest reopened this Jun 2, 2025
@OutlyingWest OutlyingWest requested a review from elwer June 2, 2025 13:04
@elwer elwer merged commit d350f21 into master Jun 5, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants