Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logging framework #131

Merged
merged 11 commits into from
Oct 21, 2024
Merged

Add logging framework #131

merged 11 commits into from
Oct 21, 2024

Conversation

snbianco
Copy link
Collaborator

@snbianco snbianco commented Oct 14, 2024

Adds a logging framework. A few things to note:

  1. The logger's default level is INFO, and most of the messages are printed to the DEBUG level.
  2. Previously, we've been using a verbose parameter to log intermediate information. I didn't want to change the API by removing that parameter, so I wrote a function to set the log level based on the value of verbose. I'm conflicted on whether to deprecate the verbose parameters or leave them in for the sake of user-friendliness. As it is now, the verbose parameter overrides any level that the user sets the log to. Still, I think more users would know how to use verbose rather than set the log level themselves. Any thoughts are appreciated!
  3. I restructured __init__.py to better separate logic.
  4. I made a quick change to test_asdf_cut.py since one of the tests was leaving behind an output file.

snbianco and others added 9 commits October 4, 2024 16:43
* progress on tesscut code

* send warnings, return array of filepaths

* More generalization, print messages

* comment fix

* style fixes

fix type annotation

* rename module, use same thread number for cube_cut

Use json.load()

documentation, set threads to 8 for cube_cut

* unit tests, don't allow cutout size of 0

* fix example in docstring

* Use mock in unit tests

* rename mock

* documentation, fix cache position

* remove thread pool, filter ffi footprint files in test data

* Add cachetools as requirement

* work-around for footprint files using TAP

* Remove mock from footprint tests

* Delete sample footprints

* remove cloud function, make crossmatch and ffi fetch functions public

* Remove CIRCLE from example inputs

* parametrize tests
@snbianco snbianco requested a review from havok2063 October 15, 2024 14:18
@snbianco snbianco marked this pull request as ready for review October 15, 2024 14:19
@snbianco snbianco requested a review from falkben October 15, 2024 14:19
havok2063
havok2063 previously approved these changes Oct 15, 2024
Copy link
Contributor

@havok2063 havok2063 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All your changes here seem reasonable to me. I'm not sure if there were any print statements when errors occurred, but if so, log.error might be helpful to capture them and differentiate from info messages. Also, are there any messages we'd like to display to the user even when verbosity is off? If so, we could reshuffle the levels from info/warning to debug/info.

I agree that verbose is a little more user-friendly than dealing with log levels. If you wanted to retain a verbose flag but remove the boilerplate use in all the functions, one option could be to set up an astrocut user configuration file, like an ini or yaml, and move the verbose flag there. This would allow _handle_verbose to be set once globally. The upside is it's a convenient place to have all user config in one place. A user could also set it and forget it. The downside is it can be a bit clunkier for users to change the verbosity on the fly when running a function. I'm not suggesting we need to make any changes here though, but it could be something to consider in the future, especially if we think the list of user configurable settings might grow.

@@ -216,3 +217,7 @@ def get_fits(cutout_hdus, center_coord=None, output_path=None):
return cutout_hdulist


def _handle_verbose(verbose: bool):
"""Set the log level according to the verbose parameter"""
level = logging.INFO if verbose else logging.WARNING
Copy link
Member

@falkben falkben Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rework the levels here.

verbose should turn on DEBUG level.

and the default should probably include INFO

so anything that was always printed to the screen should be INFO or above. Anything that was gated by verbose should be DEBUG.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in newest commit!

@falkben
Copy link
Member

falkben commented Oct 15, 2024

I agree that verbose is a little more user-friendly than dealing with log levels. If you wanted to retain a verbose flag but remove the boilerplate use in all the functions, one option could be to set up an astrocut user configuration file, like an ini or yaml, and move the verbose flag there. This would allow _handle_verbose to be set once globally. The upside is it's a convenient place to have all user config in one place. A user could also set it and forget it. The downside is it can be a bit clunkier for users to change the verbosity on the fly when running a function. I'm not suggesting we need to make any changes here though, but it could be something to consider in the future, especially if we think the list of user configurable settings might grow.

astropy apparently has a config system already. i am not familiar with it, but perhaps we could also use it (or maybe AstropyLogger already does?)

https://docs.astropy.org/en/stable/config/index.html

if verbose and not is_anon:
print(f'Attempting to access private S3 bucket: {s3_path.bucket}')
if not is_anon:
log.info(f'Attempting to access private S3 bucket: {s3_path.bucket}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of formatting the string and passing it into the logger, as you have here with the fstring, and below with the str.format() calls, it is best practice to let the logger do the formatting. this is for security (in some places we do call astrocut as a web service, so some of these inputs may be untrusted and we don't want them logged onto a server), but there's another reason as well. If the logging level is skipped (e.g. Debug) the logger can avoid doing any str formatting if it's passed in this way, whereas if you format the string before passing it into the logger, you are always doing that (small) bit of work.

so in this case, my suggestion would be:

Suggested change
log.info(f'Attempting to access private S3 bucket: {s3_path.bucket}')
log.info('Attempting to access private S3 bucket: %s', s3_path.bucket')

And should be applied to all the log formatting here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since AstropyLogger doesn't support lazy formatting, I configured a new logger for astrocut with a similar formatter.

@snbianco
Copy link
Collaborator Author

I created a new ticket to set up a user configuration file here: https://jira.stsci.edu/browse/ASB-29237

@snbianco snbianco requested a review from AlexReedy October 21, 2024 14:19
havok2063
havok2063 previously approved these changes Oct 21, 2024
Copy link
Contributor

@havok2063 havok2063 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@@ -844,8 +843,9 @@ def cube_cut(
if unsuccessful returns None.
"""

if verbose:
start_time = time()
start_time = time()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a huge deal, but we should be using time.monotonic() or time.perf_counter() here instead.

https://docs.python.org/3/library/time.html#time.monotonic

Comment on lines 905 to 906
log.debug("Maximum distance between approximate and true location: %s", max_dist)
log.debug("Error in approximate WCS (sigma): %f", sigma)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we format with %d or %f?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using %d for integers, %f for floats, and %s for other types. I added a precision to line 906 in the latest commit.

@@ -8,7 +8,7 @@

from time import time

from astropy import log
from astropy import log as astropy_log
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this still be using the astropy log?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think because the _hducut function is using the logs from Astropy to determine whether or not a FITS file has SIP keywords in the WCS. I didn't write this code originally, but as far as I can tell from the Astropy source code, there isn't really a way around this without rerouting the logs: https://github.com/astropy/astropy/blob/main/astropy/wcs/wcs.py#L1299

Copy link
Collaborator

@AlexReedy AlexReedy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

@snbianco snbianco merged commit 4c549a8 into main Oct 21, 2024
8 checks passed
@snbianco snbianco deleted the logging branch October 21, 2024 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants