Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal code changes to support type hinting of db methods #1934

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

stevenyoungs
Copy link
Contributor

These are the code changes split out from #1919

Each commit fixes one thing.
The first 7 commits are required for mypy to pass the type hinted code which will be introduced in a later PR #1919
The last 3 commits fix a spelling mistake and rename parameters for consistency and clarity. These could be moved to a separate PR if required.

@stevenyoungs stevenyoungs force-pushed the typing-db-code branch 2 times, most recently from 5eda9cd to 459df7e Compare February 6, 2025 21:52
@DavidMStraub
Copy link
Member

Some really nice examples in there of bugs that could happen if None is not handled properly.

@Nick-Hall
Copy link
Member

The get_<object>_from_handle methods should never return None in base databases. We should be annotating the return from "get_person_from_handle" as "Person" not "Person | None". Then changes in your commit "Correctly handle None return from get__from_handle" would no longer be required. Returning empty objects instead of None from the proxies would be a better solution.

@DavidMStraub
Copy link
Member

DavidMStraub commented Feb 7, 2025

The get_<object>_from_handle methods should never return None in base databases. We should be annotating the return from "get_person_from_handle" as "Person" not "Person | None". Then changes in your commit "Correctly handle None return from get__from_handle" would no longer be required. Returning empty objects instead of None from the proxies would be a better solution.

This is an excellent example how the type hinting can identify poor design choices, such as proxies returning None.

But I also don't think returning an empty object is good. Would it have a handle? Probably, because we used get_from_handle. Would it have a gramps ID? Certainly not. Now we need code to handle primary objects which, unexpectly, don't have Gramps IDs etc.

In my opinion, the cleanest solution is to raise HandleError. Then all DbReadBase descendants behave in the same way.

@stevenyoungs
Copy link
Contributor Author

stevenyoungs commented Feb 7, 2025

As it stands, for an empty PrimaryObject the handle is None and gramps_id is None

The comment in TableObject
# If source is None, the handle is assigned as an empty string.
does not match the current code
self.handle = None

Which should it be?

@Nick-Hall
Copy link
Member

Nick-Hall commented Feb 7, 2025

We have known for years that proxies returning None is a bad idea. That's why I don't want get_person_from_handle annotated as returning "Person | None" for example.

I introduced the HandleError back in 2015 as a first step to eliminating the use of None. I have no objection to raising a HandleError or perhaps another exception instead of returning None.

I'm only objecting to the first commit in this PR.

@stevenyoungs
Copy link
Contributor Author

stevenyoungs commented Feb 7, 2025

If it is ultimately considered an error then I'll revise the code to raise an exception of a new type. It that turns out to be too painful, we always have the option to backtrack and return an empty object instead. By using a new exception type, we can identify the locations that were modified.

Is that a reasonable approach?

@stevenyoungs
Copy link
Contributor Author

It could get interesting to start raising an exception instead of returning None here:

if person and not person.get_privacy():
return sanitize_person(self.db, person)
return None

@Nick-Hall
Copy link
Member

Is that a reasonable approach?

The best approach is for a proxy never to return a handle for an object that needs to be filtered out. However, an interim solution is also acceptable, as long as we are heading in the right direction.

Returning fully or partially redacted objects and raising exceptions are alternatives to returning None.

@Nick-Hall Nick-Hall added this to the v6.1 milestone Feb 7, 2025
@kulath
Copy link
Member

kulath commented Feb 8, 2025

In my opinion, the cleanest solution is to raise HandleError. Then all DbReadBase descendants behave in the same way.

Yes, excellent solution.

HandleError instead of any other exception is much better. You could identify each location that is modified with a FIXME to note that a change is needed so "a proxy never to return a handle for an object that needs to be filtered out".

@dsblank
Copy link
Member

dsblank commented Feb 8, 2025

It could get interesting to start raising an exception instead of returning None here:

Yes, that should not return None. I'll wait to fix that those... I'm just going to fix bugs in #1941

@stevenyoungs stevenyoungs force-pushed the typing-db-code branch 2 times, most recently from 7e71978 to 24c50a0 Compare February 8, 2025 16:22
@@ -866,7 +866,7 @@ class object of a primary object.
"event": self.db.has_event_handle,
"place": self.db.has_place_handle,
"source": self.db.has_source_handle,
"citation": self.db.get_raw_citation_data,
"citation": self.db.has_citation_handle,
Copy link
Contributor Author

@stevenyoungs stevenyoungs Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nick-Hall this one might be worth worth pushing to 6.0 as well. Let me know and I can make a separate PR

I think it is more of a performance problem, unnecessarily getting the citation DataDict, just to check if the handle is a valid citation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Something like this is obviously a mistake and just a single line change. I think that we can accept it as a bug even if it isn't actually causing a problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll split out later.

@stevenyoungs
Copy link
Contributor Author

The original first commit,

Correctly handle None return from <get_object>_from_handle
when type hints are added, mypy will detect an error without this change

which provoked much discussion is now dropped from this PR 🎆
Thanks to all commentors for the ideas and encouragement to keep pushing

I think this PR ready for review

@stevenyoungs stevenyoungs force-pushed the typing-db-code branch 2 times, most recently from 5294897 to 38883b7 Compare February 9, 2025 08:53
…check for handle existence. As a result it can get the object directly rather then convert the raw data into an object
@stevenyoungs
Copy link
Contributor Author

libmixin.py now has the following code:

        handle = str(handle)
        if has_obj_handle(handle):
            obj = get_obj_from_handle(handle)

I'm hypothesising that

  1. libmixin clients are not randomly making up handle values
    any valid handle value must have originated from the DB and therefore exist
  2. it is not necessary to call has_obj_handle.
    Either
    a. handle is None or an empty string
    => does not refer to an object
    b. the object referred to by handle will exist

The only way #1 can be wrong is through object deletion, after the handle has been obtained. In the context of this code, I do not believe that this happens.

Your thoughts welcome, but I'm not making this change in this PR!

@stevenyoungs stevenyoungs mentioned this pull request Feb 9, 2025
@dsblank
Copy link
Member

dsblank commented Feb 9, 2025

The only way #1 can be wrong is through object deletion, after the handle has been obtained. In the context of this code, I do not believe that this happens.

I hope that reference handles are removed, but I am not 100% sure. If there are left over handles, I think we should treat that as a bug, and that the database is corrupted.

In future, this allows the return type hint of <object> | None for consistency with other db implementations
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants