Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor proxies #1952

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from
Draft

Refactor proxies #1952

wants to merge 15 commits into from

Conversation

dsblank
Copy link
Member

@dsblank dsblank commented Feb 10, 2025

This PR (targeting 6.1) is a refactor of the proxies to solve the following issues.

These new proxies:

  1. will never return None for an object lookup (eg, db.get_person_from_handle())
  2. will not re-create an object more than once (if object is in cache)
  3. work correctly when stacking proxy on proxy, in any order
  4. requires an initial load (loop through data) but is much faster for nested proxies

Speed results:

On a database of 44k people, here are the timing results of this PR (6.1) versus 6.0. Times are in seconds. Times are +/- a few seconds.

Note that for all but the last row, the times are very similar. The real benefit is with nested proxies, which is common for export and the narrative report. Here we see that this PR beats 6.0 by over 13x. That is the difference between waiting 7 minutes, versus 30 seconds.

Proxies time (version 6.0) time (version 6.1) Matches
referenced(db) 31 33 39,442
referenced(db, all) 41 44 39,569
private(db) 3 1 29,561
living(db) 19 13 34,760
private(living(db)) 16 30 26,001
living(private(db)) 21 20 26,001
filter(db) 5 2 20,424
private(filter(db)) 24 13 15,256
filter(private(db)) 6 10 15,256
referenced(living(private(db))) 417 31 29,463

@dsblank dsblank marked this pull request as ready for review February 11, 2025 17:04
@DavidMStraub
Copy link
Member

The speedups are great, but is there a way to make this work while still having proxy DBs inherit from DbReadBase? We should have some way to define an "AnyReadonlyDatabase" for functions to use that shouldn't care whether it's proxied or not.

If not a common base class, we would need a protocol I think.

@dsblank
Copy link
Member Author

dsblank commented Feb 12, 2025

but is there a way to make this work while still having proxy DBs inherit from DbReadBase?

So these are wrappers, not subclasses. The difference is that these don't need to re-implement methods that are NotImplemented, but just call the instance methods, whatever they are.

If not a common base class, we would need a protocol I think.

That sounds like a fine PR to propose.

@dsblank dsblank marked this pull request as draft February 13, 2025 14:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants