-
-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add /refresh to update sources and some configurations #1179
base: main
Are you sure you want to change the base?
Add /refresh to update sources and some configurations #1179
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a good start, thanks! I will have to think of how to test this best - perhaps add a new shell script test-watch.sh
that is similar to existing test.sh
, except that it will:
- start martin with
--auto-bounds skip
(so that it starts faster) - wait for it to start
- get a catalog into catalog1.json
- use
psql
to create a new table and a function, and possibly make some other modifications like deleting something else - use
curl
toPOST
a refresh command to some Martin's endpoint that triggers the refresh - I guess this request should not return until refresh is done? Make sure to have some global "refresh lock" to avoid multiple simultaneous refresh requests? - get another catalog saving it to catalog2.txt
The existing bless/diff setup will check that the catalog results are as expected
Co-authored-by: Yuri Astrakhan <[email protected]>
Some ways to push this forward? |
I'm trying to keep going on this. |
There is a well known dashmap crate that essentially functions as a concurrent hashmap (i.e. replaces |
Weired. That shouldn't fail. CI / Build and test docker images
The source
|
8bbd55f
to
c0a1700
Compare
There must be a lot to clean up. |
/catalog
Hi @nyurik Any guidance or review on this PR? I think the first version has been finished |
Thx @sharkAndshark! I will need a bit of time to review, a bit swamped at the moment, but I don't want to miss anything important for such a critical PR! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First - awesome work! Some serious time and effort were done on this one. Thx!
Now to the "fun" part:
- We must not have multiple
write()
calls on multiple objects together like you have in the refresh because it is extremely slow - you get a write lock on one object, and try to get a write lock on another object -- while holding on to the first one. Getting a write lock means you have to wait for all the readers to finish. But you already hold on to the first write lock - so no other readers can use the first object while you are waiting. Really bad for performance, and could result in a deadlock in some cases. - It seems
Data<RwLock<T>>
pattern is not good - creates too many complexities with locking. - Instead, we should probably go with
Data<Arc<DashMap<String, Box<dyn Source>>>>
(and similar for sprites and fonts). This way you never need to deal with read/write locks (handled byDashMap
), each source is independent, and theDashMap
instance is always the same. - On refresh, we will need to do these steps:
- discover all new sources, e.g. new PG tables
- for each discovered source, insert it into dashmap under its name
- TBD: I don't know if it makes sense to compare old source with the new source, and only replace if they are different. There might be caching reasons not to replace if the source has not changed.
- TBD: We need to delete sources that no longer exist. Not easy -- we may need to add a
source group ID
to each source -- e.g.pg-1
andpg-2
for the first and second postgres connections. When refreshingpg-1
, we can iterate over all sources, and remove those that havepg-1
and were not auto-discovered.
- To keep things cleaner, we could create a new
TileSources
struct, and maybe implementDeref
on it.
pub struct TileSources(Arc<DashMap<String, Box<dyn Source>>>);
- we should not re-create cache on refresh. Ideally we should only purge objects related to the specific source if that source is replaced. At the moment, I don't think we ever cache combined tiles, so that's not an issue.
In light of all the above, I propose our next steps would be to break this PR into several parts - lets try to do just one simple type of "refreshable" first -- like fonts or sprites. It makes no sense to do all of them at once until we are certain how we can proceed.
P.S. I made a few minor cleanup commits directly to your branch, and merged with the latest main
.
Thx! Let's do it step by step. |
@sharkAndshark can i help with anything on this one - I know it is a bit disappointing that we can't merge it like this, but it will be a massive performance hit if we do it this way. |
Try to fix #288
/refresh
to endpoint(read and merge config and flush the cache)