-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stress test counts wrong when killing workers #642
Comments
Seems I ran into this problem before #456 and fixed it incorrectly. The reason I did not properly fix it then is I was making the following incorrect assumption :
However this is not true if the observer runs again and finds there is nothing to do. Mainly in the case where it failed previously, but did complete part of the 2nd phase of commit. In this case it may not recreate the notifications it created last time and those notifications will never be created until something tries to read the locked data (which may never happen). Since the solution in #456 was based on an incorrect premise I may be able to delete the notifications earlier as was done before the "fix" for #456 in addition to setting notifications in the lock phase. |
Thinking about this, if setting notifications during 1st phase of commit then would use start timestamp. Need to be careful that notifications are not deleted when they should not be. Notifications are currently deleted using the start timestamp of a transactions. When writing notification using startTs (instead of commitTs as is currently done) need to do something to avoid a situation like the following where notification created by TX3 is lost.
Another issue is that even if the notification was not deleted in the situation above, the current code would consider it already acknowledged because this based on ACK timestamp vs Notification timestamp. So TX2 would write an ACK with timestamp 14 which would supersede the notification from TX3. |
Was thinking of writing notifications while writing locks, however there is a problem with this. If a transaction writes a notification w/o writing all of its lock, then an observer triggered by the transaction is not guaranteed to see everything the transaction wrote. Can only write a notification after all locks are written. Beginning to think that strong notifications should be written in the same way that weak notifications are currently written. Weak notifications are written after all locks are written, but before finishing the commit on the primary column. In the case of failures the transactions will be rolled back and all notifications rewitten. For strong notifications, need to ensure observers fail when triggered by a notification written for a transaction that fails. |
fixes #642 write strong notifcations at same place as weak
I have been running the stress test and stopping and starting Fluo while the test was running. This causes YARN to kill alll of the workers and the Oracle. After the test completes, sometimes the counts are wrong. However doing a diagnostic scan of the entire table to diff levels of the tree, will cause the counts to become correct. This full scan causes some partially completed transactions to roll forward which causes notifications to be written. These notifications trigger observers which complete the computation.
The problem is that notifications are written in the 2nd phase of commit. I think the solution to this problem is to write notifications in the 1st phase of commit. This way if a transaction is partially successful with the 2nd phase observers will still trigger and roll forward.
The text was updated successfully, but these errors were encountered: