Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add group consistency bug test #6021

Closed
wants to merge 1 commit into from

Conversation

link2xt
Copy link
Collaborator

@link2xt link2xt commented Oct 3, 2024

No description provided.

@link2xt link2xt added the bug Something is not working label Oct 3, 2024
@link2xt link2xt force-pushed the link2xt/group-consistency-bug branch from e837ced to d1595c0 Compare October 3, 2024 21:52
remove_contact_from_chat(alice, alice_chat_id, alice_fiona_contact_id).await?;
let _alice_sent_add_msg = alice.pop_sent_msg().await;

SystemTime::shift(Duration::from_secs(3600));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This time shift bypasses the fix introduced in #5376 to fix #5366

@link2xt
Copy link
Collaborator Author

link2xt commented Oct 4, 2024

The problem with the logic introduced in #5376 is that it assumes that if timestamp_sent of the message is newer than the last membership change in the chat, then the user has already received all membership changes and was aware of them when sending. In fact user may have sent the message while being offline and will only receive membership changes later when going offline, or not at all if they are not delivered.

I think instead of timestamp_sent we should look at the timestamp_sent of the message referenced by In-Reply-To. If it is older than the timestamp of the most recent member list change, then user has not received the message with the change yet and we should not allow them to change member list.

The logic is completely contained here, could be reordered to first try to find the parent and then use its timestamp instead of timestamp_sent of just received message:

let member_list_ts = match !is_partial_download && is_from_in_chat {
true => Some(chat_id.get_member_list_timestamp(context).await?),
false => None,
};
// When we remove a member locally, we shift `MemberListTimestamp` by `TIMESTAMP_SENT_TOLERANCE`
// into the future, so add some more tolerance here to allow remote membership changes as well.
let timestamp_sent_tolerance = constants::TIMESTAMP_SENT_TOLERANCE * 2;
let allow_member_list_changes = member_list_ts
.filter(|t| {
*t <= mime_parser
.timestamp_sent
.saturating_add(timestamp_sent_tolerance)
})
.is_some();
let sync_member_list = member_list_ts
.filter(|t| *t <= mime_parser.timestamp_sent)
.is_some();
// Whether to rebuild the member list from scratch.
let recreate_member_list = {
// Always recreate membership list if SELF has been added. The older versions of DC
// don't always set "In-Reply-To" to the latest message they sent, but to the latest
// delivered message (so it's a race), so we have this heuristic here.
self_added
|| match mime_parser.get_header(HeaderDef::InReplyTo) {
// If we don't know the referenced message, we missed some messages.
// Maybe they added/removed members, so we need to recreate our member list.
Some(reply_to) => rfc724_mid_exists_ex(context, reply_to, "download_state=0")
.await?
.filter(|(_, _, downloaded)| *downloaded)
.is_none(),
None => false,
}
} && (
// Don't allow the timestamp tolerance here for more reliable leaving of groups.
sync_member_list || {
info!(
context,
"Ignoring a try to recreate member list of {chat_id} by {from_id}.",
);
false
}
);

@iequidoo
Copy link
Collaborator

iequidoo commented Oct 4, 2024

I think instead of timestamp_sent we should look at the timestamp_sent of the message referenced by In-Reply-To. If it is older than the timestamp of the most recent member list change, then user has not received the message with the change yet and we should not allow them to change member list.

The problem is that the referenced message might be also sent being offline, or by another user not having recent messages in the chat.

EDIT: The solution from #5376 (comment) should fix the observed bug as well, but it requires adding a new column msgs.timestamp_members, that's why i still haven't tried to implement it.

@link2xt
Copy link
Collaborator Author

link2xt commented Oct 4, 2024

The problem is that the referenced message might be also sent being offline, or by another user not having recent messages in the chat.

But at least, assuming everyone's clock is in sync and message delivery is instant once user gets online, if you receive a message referencing parent message that has sent timestamp x, you can tell user has downloaded all messages that were delivered up to time x. If membership changed last time at time y < x, then user was up to date, otherwise you cannot trust them with membership changes.

EDIT: If this logic is sufficient to fix this particular test case without breaking other test cases we already have, then I'd rather implement it instead of adding new columns, and close the issue (well, PR) until we hit another problem in real chats.

@iequidoo
Copy link
Collaborator

iequidoo commented Oct 5, 2024

But at least, assuming everyone's clock is in sync and message delivery is instant once user gets online, if you receive a message referencing parent message that has sent timestamp x, you can tell user has downloaded all messages that were delivered up to time x.

It's not clear whether the user downloaded all messages if the user's message reference only their previous messages because currently we add to References OutPending and even OutFailed messages. I.e. if the user sends multiple messages from offline, we should look at the first message to tell that. But it would be a recursive algorithm which is better to avoid even if that requires adding new db columns.

@link2xt link2xt force-pushed the link2xt/group-consistency-bug branch from d1595c0 to 5176025 Compare December 14, 2024 01:55
@link2xt link2xt force-pushed the link2xt/group-consistency-bug branch from 5176025 to 7e693e8 Compare December 14, 2024 23:05
@iequidoo
Copy link
Collaborator

    let sync_member_list = member_list_ts                                                                                                                                                                            
        .filter(|t| *t <= mime_parser.timestamp_sent)
        .is_some();

This is the code because of which the new test fails (Fiona is "synced back"). We can change it to look not at mime_parser.timestamp_sent, but at the parent message timestamp_sent and also check that the parent message is from another member. But that also may not work properly because the parent message may be from another member also being offline for a long time. I'd say it's unclear what "offline" means, e.g. a temporary network split occurs, but finally all members receive each other's messages, how to merge group member lists then?

@link2xt
Copy link
Collaborator Author

link2xt commented Dec 15, 2024

If looking at the parent timestamp fixes the test without breaking other tests then it's good enough to close this PR/issue until we get another problem in a real group chat. I would also not add any conditions like "is it another member message", if the message referenced is OutPending or OutFailed then it does not exist on the receiver device (OutPending actually may because we may have sent it but got disconnected before getting SMTP server response) anyway and we cannot look at its timestamp.

@iequidoo
Copy link
Collaborator

I just want to say that looking at the parent message timestamp apparently doesn't fix the problem because Bob in this test can send two messages while being offline and the second message would have as a parent the first one (if Bob is offline, his first message would be OutPending, but anyway it would be referenced by his second message).

@link2xt link2xt self-assigned this Dec 23, 2024
@link2xt
Copy link
Collaborator Author

link2xt commented Dec 23, 2024

Assigning this to myself so nobody tries to fix it in ad-hoc way.

Plan for fixing group consistency is at https://github.com/chatmail/specs/tree/main/group-membership, I'll open an issue describing the algorithm in detail (with header names, how to handle non-DC and legacy DC messages etc.) at some point.

EDIT: the issue for the new algorithm is at #6401

@link2xt
Copy link
Collaborator Author

link2xt commented Jan 5, 2025

I moved the test into #6404 and plan to fix it there.

@link2xt link2xt closed this Jan 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is not working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants