-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MDEV-31636 Memory leak in Sys_var_gtid_binlog_state::do_check() #3523
Conversation
|
90eab64
to
98ebf41
Compare
sql/sys_vars.cc
Outdated
{ | ||
if (!var->save_result_exists) | ||
return; | ||
auto data= (gtid_binlog_state_data *)var->save_result.ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ptr
is allocated in do_check()
somewhat quite surprisingly 'cos it all looks to me this could be done inside
`var->update().
I'd consider that first. If turns true indeed, we naturally shall carefully check all existing class' do_check()
(to my browsing there's none but ours replication's).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, though it logically makes sense for the actual validation of the input value to be in do_check()
. Sys_var_gtid_slave_pos
has a slightly different, yet still consistent to check
semantics pattern, which in Sys_var_gtid_slave_pos::do_check()
validates the value and saves the string on thd
, and then uses that string in Sys_var_gtid_slave_pos::global_update()
in a load. Can we follow that pattern here? (Though @DaveGosselin-MariaDB I recall in our discussion you mentioned trying to save something on the thread, and it not working. Is this what you tried?).
Generally speaking though, I' say whichever pattern we pick, Sys_var_gtid_binlog_state
and Sys_var_gtid_slave_pos
should probably follow the same pattern in the source code for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Brandon, I wasn't saving something on the thread but rather marking variables such that an on_error
phase could be executed over all affected variables so that they may handle any cleanup in their own implementation-dependent way (see 90eab64 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnestere, a good reminder point Sys_var_gtid_slave_pos::do_check
!
Indeed both variables need be consistent wrt do_check()
actions.
They both allocate and are different in the style of allocation. Sys_var_gtid_slave_pos
's
if (!(var->save_result.string_value.str=
thd->strmake(res->ptr(), res->length())))
{
my_error(ER_OUT_OF_RESOURCES, MYF(0));
return true;
}
is comparable
with a block the current patch relocated.
I suggest we do the same. res
would be computed in Sys_var_gtid_slave_pos::global_update
as defined here res= var->value->val_str(&str).
@DaveGosselin-MariaDB, your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @bnestere and @andrelkin! Sys_var_gtid_slave_pos::do_check
leaks memory in the sense that the result of strmake
will live on the THD memroot indefinitely in the case of a failure like the one in the attached test. Consequently, it seems that making Sys_var_gtid_binlog_state
behave similarly won't fix the leak but will hide it instead. For the handle_*
replication threads, the memroot's lifetime extends beyond the current query, correct? If so, we could potentially have leaks build up for the lifetime of the process, not just for the query's lifetime.
I think we could either (1) deliver my changes from 90eab64 or (2) keep the current changes and modify Sys_var_gtid_slave_pos::do_check
to avoid leaking memory in the case of a failure by applying to it the same idea that we applied to Sys_var_gtid_binlog_state::do_check
in the current commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, great point, we probably shouldn't hide the leak either. I think the on_error()
approach offers a good amount of flexibility. Though before we make a decision, I'll throw a third option into the mix: 3) add a MEM_ROOT
to set_var
so we can allocate within the context of one SET
(and automatically free it in set_var::~set_var()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree (1) is more flexible. (2) is just simpler as so far it has been sufficient. I would pick (2) to hold on with (1) until any first incident that leaves no other, incl (1), choices is met.
To (3), I am reluctant. Indeed the THD::main_mem_root
that is used here gets reset at the end of the statement (incl a SET one with a failing sub-SET).
I would propose (4) to keep (2) fool-proof with consting do_check( ..., const set_var *var)
the 2nd arg. But I am not really proposing, as the signature change looks super-pervasive.
Also this measure of course would not block altering other objects (in replication uses case mysql_bin_log
or rpl_slave_state
could become victims of that). That's why I think such altering actions should be a part of global_update()
:s rather than a part of (1) 's on_error
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep (2) but without adding the const
then the change as it is should be fine to merge because of my reasoning from #3523 (comment) . If that's the case, then let's deliver this change as it is. We could create a ticket to address broader architectural changes as a separate effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I see I have an OK from Andrei in a separate comment on this PR, can I take that to mean that it's OK to push?). #3523 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amen!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Dave for a good piece. It all seems to provide a solution. However let's look at the issue at a different angle I am suggesting in a note.
Thank you Andrei 😄 I will investigate the other angle. |
98ebf41
to
257a8e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I am thinking though about Sys_var_gtid_slave_pos::do_check()
that @bnestere mentions.
Hi!
On Wed, 18 Sept 2024, 20:27 'andrelkin' via Michael Widenius, < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sql/sys_vars.cc
<#3523 (comment)>:
> @@ -2013,6 +2013,20 @@ Sys_gtid_strict_mode(
struct gtid_binlog_state_data { rpl_gtid *list; uint32 list_len; };
+void
+Sys_var_gtid_binlog_state::on_error(set_var *var)
+{
+ if (!var->save_result_exists)
+ return;
+ auto data= (gtid_binlog_state_data *)var->save_result.ptr;
Please so not use auto. Better to write explicit types to make it more
human readable.
Regards,
Monty
… Please
.
|
Move memory allocations performed during Sys_var_gtid_binlog_state::do_check to Sys_var_gtid_binlog_state::global_update where they will be freed before the latter method returns.
257a8e3
to
3eaa38e
Compare
Move memory allocations performed during Sys_var_gtid_binlog_state::do_check to Sys_var_gtid_binlog_state::global_update where they will be freed before the latter method returns.