Performance degradation with "lock-and-fetch" #441
Comments
Do you have an index on `execution_time`?
Yes, it has one. Below is the sample schema; I replaced the real table name with `scheduled_tasks`.
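For reference, a minimal sketch of the default db-scheduler PostgreSQL schema, including the recommended `execution_time` index (approximate; exact columns vary between db-scheduler versions, see the project docs for the authoritative DDL):

```sql
-- Default db-scheduler table (approximate sketch).
create table scheduled_tasks (
  task_name varchar(100) not null,
  task_instance varchar(100) not null,
  task_data bytea,
  execution_time timestamp with time zone not null,
  picked boolean not null,
  picked_by varchar(50),
  last_success timestamp with time zone,
  last_failure timestamp with time zone,
  consecutive_failures int,
  last_heartbeat timestamp with time zone,
  version bigint not null,
  primary key (task_name, task_instance)
);

-- Index backing the polling query's WHERE/ORDER BY on execution_time.
create index execution_time_idx on scheduled_tasks (execution_time);
```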
Even if it has to lock 130k rows, that should not take 1 hour; that is a bit suspicious. If it is not using the index, I suppose it may be because the statistics are a bit outdated. It thinks it will hit 1 row.
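If stale statistics are the suspect, refreshing them is cheap. A minimal sketch, assuming the default `scheduled_tasks` table name:

```sql
-- Recompute planner statistics so row estimates reflect the actual data volume.
ANALYZE scheduled_tasks;

-- Rough check of the planner's current notion of table size.
SELECT reltuples, relpages FROM pg_class WHERE relname = 'scheduled_tasks';
```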
Is that explain from when you had 100k rows in the table?
No, it is not from the problem scenario. I pasted the explain only to illustrate that the "LockRows" step comes before the "Limit", which I suspect is the root cause.
The query plan might look different when there is lots of data in the table. I would be hesitant to draw any conclusions based on an explain plan for an empty table. I don't have any immediate tips for you. You might need to run ANALYZE more often. Possibly check that the Java code (ExecutionHandler) is not running into problems, or that the thread-pool is too small.
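One way to check how often statistics are being refreshed, and whether autovacuum is keeping up (again assuming the default table name):

```sql
-- Last manual/automatic ANALYZE and current live/dead tuple counts.
SELECT last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'scheduled_tasks';
```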
Expected Behavior
The `lock-and-fetch` method can work with a large set of schedules.

Current Behavior
We see a big performance degradation when there is a task-scheduling storm, meaning ~100k tasks scheduled within 5 seconds. The scheduler's `lockAndFetch` method will run the SQL below. In our case, there are 5 instances running this SQL. Unfortunately, Postgres seems to always do "LockRows" before the "Limit" in the execution plan.
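Based on db-scheduler's documentation, the lock-and-fetch poll is roughly a `SELECT ... FOR UPDATE SKIP LOCKED` with a limit. A hypothetical sketch that reproduces the query shape and its plan under load (parameters inlined; names and values illustrative):

```sql
-- Simulate ~130k past-due, unpicked executions (test data only).
INSERT INTO scheduled_tasks (task_name, task_instance, execution_time, picked, version)
SELECT 'my-task', 'instance-' || g, now() - interval '1 hour', false, 1
FROM generate_series(1, 130000) AS g;

ANALYZE scheduled_tasks;

-- Roughly the shape of the lock-and-fetch polling query; LIMIT is illustrative.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM scheduled_tasks
WHERE picked = false AND execution_time <= now()
ORDER BY execution_time ASC
FOR UPDATE SKIP LOCKED
LIMIT 20;
```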
Looking at the plan, it is suspected that the `LockRows` step comes before the `Limit` operation. Does this mean that when the number of target rows is large, it will lock all rows before applying the `Limit`?

In our case, we have 130k past-due tasks in the DB, and the query above took 1 hour to complete. If this is a performance limitation of the `lock-and-fetch` case, maybe it is not caused by db-scheduler but by PostgreSQL's query-planning engine. Shall we document this, so people can use it with caution?

PS: when we switched back to the `fetch` polling policy, this problem went away.
For bug reports

Steps to Reproduce the bug
Context
Logs