Performance degradation with "lock-and-fetch" #441

meggarr opened this issue Nov 10, 2023 · 6 comments

meggarr commented Nov 10, 2023

Prerequisites


  • I am running the latest version
  • I checked the documentation and found no answer
  • I checked to make sure that this issue has not already been filed

Expected Behavior

The lock-and-fetch polling strategy should work with a large set of scheduled tasks.

Current Behavior

We see a big performance degradation when there is a task-scheduling storm, i.e. around 100k tasks scheduled within 5 seconds. The scheduler's lockAndFetch method then runs the SQL below:

UPDATE xxx_scheduled_tasks st1 
  SET picked = true, picked_by = 'abc', last_heartbeat = '20000123T012345.678+0900', version = version + 1
WHERE (st1.task_name, st1.task_instance) IN (
        SELECT st2.task_name, st2.task_instance FROM xxx_scheduled_tasks st2 
        WHERE picked = false and execution_time <= '20000123T012345.678+0900' order by execution_time asc LIMIT 20 FOR UPDATE SKIP LOCKED)
RETURNING st1.* ;

In our case there are 5 instances running this SQL. Unfortunately, Postgres always seems to do the "LockRows" before the "Limit"; below is a sample SQL execution plan:

explain  
UPDATE xxx_scheduled_tasks st1 
  SET picked = true, picked_by = 'abc', last_heartbeat = '20000123T012345.678+0900', version = version + 1
WHERE (st1.task_name, st1.task_instance) IN (
        SELECT st2.task_name, st2.task_instance FROM xxx_scheduled_tasks st2 
        WHERE picked = false and execution_time <= '20000123T012345.678+0900' order by execution_time asc LIMIT 20 FOR UPDATE SKIP LOCKED)
RETURNING st1.* ;

                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Update on xxx_scheduled_tasks st1  (cost=248.34..254.38 rows=1 width=217)
   ->  Nested Loop  (cost=248.34..254.38 rows=1 width=217)
         ->  HashAggregate  (cost=144.08..144.09 rows=1 width=94)
               Group Key: "ANY_subquery".task_name, "ANY_subquery".task_instance
               ->  Subquery Scan on "ANY_subquery"  (cost=144.05..144.07 rows=1 width=94)
                     ->  Limit  (cost=144.05..144.06 rows=1 width=49)
                           ->  LockRows  (cost=144.05..144.06 rows=1 width=49)
                                 ->  Sort  (cost=144.05..144.05 rows=1 width=49)
                                       Sort Key: st2.execution_time
                                       ->  Seq Scan on xxx_scheduled_tasks st2  (cost=0.00..144.04 rows=1 width=49)
                                             Filter: ((NOT picked) AND (execution_time <= '2000-01-22 16:23:45.678+00'::timestamp with time zone))
         ->  Bitmap Heap Scan on xxx_scheduled_tasks st1  (cost=104.27..108.28 rows=1 width=117)
               Recheck Cond: ((task_name = "ANY_subquery".task_name) AND (task_instance = "ANY_subquery".task_instance))
               ->  Bitmap Index Scan on xxx_scheduled_tasks_pkey  (cost=0.00..104.27 rows=1 width=0)
                     Index Cond: ((task_name = "ANY_subquery".task_name) AND (task_instance = "ANY_subquery".task_instance))

Looking at the plan, the LockRows step runs before the Limit operation. Does this mean that when the number of target rows is large, it will lock all matching rows first, before applying the "Limit"?

In our case we have 130k tasks (past due) in the DB, and the query above took 1 hour to complete. If this is a performance limitation of the lock-and-fetch case, it may not be caused by DB Scheduler but by PostgreSQL's query planner. Shall we document this, so people can use it with caution?
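
To check how many rows the LockRows step actually touches, one option (a sketch only, using the same xxx_scheduled_tasks table and the placeholder timestamp from the query above) is to explain-analyze just the inner subquery inside a transaction that is then rolled back, so the row locks taken during EXPLAIN ANALYZE are released again:

BEGIN;

-- The "actual rows" figure on the LockRows node shows how many rows really got locked
EXPLAIN (ANALYZE, BUFFERS)
SELECT st2.task_name, st2.task_instance
FROM xxx_scheduled_tasks st2
WHERE picked = false AND execution_time <= '20000123T012345.678+0900'
ORDER BY execution_time ASC
LIMIT 20
FOR UPDATE SKIP LOCKED;

-- Release the locks acquired while the statement was executed
ROLLBACK;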

PS: when we switched back to the "fetch" polling strategy, this problem went away.


For bug reports


Steps to Reproduce the bug

  1. Run 5 instances of DB scheduler (with SpringBoot)
  2. Schedule 100k+ tasks within 2 or 3 seconds
  3. Observe the performance of DB Scheduler; we see the query may take 1 hour to respond.

Context

  • DB-Scheduler Version : 11.7
  • Java Version : Java 11
  • Spring Boot (check for Yes) : [x]
  • Database and Version : PG 13.10

Logs

@kagkarlsson (Owner) commented

Do you have an index on execution_time?


meggarr commented Nov 12, 2023

Yes, it does. Below is the sample schema:

                Table "public.xxx_scheduled_tasks"
        Column        |           Type           | Collation | Nullable | Default
----------------------+--------------------------+-----------+----------+---------
 task_name            | text                     |           | not null |
 task_instance        | text                     |           | not null |
 task_data            | bytea                    |           |          |
 execution_time       | timestamp with time zone |           | not null |
 picked               | boolean                  |           | not null |
 picked_by            | text                     |           |          |
 last_success         | timestamp with time zone |           |          |
 last_failure         | timestamp with time zone |           |          |
 consecutive_failures | integer                  |           |          |
 last_heartbeat       | timestamp with time zone |           |          |
 version              | bigint                   |           | not null |
Indexes:
    "xxx_scheduled_tasks_pkey" PRIMARY KEY, btree (task_name, task_instance)
    "xxx_execution_time_idx" btree (execution_time)
    "xxx_last_heartbeat_idx" btree (last_heartbeat)

I replaced the real table name with xxx

@kagkarlsson (Owner) commented

With our case, we have 130k tasks (past due) in DB, the query above took 1hour to complete

Even if it has to lock 130k rows, that should not take 1 hour; that is a bit suspicious.

If it is not using the index, I suppose it may be because the statistics are a bit outdated. It thinks it will hit 1 row:

Seq Scan on xxx_scheduled_tasks st2  (cost=0.00..144.04 rows=1 width=49)
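
If stale statistics are suspected, a minimal sketch to refresh and inspect them (standard PostgreSQL catalogs; the table name is the redacted one from above) could be:

-- Refresh planner statistics for the task table
ANALYZE xxx_scheduled_tasks;

-- Estimated row count (reltuples) and when statistics were last gathered
SELECT c.relname, c.reltuples, c.relpages, s.last_analyze, s.last_autoanalyze
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname = 'xxx_scheduled_tasks';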

@kagkarlsson (Owner) commented

Is that explain from when you had 100k rows in the table?


meggarr commented Nov 12, 2023

Is that explain from when you had 100k rows in the table?

No, it is not from the problem scenario. I pasted the explain only to illustrate that the "LockRows" happens before the "Limit", which I suspect is the root cause.

@kagkarlsson (Owner) commented

The query plan might look different when there is lots of data in the table. I would be hesitant to draw any conclusions based on an explain plan for an empty table.

I don't have any immediate tips for you. You might need to analyze further. Possibly check that the Java code (ExecutionHandler) is not running into problems, or that the thread pool is not too small.
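
One way to see how the plan changes with realistic data volumes, without touching production data, is to populate a scratch copy of the table and explain the same statement against it. A hedged sketch (the copy-table name and the generated rows are made up for illustration only):

-- Scratch copy with the same columns, defaults and indexes
CREATE TABLE xxx_scheduled_tasks_copy (LIKE xxx_scheduled_tasks INCLUDING ALL);

-- ~130k unpicked, past-due tasks, roughly matching the reported scenario
INSERT INTO xxx_scheduled_tasks_copy (task_name, task_instance, execution_time, picked, version)
SELECT 'load-test', 'instance-' || g, now() - interval '1 hour', false, 1
FROM generate_series(1, 130000) AS g;

ANALYZE xxx_scheduled_tasks_copy;

-- Plain EXPLAIN does not execute the UPDATE, so nothing is locked or modified
EXPLAIN
UPDATE xxx_scheduled_tasks_copy st1
  SET picked = true, picked_by = 'abc', last_heartbeat = now(), version = version + 1
WHERE (st1.task_name, st1.task_instance) IN (
        SELECT st2.task_name, st2.task_instance FROM xxx_scheduled_tasks_copy st2
        WHERE picked = false AND execution_time <= now()
        ORDER BY execution_time ASC LIMIT 20 FOR UPDATE SKIP LOCKED)
RETURNING st1.*;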
