[opt](cloud) Add config to control sync_rowsets parallelism when init scanners #49420

Open
wants to merge 1 commit into
base: master

Conversation

gavinchou
Contributor

@gavinchou gavinchou commented Mar 24, 2025

What problem does this PR solve?

Initializing a scanner may take a long time when a single scanner covers a lot of tablets, because the rowsets must be synced from MS to the BE side before proceeding with the scan.

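This PR adds a be.conf option, init_scanner_sync_rowsets_parallelism (default 10), to control how many rowset syncs run in parallel when scanners are initialized.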
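
For readers unfamiliar with the pattern, here is a minimal sketch of bounded parallelism, assuming each tablet's rowset sync can be modeled as an independent task. It is not the code in this PR: `run_with_parallelism` and `kInitScannerSyncRowsetsParallelism` are hypothetical names used only for illustration, and the real BE reads the value from be.conf and may schedule the work differently.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the be.conf value (default 10 per the PR description).
constexpr int kInitScannerSyncRowsetsParallelism = 10;

// Runs every task using at most `parallelism` worker threads.
// Each task models one tablet's rowset sync and returns true on success.
// Returns the number of tasks that succeeded.
inline std::size_t run_with_parallelism(
        const std::vector<std::function<bool()>>& tasks, int parallelism) {
    assert(parallelism > 0);
    std::atomic<std::size_t> next{0};
    std::atomic<std::size_t> succeeded{0};

    auto worker = [&] {
        // Each worker claims the next unclaimed task index until none remain.
        for (std::size_t i = next.fetch_add(1); i < tasks.size();
             i = next.fetch_add(1)) {
            if (tasks[i]()) {
                succeeded.fetch_add(1);
            }
        }
    };

    // Never start more threads than there are tasks.
    const std::size_t n =
            std::min(static_cast<std::size_t>(parallelism), tasks.size());
    std::vector<std::thread> workers;
    workers.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        workers.emplace_back(worker);
    }
    for (auto& t : workers) {
        t.join();
    }
    return succeeded.load();
}
```

With a limit of 1 the tasks run one after another. Raising the limit lets more syncs overlap, at the cost of more concurrent requests to MS, which is the trade-off a configurable limit exposes.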

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label


… scanners

be.conf init_scanner_sync_rowsets_parallelism default 10
@gavinchou gavinchou requested a review from dataroaring as a code owner March 24, 2025 10:41
@Thearas
Contributor

Thearas commented Mar 24, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@gavinchou
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34573 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dbacfb2c52a40755465022ac7d96c4b82939d8ae, data reload: false

------ Round 1 ----------------------------------
q1	26254	5189	5070	5070
q2	2081	307	174	174
q3	10377	1269	722	722
q4	10228	1039	559	559
q5	7590	2448	2363	2363
q6	200	162	133	133
q7	935	755	643	643
q8	9336	1359	1174	1174
q9	6899	5193	5140	5140
q10	6827	2323	1909	1909
q11	491	281	277	277
q12	351	366	234	234
q13	17769	3752	3135	3135
q14	227	236	214	214
q15	542	487	483	483
q16	620	614	582	582
q17	574	872	354	354
q18	7552	7183	7096	7096
q19	1212	959	575	575
q20	341	348	204	204
q21	4439	2802	2567	2567
q22	1057	1024	965	965
Total cold run time: 115902 ms
Total hot run time: 34573 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5196	5156	5191	5156
q2	240	324	229	229
q3	2181	2712	2285	2285
q4	1443	1820	1546	1546
q5	4525	4506	4371	4371
q6	215	162	126	126
q7	1967	1915	1795	1795
q8	2577	2544	2618	2544
q9	7335	7212	7212	7212
q10	3001	3189	2755	2755
q11	579	513	484	484
q12	719	765	606	606
q13	3479	3988	3251	3251
q14	278	315	290	290
q15	507	463	483	463
q16	649	672	661	661
q17	1229	1543	1372	1372
q18	7770	7568	7426	7426
q19	865	855	919	855
q20	2064	2059	1877	1877
q21	5203	4748	4693	4693
q22	1069	1019	976	976
Total cold run time: 53091 ms
Total hot run time: 50973 ms

@doris-robot

TPC-DS: Total hot run time: 187991 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dbacfb2c52a40755465022ac7d96c4b82939d8ae, data reload: false

query1	1010	482	500	482
query2	6549	1961	1951	1951
query3	6814	237	222	222
query4	26955	23435	23656	23435
query5	4322	633	480	480
query6	281	213	195	195
query7	4621	487	270	270
query8	316	245	241	241
query9	8589	2614	2610	2610
query10	505	320	263	263
query11	15335	15416	14910	14910
query12	172	109	110	109
query13	1660	542	400	400
query14	9736	6289	6257	6257
query15	203	188	161	161
query16	7635	643	484	484
query17	1171	719	587	587
query18	2014	392	303	303
query19	188	189	149	149
query20	121	120	118	118
query21	209	120	100	100
query22	4147	4203	4281	4203
query23	33865	33041	32965	32965
query24	8459	2379	2363	2363
query25	526	457	395	395
query26	1233	271	143	143
query27	2749	508	321	321
query28	4317	2434	2412	2412
query29	704	575	434	434
query30	289	223	196	196
query31	951	890	814	814
query32	77	70	63	63
query33	581	384	319	319
query34	787	843	502	502
query35	812	823	749	749
query36	960	1003	905	905
query37	121	105	82	82
query38	4376	4110	4148	4110
query39	1472	1405	1396	1396
query40	214	120	109	109
query41	60	61	58	58
query42	131	104	104	104
query43	505	516	476	476
query44	1329	804	801	801
query45	187	173	176	173
query46	837	1034	634	634
query47	1766	1819	1718	1718
query48	387	416	308	308
query49	812	555	459	459
query50	699	742	470	470
query51	4184	4215	4140	4140
query52	101	108	94	94
query53	222	256	187	187
query54	488	484	422	422
query55	78	80	82	80
query56	276	275	266	266
query57	1170	1159	1081	1081
query58	242	235	301	235
query59	2664	2768	2834	2768
query60	290	281	294	281
query61	128	128	132	128
query62	838	726	647	647
query63	239	187	195	187
query64	4209	1016	661	661
query65	4493	4396	4412	4396
query66	1052	411	300	300
query67	15933	15541	15332	15332
query68	9035	890	513	513
query69	473	309	266	266
query70	1243	1141	1100	1100
query71	468	355	270	270
query72	5581	5046	4959	4959
query73	699	559	342	342
query74	8965	9043	9123	9043
query75	4334	3282	2748	2748
query76	3701	1189	747	747
query77	954	374	286	286
query78	9960	10245	9387	9387
query79	2427	819	569	569
query80	681	533	454	454
query81	465	271	222	222
query82	459	128	95	95
query83	204	181	154	154
query84	286	94	72	72
query85	792	365	406	365
query86	330	298	317	298
query87	4394	4548	4253	4253
query88	2961	2289	2299	2289
query89	416	309	286	286
query90	1965	214	220	214
query91	139	143	112	112
query92	73	62	66	62
query93	1718	1038	584	584
query94	665	422	320	320
query95	362	275	266	266
query96	489	579	278	278
query97	3370	3359	3315	3315
query98	221	212	203	203
query99	1463	1388	1305	1305
Total cold run time: 277659 ms
Total hot run time: 187991 ms

@doris-robot

ClickBench: Total hot run time: 31.24 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dbacfb2c52a40755465022ac7d96c4b82939d8ae, data reload: false

query1	0.04	0.03	0.03
query2	0.12	0.10	0.11
query3	0.25	0.20	0.19
query4	1.59	0.19	0.20
query5	0.58	0.56	0.58
query6	1.18	0.72	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.59	0.53	0.52
query10	0.58	0.59	0.56
query11	0.15	0.11	0.11
query12	0.15	0.11	0.11
query13	0.62	0.60	0.60
query14	2.68	2.70	2.66
query15	0.93	0.85	0.84
query16	0.38	0.38	0.38
query17	1.05	1.04	1.02
query18	0.21	0.20	0.19
query19	1.95	1.93	1.89
query20	0.02	0.02	0.01
query21	15.35	0.90	0.54
query22	0.75	1.12	0.62
query23	15.09	1.40	0.61
query24	7.03	2.15	0.85
query25	0.48	0.07	0.25
query26	0.65	0.16	0.14
query27	0.05	0.05	0.05
query28	9.32	0.90	0.44
query29	12.51	4.03	3.32
query30	0.26	0.08	0.06
query31	2.81	0.60	0.38
query32	3.22	0.55	0.47
query33	3.02	3.16	3.06
query34	15.75	5.15	4.55
query35	4.58	4.57	4.56
query36	0.68	0.50	0.48
query37	0.09	0.07	0.06
query38	0.04	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.16 s
Total hot run time: 31.24 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 50.22% (13440/26760)
Line Coverage 39.67% (116421/293479)
Region Coverage 38.39% (59186/154162)
Branch Coverage 33.52% (29898/89182)

Contributor

PR approved by anyone and no changes requested.

Collaborator

@TangSiyang2001 TangSiyang2001 left a comment

LGTM


6 participants