[performance](mow) async calc delete bitmap in add_segment #48156

Open
wants to merge 1 commit into
base: master

kaijchen
Contributor

@kaijchen kaijchen commented Feb 20, 2025

What problem does this PR solve?

Issue Number: DORIS-18705

Problem Summary:

Calculate the delete bitmap asynchronously in RowsetWriter::add_segment() so that segment flush threads no longer wait for the calculation to finish.
A 100% load performance improvement was observed in cases that combine memtable_on_sink_node with merge-on-write (mow) tables.
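
The idea can be illustrated with a minimal, self-contained sketch. It is not the code in this PR and does not use the Doris RowsetWriter API: `SketchRowsetWriter`, `wait_and_build`, `calc_delete_bitmap`, and the `DeleteBitmap` alias are hypothetical names, and `std::async` stands in for whatever executor the real change uses. The flush thread only schedules the calculation and returns; the results are collected once, before the rowset would be published.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <mutex>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a delete bitmap: (segment id, row id) pairs.
using DeleteBitmap = std::set<std::pair<uint32_t, uint32_t>>;

// Hypothetical writer used only for illustration; NOT the Doris class.
class SketchRowsetWriter {
public:
    // Called from a flush thread: schedules the calculation and returns
    // without waiting for it.
    void add_segment(uint32_t segment_id, std::vector<uint32_t> deleted_rows) {
        std::future<void> task = std::async(
                std::launch::async,
                [this, segment_id, rows = std::move(deleted_rows)] {
                    calc_delete_bitmap(segment_id, rows);
                });
        std::lock_guard<std::mutex> lock(_mutex);
        _pending.push_back(std::move(task));
    }

    // Called once before the rowset is published: waits for every scheduled
    // calculation, then returns the merged bitmap.
    DeleteBitmap wait_and_build() {
        std::vector<std::future<void>> pending;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            pending.swap(_pending);
        }
        for (auto& task : pending) {
            assert(task.valid());
            task.get();  // blocks here, not in the flush thread
        }
        std::lock_guard<std::mutex> lock(_mutex);
        return _bitmap;
    }

private:
    // Placeholder for the expensive key lookup; here it only records rows.
    void calc_delete_bitmap(uint32_t segment_id, const std::vector<uint32_t>& rows) {
        DeleteBitmap local;
        for (uint32_t row : rows) {
            local.emplace(segment_id, row);
        }
        std::lock_guard<std::mutex> lock(_mutex);
        _bitmap.insert(local.begin(), local.end());
    }

    std::mutex _mutex;
    std::vector<std::future<void>> _pending;
    DeleteBitmap _bitmap;
};
```

In this sketch the cost of the calculation moves from `add_segment` to the single `wait_and_build` call, which is the trade-off the summary describes: flush threads stay free, and the wait happens once at the end.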

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 20, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaijchen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32877 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7e205716e2b8af3d0fa182ec6f6bac9e5dbeec2f, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17634	5589	5326	5326
q2	2057	306	173	173
q3	10402	1446	756	756
q4	10215	1085	543	543
q5	7519	2772	2550	2550
q6	218	188	141	141
q7	1034	788	615	615
q8	9312	1554	1412	1412
q9	5166	4843	4889	4843
q10	6895	2366	1907	1907
q11	492	280	244	244
q12	355	388	225	225
q13	17765	3801	3136	3136
q14	240	225	211	211
q15	511	457	457	457
q16	619	622	569	569
q17	603	913	348	348
q18	6588	6305	6325	6305
q19	1080	1109	587	587
q20	319	356	195	195
q21	3168	2302	2032	2032
q22	371	343	302	302
Total cold run time: 102563 ms
Total hot run time: 32877 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5495	5443	5471	5443
q2	257	346	227	227
q3	2231	2747	2306	2306
q4	1427	1884	1399	1399
q5	4270	4160	4144	4144
q6	249	178	128	128
q7	1999	1961	1760	1760
q8	2899	2729	2853	2729
q9	7312	7175	7173	7173
q10	3098	3239	2766	2766
q11	645	509	509	509
q12	716	780	598	598
q13	3679	3947	3260	3260
q14	288	300	261	261
q15	520	466	470	466
q16	653	683	640	640
q17	1229	1748	1451	1451
q18	7665	7390	7235	7235
q19	881	858	881	858
q20	2073	2046	1887	1887
q21	5555	5010	4924	4924
q22	613	583	556	556
Total cold run time: 53754 ms
Total hot run time: 50720 ms

@doris-robot

TPC-DS: Total hot run time: 183333 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7e205716e2b8af3d0fa182ec6f6bac9e5dbeec2f, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	962	374	387	374
query2	6547	1940	1928	1928
query3	6802	217	221	217
query4	26191	23854	23056	23056
query5	4741	688	499	499
query6	312	196	191	191
query7	4607	512	294	294
query8	291	258	215	215
query9	8641	2509	2503	2503
query10	478	295	249	249
query11	15690	15096	14874	14874
query12	173	106	105	105
query13	1676	523	406	406
query14	10007	6251	6221	6221
query15	215	197	189	189
query16	7554	637	458	458
query17	1173	693	537	537
query18	1990	396	307	307
query19	184	188	149	149
query20	118	114	117	114
query21	217	118	99	99
query22	4299	4074	3915	3915
query23	34024	33087	32984	32984
query24	7767	2386	2386	2386
query25	534	468	387	387
query26	1225	267	157	157
query27	2574	471	346	346
query28	4269	2401	2391	2391
query29	709	549	416	416
query30	232	183	159	159
query31	937	865	794	794
query32	74	64	62	62
query33	550	364	285	285
query34	770	839	507	507
query35	807	820	734	734
query36	973	982	918	918
query37	126	99	73	73
query38	4239	4203	4027	4027
query39	1501	1375	1385	1375
query40	208	116	105	105
query41	54	52	50	50
query42	134	101	100	100
query43	498	521	486	486
query44	1277	801	784	784
query45	182	167	161	161
query46	864	1036	665	665
query47	1756	1783	1707	1707
query48	384	405	301	301
query49	781	522	416	416
query50	706	736	420	420
query51	4186	4187	4110	4110
query52	102	113	96	96
query53	230	262	186	186
query54	499	493	436	436
query55	80	86	78	78
query56	288	278	252	252
query57	1131	1138	1038	1038
query58	244	236	266	236
query59	2822	2928	2676	2676
query60	286	269	261	261
query61	123	121	140	121
query62	801	700	690	690
query63	245	201	182	182
query64	4298	1092	665	665
query65	3179	3108	3111	3108
query66	1060	401	305	305
query67	15883	15509	15476	15476
query68	2286	812	560	560
query69	432	311	269	269
query70	1202	1172	1054	1054
query71	343	297	290	290
query72	5547	3719	3643	3643
query73	658	771	362	362
query74	9164	9079	8948	8948
query75	3106	3154	2680	2680
query76	2322	1177	754	754
query77	341	383	279	279
query78	10035	10309	9251	9251
query79	1053	925	602	602
query80	598	546	474	474
query81	487	273	239	239
query82	316	130	97	97
query83	171	188	173	173
query84	249	97	80	80
query85	713	366	311	311
query86	302	297	292	292
query87	4325	4658	4467	4467
query88	2905	2224	2255	2224
query89	384	320	283	283
query90	1836	197	196	196
query91	138	150	111	111
query92	67	61	57	57
query93	1074	997	589	589
query94	475	414	312	312
query95	353	276	268	268
query96	510	528	280	280
query97	2726	2856	2696	2696
query98	219	211	206	206
query99	1313	1458	1274	1274
Total cold run time: 262130 ms
Total hot run time: 183333 ms

@doris-robot

ClickBench: Total hot run time: 30.13 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7e205716e2b8af3d0fa182ec6f6bac9e5dbeec2f, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.03	0.03
query2	0.07	0.04	0.03
query3	0.24	0.06	0.07
query4	1.61	0.10	0.11
query5	0.41	0.42	0.40
query6	1.16	0.66	0.64
query7	0.03	0.01	0.02
query8	0.04	0.03	0.03
query9	0.60	0.51	0.52
query10	0.58	0.58	0.58
query11	0.16	0.10	0.10
query12	0.15	0.11	0.11
query13	0.62	0.59	0.60
query14	2.71	2.73	2.73
query15	0.93	0.84	0.84
query16	0.38	0.40	0.37
query17	1.04	1.08	1.03
query18	0.21	0.19	0.19
query19	1.96	1.97	1.80
query20	0.02	0.01	0.02
query21	15.35	0.94	0.56
query22	0.76	1.20	0.72
query23	14.84	1.39	0.65
query24	7.90	1.80	0.38
query25	0.51	0.29	0.07
query26	0.66	0.16	0.13
query27	0.06	0.05	0.05
query28	9.45	0.85	0.44
query29	12.53	3.97	3.31
query30	0.26	0.09	0.06
query31	2.82	0.59	0.37
query32	3.23	0.59	0.47
query33	2.98	2.98	3.04
query34	15.80	5.11	4.48
query35	4.48	4.53	4.50
query36	0.68	0.50	0.50
query37	0.09	0.07	0.06
query38	0.06	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.13	0.12
query41	0.08	0.02	0.02
query42	0.03	0.03	0.02
query43	0.03	0.04	0.02
Total cold run time: 105.78 s
Total hot run time: 30.13 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 43.83% (11641/26557)
Line Coverage: 33.75% (97569/289107)
Region Coverage: 32.85% (49957/152076)
Branch Coverage: 28.56% (25109/87908)
Coverage Report: http://coverage.selectdb-in.cc/coverage/7e205716e2b8af3d0fa182ec6f6bac9e5dbeec2f_7e205716e2b8af3d0fa182ec6f6bac9e5dbeec2f/report/index.html
