fix: use new Compressor::rebuild_from for FSST pushdown #2708

Merged
a10y merged 6 commits into develop from aduffy/assert-fsst-cmp
Mar 17, 2025

@a10y a10y commented Mar 14, 2025

The previous implementation of FSST comparison pushdown rebuilt a compressor by reinserting symbols one by one into the CompressorBuilder and then building it.

That approach does not work, for the reasons given in the description of spiraldb/fsst#84.

This PR uses the new rebuild_from API on the fsst Compressor to build a new compressor that is guaranteed to preserve symbol table ordering, and therefore to produce identical compressed output.
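
To illustrate why symbol table ordering matters for comparison pushdown, here is a minimal, self-contained toy sketch. It does not use the fsst crate: `ToyTable`, `pushdown_eq`, and the escape handling are hypothetical stand-ins, and the failure it shows (codes shifting when the same symbols are listed in a different order) is a simplification rather than the exact issue described in spiraldb/fsst#84.

```rust
/// Toy stand-in for an FSST-style symbol table: a symbol's code is its index.
/// This is NOT the fsst crate's API; every name here is hypothetical.
struct ToyTable {
    symbols: Vec<Vec<u8>>,
}

/// Code that marks "the next byte is a literal" in this toy encoding.
const ESCAPE: u8 = 255;

impl ToyTable {
    /// Assumes fewer than 255 non-empty symbols.
    fn new(symbols: &[&str]) -> Self {
        ToyTable {
            symbols: symbols.iter().map(|s| s.as_bytes().to_vec()).collect(),
        }
    }

    /// Greedy longest-match encoding; bytes with no matching symbol are escaped.
    fn compress(&self, input: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        let mut pos = 0;
        while pos < input.len() {
            // Find the longest non-empty symbol that prefixes the remaining input.
            let best = self
                .symbols
                .iter()
                .enumerate()
                .filter(|(_, sym)| !sym.is_empty() && input[pos..].starts_with(sym.as_slice()))
                .max_by_key(|(_, sym)| sym.len());
            match best {
                Some((code, sym)) => {
                    out.push(code as u8);
                    pos += sym.len();
                }
                None => {
                    out.push(ESCAPE);
                    out.push(input[pos]);
                    pos += 1;
                }
            }
        }
        out
    }
}

/// Equality pushdown: compress the needle once, then compare code bytes
/// against each already-compressed row without decompressing anything.
fn pushdown_eq(table: &ToyTable, rows: &[Vec<u8>], needle: &str) -> Vec<bool> {
    let encoded = table.compress(needle.as_bytes());
    rows.iter().map(|row| *row == encoded).collect()
}

fn main() {
    // The table that was used when the rows were written.
    let original = ToyTable::new(&["hello", "world", " "]);
    let rows: Vec<Vec<u8>> = ["hello world", "world hello"]
        .iter()
        .map(|s| original.compress(s.as_bytes()))
        .collect();

    // Same symbols in the same order: codes line up and the result is correct.
    let same_order = ToyTable::new(&["hello", "world", " "]);
    println!("{:?}", pushdown_eq(&same_order, &rows, "hello world")); // prints [true, false]

    // Same symbols in a different order: codes differ and the result is wrong.
    let reordered = ToyTable::new(&["world", "hello", " "]);
    println!("{:?}", pushdown_eq(&reordered, &rows, "hello world")); // prints [false, true]
}
```

In this toy model the reordered table reports a match for the wrong row and misses the right one, which is why the rebuilt compressor has to assign exactly the same codes as the one that wrote the data.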

@a10y a10y added the benchmark Run benchmarks on this branch label Mar 14, 2025
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Mar 14, 2025

Benchmarks: random_access

Table of Results
| name | PR 286dbe6 | base 1d4601c | ratio (PR/base) | unit |
| --- | --- | --- | --- | --- |
| random-access/vortex-tokio-local-disk | 1919368 | 2.26831e+06 | 0.846168 | ns |
| random-access/parquet-tokio-local-disk | 249945926 | 2.38851e+08 | 1.04645 | ns |


Benchmarks: TPC-H on NVME

Table of Results
name PR 286dbe6 base 1d4601c ratio (PR/base) unit
tpch_q01/arrow 46652531 4.6585e+07 1.00145 ns
tpch_q02/arrow 55590210 4.74152e+07 1.17241 ns
tpch_q03/arrow 32894314 3.12492e+07 1.05264 ns
tpch_q04/arrow 22165153 2.29485e+07 0.965867 ns
tpch_q05/arrow 52763482 4.87438e+07 1.08247 ns
tpch_q06/arrow 11144933 9.80795e+06 1.13632 ns
tpch_q07/arrow 84586488 7.52608e+07 1.12391 ns
tpch_q08/arrow 60347903 5.72192e+07 1.05468 ns
tpch_q09/arrow 76735271 7.215e+07 1.06355 ns
tpch_q10/arrow 51337504 4.59104e+07 1.11821 ns
tpch_q11/arrow 25488605 2.46802e+07 1.03276 ns
tpch_q12/arrow 28172936 2.86841e+07 0.982178 ns
tpch_q13/arrow 18655450 1.62608e+07 1.14726 ns
tpch_q14/arrow 17684690 1.49824e+07 1.18036 ns
tpch_q15/arrow 32332022 2.74913e+07 1.17608 ns
tpch_q16/arrow 24457704 2.29161e+07 1.06727 ns
tpch_q17/arrow 66402286 6.0779e+07 1.09252 ns
tpch_q18/arrow 111376245 9.84398e+07 1.13141 ns
tpch_q19/arrow 32150605 2.90586e+07 1.10641 ns
tpch_q20/arrow 41091897 3.41785e+07 1.20227 ns
tpch_q21/arrow 128295015 1.15878e+08 1.10716 ns
tpch_q22/arrow 17132323 1.47333e+07 1.16283 ns
tpch_q01/parquet 130394685 1.14717e+08 1.13667 ns
tpch_q02/parquet 130011916 1.12823e+08 1.15235 ns
tpch_q03/parquet 118168739 1.07729e+08 1.09691 ns
tpch_q04/parquet 67449464 6.29047e+07 1.07225 ns
tpch_q05/parquet 130981628 1.2016e+08 1.09006 ns
tpch_q06/parquet 28252776 2.63951e+07 1.07038 ns
tpch_q07/parquet 152017253 1.35416e+08 1.1226 ns
tpch_q08/parquet 167986584 1.58022e+08 1.06306 ns
tpch_q09/parquet 231867309 2.13711e+08 1.08496 ns
tpch_q10/parquet 144737233 1.35025e+08 1.07193 ns
tpch_q11/parquet 61680219 5.33508e+07 1.15613 ns
tpch_q12/parquet 102828076 9.48911e+07 1.08364 ns
tpch_q13/parquet 169558201 1.53652e+08 1.10352 ns
tpch_q14/parquet 49718519 4.51545e+07 1.10108 ns
tpch_q15/parquet 81085407 6.5206e+07 1.24353 ns
tpch_q16/parquet 57244196 5.22307e+07 1.09599 ns
tpch_q17/parquet 151878039 1.34063e+08 1.13288 ns
tpch_q18/parquet 212218201 1.91691e+08 1.10709 ns
tpch_q19/parquet 82682538 7.72551e+07 1.07025 ns
tpch_q20/parquet 106164905 9.79734e+07 1.08361 ns
tpch_q21/parquet 205196050 1.8714e+08 1.09648 ns
tpch_q22/parquet 57416159 5.10767e+07 1.12412 ns
tpch_q01/vortex-file-compressed 42453435 3.65589e+07 1.16123 ns
tpch_q02/vortex-file-compressed 67194698 5.43454e+07 1.23644 ns
tpch_q03/vortex-file-compressed 32878596 2.93032e+07 1.12201 ns
tpch_q04/vortex-file-compressed 19773157 1.95905e+07 1.00932 ns
tpch_q05/vortex-file-compressed 51450088 4.72886e+07 1.088 ns
tpch_q06/vortex-file-compressed 9815247 8.88372e+06 1.10486 ns
tpch_q07/vortex-file-compressed 80937910 7.03072e+07 1.1512 ns
tpch_q08/vortex-file-compressed 62749895 5.59756e+07 1.12102 ns
tpch_q09/vortex-file-compressed 81911711 7.12876e+07 1.14903 ns
tpch_q10/vortex-file-compressed 60987790 5.57993e+07 1.09299 ns
tpch_q11/vortex-file-compressed 28734335 2.7086e+07 1.06085 ns
tpch_q12/vortex-file-compressed 27871383 2.50713e+07 1.11169 ns
tpch_q13/vortex-file-compressed 28480173 2.65439e+07 1.07294 ns
tpch_q14/vortex-file-compressed 15476862 1.26833e+07 1.22026 ns
tpch_q15/vortex-file-compressed 30389598 3.03948e+07 0.999829 ns
tpch_q16/vortex-file-compressed 32230013 2.85899e+07 1.12732 ns
tpch_q17/vortex-file-compressed 65775460 5.69725e+07 1.15451 ns
tpch_q18/vortex-file-compressed 99878347 8.70109e+07 1.14788 ns
tpch_q19/vortex-file-compressed 35073361 3.35938e+07 1.04404 ns
tpch_q20/vortex-file-compressed 44593774 3.76189e+07 1.18541 ns
tpch_q21/vortex-file-compressed 107456763 9.00662e+07 1.19309 ns
tpch_q22/vortex-file-compressed 33479370 2.97752e+07 1.12441 ns


Benchmarks: TPC-H on S3

Table of Results
name PR 286dbe6 base 1d4601c ratio (PR/base) unit
tpch_q01/parquet 260180082 2.67897e+08 0.971196 ns
tpch_q02/parquet 702731474 6.58491e+08 1.06719 ns
tpch_q03/parquet 423096244 4.34568e+08 0.973602 ns
tpch_q04/parquet 222521193 2.31692e+08 0.960417 ns
tpch_q05/parquet 578214120 5.97015e+08 0.968508 ns
tpch_q06/parquet 178580113 1.79459e+08 0.995105 ns
tpch_q07/parquet 674132938 6.25665e+08 1.07747 ns
tpch_q08/parquet 800181400 7.97828e+08 1.00295 ns
tpch_q09/parquet 686418348 6.83645e+08 1.00406 ns
tpch_q10/parquet 535130007 5.34938e+08 1.00036 ns
tpch_q11/parquet 277742104 2.79742e+08 0.992852 ns
tpch_q12/parquet 267832456 2.80965e+08 0.953258 ns
tpch_q13/parquet 390332453 3.98157e+08 0.980349 ns
tpch_q14/parquet 250528326 2.4816e+08 1.00954 ns
tpch_q15/parquet 468183410 4.64718e+08 1.00746 ns
tpch_q16/parquet 257085089 2.56855e+08 1.0009 ns
tpch_q17/parquet 384216669 3.89876e+08 0.985483 ns
tpch_q18/parquet 532723676 5.37791e+08 0.990578 ns
tpch_q19/parquet 288672240 2.72457e+08 1.05951 ns
tpch_q20/parquet 513825391 5.16984e+08 0.99389 ns
tpch_q21/parquet 626440361 6.27328e+08 0.998585 ns
tpch_q22/parquet 272203230 2.6965e+08 1.00947 ns
tpch_q01/vortex-file-compressed 128712497 1.29747e+08 0.992029 ns
tpch_q02/vortex-file-compressed 325968265 3.16852e+08 1.02877 ns
tpch_q03/vortex-file-compressed 197829360 2.03446e+08 0.972393 ns
tpch_q04/vortex-file-compressed 127554220 1.21737e+08 1.04778 ns
tpch_q05/vortex-file-compressed 279214049 2.68332e+08 1.04055 ns
tpch_q06/vortex-file-compressed 101539851 1.013e+08 1.00236 ns
tpch_q07/vortex-file-compressed 312312736 3.08588e+08 1.01207 ns
tpch_q08/vortex-file-compressed 362101115 3.57737e+08 1.0122 ns
tpch_q09/vortex-file-compressed 383587147 3.73931e+08 1.02582 ns
tpch_q10/vortex-file-compressed 338491339 3.4362e+08 0.985074 ns
tpch_q11/vortex-file-compressed 130969125 1.31137e+08 0.998723 ns
tpch_q12/vortex-file-compressed 164970016 1.64537e+08 1.00263 ns
tpch_q13/vortex-file-compressed 207559430 2.03201e+08 1.02145 ns
tpch_q14/vortex-file-compressed 122114276 1.1884e+08 1.02755 ns
tpch_q15/vortex-file-compressed 280843483 2.72044e+08 1.03234 ns
tpch_q16/vortex-file-compressed 112621069 1.14108e+08 0.986971 ns
tpch_q17/vortex-file-compressed 163596724 1.6608e+08 0.985048 ns
tpch_q18/vortex-file-compressed 286129331 2.86445e+08 0.998896 ns
tpch_q19/vortex-file-compressed 159082216 1.61349e+08 0.985953 ns
tpch_q20/vortex-file-compressed 227793319 2.3078e+08 0.98706 ns
tpch_q21/vortex-file-compressed 341815670 3.43804e+08 0.994218 ns
tpch_q22/vortex-file-compressed 121466920 1.25305e+08 0.969371 ns


Benchmarks: Clickbench on NVME

Table of Results
name PR 286dbe6 base 1d4601c ratio (PR/base) unit
clickbench_q00/parquet 2424957 2.4475e+06 0.99079 ns
clickbench_q01/parquet 33867160 3.46957e+07 0.976121 ns
clickbench_q02/parquet 68540087 6.48193e+07 1.0574 ns
clickbench_q03/parquet 54627005 5.24202e+07 1.0421 ns
clickbench_q04/parquet 331088174 3.24959e+08 1.01886 ns
clickbench_q05/parquet 320883312 3.04326e+08 1.05441 ns
clickbench_q06/parquet 2444584 2.36494e+06 1.03368 ns
clickbench_q07/parquet 34293705 3.1823e+07 1.07764 ns
clickbench_q08/parquet 393561612 3.68273e+08 1.06867 ns
clickbench_q09/parquet 589349600 5.74948e+08 1.02505 ns
clickbench_q10/parquet 127771775 1.21985e+08 1.04744 ns
clickbench_q11/parquet 151135511 1.39227e+08 1.08553 ns
clickbench_q12/parquet 323476920 3.153e+08 1.02593 ns
clickbench_q13/parquet 501345323 4.79089e+08 1.04646 ns
clickbench_q14/parquet 332884028 3.17148e+08 1.04962 ns
clickbench_q15/parquet 372620970 3.64666e+08 1.02182 ns
clickbench_q16/parquet 776807767 7.75629e+08 1.00152 ns
clickbench_q17/parquet 690775888 6.75609e+08 1.02245 ns
clickbench_q18/parquet 1601047794 1.55056e+09 1.03256 ns
clickbench_q19/parquet 46202155 4.39959e+07 1.05015 ns
clickbench_q20/parquet 606614087 5.67664e+08 1.06862 ns
clickbench_q21/parquet 669622559 6.35849e+08 1.05312 ns
clickbench_q22/parquet 1011859769 9.54617e+08 1.05996 ns
clickbench_q23/parquet 3987947093 3.84049e+09 1.0384 ns
clickbench_q24/parquet 208913244 1.94961e+08 1.07157 ns
clickbench_q25/parquet 182768365 1.73131e+08 1.05566 ns
clickbench_q26/parquet 234794636 2.18783e+08 1.07319 ns
clickbench_q27/parquet 793920060 7.52811e+08 1.05461 ns
clickbench_q28/parquet 4865630381 4.39651e+09 1.1067 ns
clickbench_q29/parquet 281742982 2.46428e+08 1.14331 ns
clickbench_q30/parquet 342268530 3.21414e+08 1.06488 ns
clickbench_q31/parquet 380630694 3.6613e+08 1.0396 ns
clickbench_q32/parquet 1815000011 1.7455e+09 1.03981 ns
clickbench_q33/parquet 1520529307 1.52125e+09 0.999528 ns
clickbench_q34/parquet 1515202527 1.39884e+09 1.08318 ns
clickbench_q35/parquet 514852865 5.07772e+08 1.01394 ns
clickbench_q36/parquet 153225407 1.47047e+08 1.04202 ns
clickbench_q37/parquet 71565106 6.79063e+07 1.05388 ns
clickbench_q38/parquet 97741217 9.48596e+07 1.03038 ns
clickbench_q39/parquet 282414212 2.85222e+08 0.990158 ns
clickbench_q40/parquet 44900970 4.60253e+07 0.975572 ns
clickbench_q41/parquet 44862776 4.29143e+07 1.0454 ns
clickbench_q42/parquet 53398397 5.48163e+07 0.974134 ns
clickbench_q00/vortex-file-compressed 4928697 4.54046e+06 1.08551 ns
clickbench_q01/vortex-file-compressed 21201132 1.73913e+07 1.21907 ns
clickbench_q02/vortex-file-compressed 32998417 3.27198e+07 1.00851 ns
clickbench_q03/vortex-file-compressed 43618663 4.35458e+07 1.00167 ns
clickbench_q04/vortex-file-compressed 321519342 3.65829e+08 0.878878 ns
clickbench_q05/vortex-file-compressed 338841814 3.42901e+08 0.988161 ns
clickbench_q06/vortex-file-compressed 4984824 4.78094e+06 1.04264 ns
clickbench_q07/vortex-file-compressed 19405876 2.20856e+07 0.878668 ns
clickbench_q08/vortex-file-compressed 374455196 4.1145e+08 0.910088 ns
clickbench_q09/vortex-file-compressed 499152681 5.18011e+08 0.963595 ns
clickbench_q10/vortex-file-compressed 70427769 6.86134e+07 1.02644 ns
clickbench_q11/vortex-file-compressed 76900681 7.47246e+07 1.02912 ns
clickbench_q12/vortex-file-compressed 264177951 2.61648e+08 1.00967 ns
clickbench_q13/vortex-file-compressed 374975451 3.71196e+08 1.01018 ns
clickbench_q14/vortex-file-compressed 258014227 2.61165e+08 0.987937 ns
clickbench_q15/vortex-file-compressed 372078936 4.27958e+08 0.869429 ns
clickbench_q16/vortex-file-compressed 780018795 7.928e+08 0.983879 ns
clickbench_q17/vortex-file-compressed 766668806 7.75128e+08 0.989087 ns
clickbench_q18/vortex-file-compressed 1303389591 1.36426e+09 0.955383 ns
clickbench_q19/vortex-file-compressed 29881437 3.13018e+07 0.954624 ns
clickbench_q20/vortex-file-compressed 256504364 2.44472e+08 1.04922 ns
clickbench_q21/vortex-file-compressed 297174449 2.80966e+08 1.05769 ns
clickbench_q22/vortex-file-compressed 505925082 4.70912e+08 1.07435 ns
clickbench_q23/vortex-file-compressed 879486854 8.63905e+08 1.01804 ns
clickbench_q24/vortex-file-compressed 92947270 8.82528e+07 1.05319 ns
clickbench_q25/vortex-file-compressed 101706339 9.39327e+07 1.08276 ns
clickbench_q26/vortex-file-compressed 121846042 1.15662e+08 1.05347 ns
clickbench_q27/vortex-file-compressed 575637604 5.50341e+08 1.04596 ns
clickbench_q28/vortex-file-compressed 5592170259 5.21492e+09 1.07234 ns
clickbench_q29/vortex-file-compressed 253293091 2.44521e+08 1.03588 ns
clickbench_q30/vortex-file-compressed 222632054 2.27603e+08 0.978162 ns
clickbench_q31/vortex-file-compressed 242347521 2.39151e+08 1.01337 ns
clickbench_q32/vortex-file-compressed 1273851006 1.2876e+09 0.989326 ns
clickbench_q33/vortex-file-compressed 1221453746 1.24595e+09 0.980339 ns
clickbench_q34/vortex-file-compressed 1211769166 1.2681e+09 0.955579 ns
clickbench_q35/vortex-file-compressed 616598477 6.22941e+08 0.989818 ns
clickbench_q36/vortex-file-compressed 94028498 9.58048e+07 0.981459 ns
clickbench_q37/vortex-file-compressed 63359130 6.25973e+07 1.01217 ns
clickbench_q38/vortex-file-compressed 62565677 6.83135e+07 0.915862 ns
clickbench_q39/vortex-file-compressed 181183103 1.98858e+08 0.911116 ns
clickbench_q40/vortex-file-compressed 42915318 4.18324e+07 1.02589 ns
clickbench_q41/vortex-file-compressed 37791708 3.83719e+07 0.98488 ns
clickbench_q42/vortex-file-compressed 35050590 3.60736e+07 0.97164 ns


Benchmarks: compress

Table of Results
name PR 286dbe6 base 1d4601c ratio (PR/base) unit
compress time/taxi throughput 0.28178 0.292368 0.963785 bytes/ns
parquet_rs-zstd compress time/taxi throughput 0.335468 0.346964 0.966869 bytes/ns
decompress time/taxi throughput 2.10947 2.0337 1.03725 bytes/ns
parquet_rs-zstd decompress time/taxi throughput 1.81371 1.85726 0.976554 bytes/ns
compress time/AirlineSentiment throughput 0.00219159 0.00222919 0.98313 bytes/ns
parquet_rs-zstd compress time/AirlineSentiment throughput 0.0558462 0.0612379 0.911956 bytes/ns
decompress time/AirlineSentiment throughput 0.0134123 0.0140155 0.956962 bytes/ns
parquet_rs-zstd decompress time/AirlineSentiment throughput 0.0870284 0.0891011 0.976737 bytes/ns
compress time/Arade throughput 0.194452 0.206925 0.939722 bytes/ns
parquet_rs-zstd compress time/Arade throughput 0.467249 0.49474 0.944435 bytes/ns
decompress time/Arade throughput 2.08073 2.02761 1.0262 bytes/ns
parquet_rs-zstd decompress time/Arade throughput 1.91039 1.9625 0.973451 bytes/ns
compress time/Bimbo throughput 0.519338 0.56004 0.927324 bytes/ns
parquet_rs-zstd compress time/Bimbo throughput 0.409723 0.433855 0.944376 bytes/ns
decompress time/Bimbo throughput 2.45306 2.54518 0.963806 bytes/ns
parquet_rs-zstd decompress time/Bimbo throughput 3.03937 3.08241 0.986034 bytes/ns
compress time/CMSprovider throughput 0.0629401 0.0651354 0.966296 bytes/ns
parquet_rs-zstd compress time/CMSprovider throughput 0.419384 0.434246 0.965775 bytes/ns
decompress time/CMSprovider throughput 4.54594 4.6126 0.985548 bytes/ns
parquet_rs-zstd decompress time/CMSprovider throughput 2.06775 2.13084 0.970395 bytes/ns
compress time/Euro2016 throughput 0.17237 0.180372 0.955637 bytes/ns
parquet_rs-zstd compress time/Euro2016 throughput 0.339285 0.352161 0.963437 bytes/ns
decompress time/Euro2016 throughput 2.72804 2.77342 0.983638 bytes/ns
parquet_rs-zstd decompress time/Euro2016 throughput 1.13462 1.191 0.952662 bytes/ns
compress time/Food throughput 0.235451 0.253877 0.92742 bytes/ns
parquet_rs-zstd compress time/Food throughput 0.382536 0.40205 0.951464 bytes/ns
decompress time/Food throughput 6.05301 6.46334 0.936513 bytes/ns
parquet_rs-zstd decompress time/Food throughput 1.90202 1.96255 0.969159 bytes/ns
compress time/HashTags throughput 0.222861 0.230852 0.965385 bytes/ns
parquet_rs-zstd compress time/HashTags throughput 0.926483 0.964407 0.960676 bytes/ns
decompress time/HashTags throughput 6.44194 6.70708 0.960469 bytes/ns
parquet_rs-zstd decompress time/HashTags throughput 2.8217 3.19983 0.881828 bytes/ns
compress time/TPC-H l_comment chunked throughput 0.232407 0.245865 0.945261 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment chunked throughput 0.304827 0.314999 0.967708 bytes/ns
decompress time/TPC-H l_comment chunked throughput 2.9857 3.13308 0.952959 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput 1.38269 1.41849 0.974759 bytes/ns
compress time/TPC-H l_comment canonical throughput 0.0314379 0.0327047 0.961264 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment canonical throughput 0.30332 0.313323 0.968075 bytes/ns
decompress time/TPC-H l_comment canonical throughput 3.01062 3.1011 0.970822 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput 1.36294 1.41211 0.965181 bytes/ns
compress time/wide table cols=10 chunks=1 rows=1000 throughput 0.168106 0.173058 0.971383 bytes/ns
parquet_rs-zstd compress time/wide table cols=10 chunks=1 rows=1000 throughput 0.239801 0.247077 0.970551 bytes/ns
decompress time/wide table cols=10 chunks=1 rows=1000 throughput 0.804608 0.857311 0.938525 bytes/ns
parquet_rs-zstd decompress time/wide table cols=10 chunks=1 rows=1000 throughput 0.51221 0.530676 0.965203 bytes/ns
compress time/wide table cols=100 chunks=1 rows=1000 throughput 0.16115 0.16552 0.973598 bytes/ns
parquet_rs-zstd compress time/wide table cols=100 chunks=1 rows=1000 throughput 0.220805 0.234406 0.941975 bytes/ns
decompress time/wide table cols=100 chunks=1 rows=1000 throughput 1.21307 1.13876 1.06526 bytes/ns
parquet_rs-zstd decompress time/wide table cols=100 chunks=1 rows=1000 throughput 0.525495 0.53153 0.988647 bytes/ns
compress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.145649 0.149876 0.971797 bytes/ns
parquet_rs-zstd compress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.199375 0.206184 0.966979 bytes/ns
decompress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.781266 0.867939 0.90014 bytes/ns
parquet_rs-zstd decompress time/wide table cols=1000 chunks=1 rows=1000 throughput 0.446394 0.519332 0.859554 bytes/ns
compress time/wide table cols=10 chunks=50 rows=1000 throughput 0.0873149 0.0891065 0.979894 bytes/ns
parquet_rs-zstd compress time/wide table cols=10 chunks=50 rows=1000 throughput 0.164325 0.169288 0.970685 bytes/ns
decompress time/wide table cols=10 chunks=50 rows=1000 throughput 0.728417 0.807053 0.902564 bytes/ns
parquet_rs-zstd decompress time/wide table cols=10 chunks=50 rows=1000 throughput 0.522677 0.545118 0.958833 bytes/ns
compress time/wide table cols=100 chunks=50 rows=1000 throughput 0.0719618 0.0828539 0.868538 bytes/ns
parquet_rs-zstd compress time/wide table cols=100 chunks=50 rows=1000 throughput 0.140186 0.155397 0.902112 bytes/ns
decompress time/wide table cols=100 chunks=50 rows=1000 throughput 1.19344 1.18808 1.00451 bytes/ns
parquet_rs-zstd decompress time/wide table cols=100 chunks=50 rows=1000 throughput 0.526108 0.547129 0.96158 bytes/ns
compress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.0638445 0.0668029 0.955715 bytes/ns
parquet_rs-zstd compress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.102477 0.112099 0.91417 bytes/ns
decompress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.793578 0.866856 0.915467 bytes/ns
parquet_rs-zstd decompress time/wide table cols=1000 chunks=50 rows=1000 throughput 0.422547 0.488913 0.864259 bytes/ns
vortex:raw size/taxi 0.119081 0.119081 1 ratio
vortex size/taxi 5.89219e+07 5.89219e+07 1 bytes
vortex:parquet-zstd size/taxi 1.05291 1.05291 1 ratio
vortex:raw size/AirlineSentiment 1.34966 1.34966 1 ratio
vortex size/AirlineSentiment 4408 4408 1 bytes
vortex:parquet-zstd size/AirlineSentiment 4.55843 4.55843 1 ratio
vortex:raw size/Arade 0.255616 0.255616 1 ratio
vortex size/Arade 3.03335e+08 3.03335e+08 1 bytes
vortex:parquet-zstd size/Arade 0.993262 0.993262 1 ratio
vortex:raw size/Bimbo 0.116509 0.116509 1 ratio
vortex size/Bimbo 8.32939e+08 8.32939e+08 1 bytes
vortex:parquet-zstd size/Bimbo 2.14592 2.14592 1 ratio
vortex:raw size/CMSprovider 0.184507 0.184507 1 ratio
vortex size/CMSprovider 1.15965e+09 1.15965e+09 1 bytes
vortex:parquet-zstd size/CMSprovider 1.50697 1.50697 1 ratio
vortex:raw size/Euro2016 0.399036 0.399036 1 ratio
vortex size/Euro2016 1.81571e+08 1.81571e+08 1 bytes
vortex:parquet-zstd size/Euro2016 1.52722 1.52722 1 ratio
vortex:raw size/Food 0.177001 0.177001 1 ratio
vortex size/Food 5.96217e+07 5.96217e+07 1 bytes
vortex:parquet-zstd size/Food 1.64564 1.64564 1 ratio
vortex:raw size/HashTags 0.139622 0.139622 1 ratio
vortex size/HashTags 2.67505e+08 2.67505e+08 1 bytes
vortex:parquet-zstd size/HashTags 1.99682 1.99682 1 ratio
vortex:raw size/TPC-H l_comment chunked 0.418493 0.418039 1.00109 ratio
vortex size/TPC-H l_comment chunked 1.04287e+08 1.04174e+08 1.00109 bytes
vortex:parquet-zstd size/TPC-H l_comment chunked 1.8316 1.82962 1.00108 ratio
vortex:raw size/TPC-H l_comment canonical 0.425059 0.425026 1.00008 ratio
vortex size/TPC-H l_comment canonical 1.05921e+08 1.05913e+08 1.00008 bytes
vortex:parquet-zstd size/TPC-H l_comment canonical 1.86033 1.86032 1.00001 ratio
vortex:raw size/wide table cols=10 chunks=1 rows=1000 0.625225 0.625225 1 ratio
vortex size/wide table cols=10 chunks=1 rows=1000 100096 100096 1 bytes
vortex:parquet-zstd size/wide table cols=10 chunks=1 rows=1000 1.07073 1.07073 1 ratio
vortex:raw size/wide table cols=100 chunks=1 rows=1000 0.622268 0.622268 1 ratio
vortex size/wide table cols=100 chunks=1 rows=1000 996136 996136 1 bytes
vortex:parquet-zstd size/wide table cols=100 chunks=1 rows=1000 1.06561 1.06561 1 ratio
vortex:raw size/wide table cols=1000 chunks=1 rows=1000 0.621972 0.621972 1 ratio
vortex size/wide table cols=1000 chunks=1 rows=1000 9.95654e+06 9.95654e+06 1 bytes
vortex:parquet-zstd size/wide table cols=1000 chunks=1 rows=1000 1.0651 1.0651 1 ratio
vortex:raw size/wide table cols=10 chunks=50 rows=1000 0.608827 0.608827 1 ratio
vortex size/wide table cols=10 chunks=50 rows=1000 100096 100096 1 bytes
vortex:parquet-zstd size/wide table cols=10 chunks=50 rows=1000 1.07073 1.07073 1 ratio
vortex:raw size/wide table cols=100 chunks=50 rows=1000 0.607249 0.607249 1 ratio
vortex size/wide table cols=100 chunks=50 rows=1000 996136 996136 1 bytes
vortex:parquet-zstd size/wide table cols=100 chunks=50 rows=1000 1.06561 1.06561 1 ratio
vortex:raw size/wide table cols=1000 chunks=50 rows=1000 0.607091 0.607091 1 ratio
vortex size/wide table cols=1000 chunks=50 rows=1000 9.95654e+06 9.95654e+06 1 bytes
vortex:parquet-zstd size/wide table cols=1000 chunks=50 rows=1000 1.0651 1.0651 1 ratio

@a10y a10y changed the title from "throw error if order is not preserved" to "fix: use new Compressor::rebuild_from for FSST pushdown" Mar 17, 2025
@a10y a10y marked this pull request as ready for review March 17, 2025 17:43
@a10y a10y requested a review from robert3005 March 17, 2025 17:43

codspeed-hq bot commented Mar 17, 2025

CodSpeed Performance Report

Merging #2708 will improve performance by ×2.6

Comparing aduffy/assert-fsst-cmp (f4c6b93) with develop (a0da90b)

Summary

⚡ 12 improvements
✅ 763 untouched benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| pushdown_compare[(1000, 16, 4)] | 878.6 µs | 345.1 µs | ×2.5 |
| pushdown_compare[(1000, 16, 8)] | 908.1 µs | 367.5 µs | ×2.5 |
| pushdown_compare[(1000, 4, 4)] | 880.4 µs | 346.6 µs | ×2.5 |
| pushdown_compare[(1000, 4, 8)] | 909.3 µs | 347.6 µs | ×2.6 |
| pushdown_compare[(1000, 64, 4)] | 877.8 µs | 344.6 µs | ×2.5 |
| pushdown_compare[(1000, 64, 8)] | 901.4 µs | 354.2 µs | ×2.5 |
| pushdown_compare[(10000, 16, 4)] | 940.9 µs | 407.3 µs | ×2.3 |
| pushdown_compare[(10000, 16, 8)] | 1,001.3 µs | 454.3 µs | ×2.2 |
| pushdown_compare[(10000, 4, 4)] | 940.6 µs | 406.9 µs | ×2.3 |
| pushdown_compare[(10000, 4, 8)] | 963.3 µs | 414.5 µs | ×2.3 |
| pushdown_compare[(10000, 64, 4)] | 939.3 µs | 406.3 µs | ×2.3 |
| pushdown_compare[(10000, 64, 8)] | 963.2 µs | 415.8 µs | ×2.3 |
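
As a reading aid, the Change column appears to be the BASE time divided by the HEAD time, rounded to one decimal place. That interpretation is an assumption checked against a few rows of the table above, not something the report states; the `speedup` helper below is hypothetical.

```rust
/// Speedup factor under the assumption that Change = BASE time / HEAD time.
fn speedup(base_us: f64, head_us: f64) -> f64 {
    base_us / head_us
}

fn main() {
    // (benchmark parameters, BASE in µs, HEAD in µs) copied from the table above.
    let rows = [
        ("(1000, 16, 4)", 878.6, 345.1),
        ("(1000, 4, 8)", 909.3, 347.6),
        ("(10000, 16, 8)", 1001.3, 454.3),
    ];
    for (name, base, head) in rows {
        // One decimal place, matching the precision of the Change column.
        println!("pushdown_compare[{}]: x{:.1}", name, speedup(base, head));
    }
}
```

For these three rows the computed factors round to 2.5, 2.6, and 2.2, matching the reported values.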

@a10y a10y enabled auto-merge (squash) March 17, 2025 17:52
@a10y a10y merged commit a873885 into develop Mar 17, 2025
27 checks passed
@a10y a10y deleted the aduffy/assert-fsst-cmp branch March 17, 2025 17:54