parallelize mesh fixup #1148

pca006132 · 2025-02-19T06:31:35Z

SplitPinchedVerts now runs in parallel.
Edge deduplication no longer sorts, which makes it much faster.

Previously:

nTri = 512, time = 0.00387912 sec
nTri = 2048, time = 0.00780949 sec
nTri = 8192, time = 0.00926127 sec
nTri = 32768, time = 0.0170758 sec
nTri = 131072, time = 0.0393845 sec
nTri = 524288, time = 0.135483 sec
nTri = 2097152, time = 0.499138 sec
nTri = 8388608, time = 2.12073 sec

Now:

nTri = 512, time = 0.00200502 sec
nTri = 2048, time = 0.0026148 sec
nTri = 8192, time = 0.020302 sec
nTri = 32768, time = 0.0357589 sec
nTri = 131072, time = 0.0326722 sec
nTri = 524288, time = 0.116826 sec
nTri = 2097152, time = 0.445966 sec
nTri = 8388608, time = 1.86558 sec

Roughly 11-17% less time for larger meshes (nTri >= 131072), though the 8192 and 32768 cases measured slower in this run.

codecov · 2025-02-19T06:41:25Z

Codecov Report

Attention: Patch coverage is 98.18182% with 1 line in your changes missing coverage. Please review.

Project coverage is 91.70%. Comparing base (c16b521) to head (0369548).
Report is 3 commits behind head on master.

Files with missing lines	Patch %	Lines
src/edge_op.cpp	97.77%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1148      +/-   ##
==========================================
+ Coverage   91.68%   91.70%   +0.01%     
==========================================
  Files          30       30              
  Lines        5951     5964      +13     
==========================================
+ Hits         5456     5469      +13     
  Misses        495      495


pca006132 · 2025-02-19T07:49:39Z

src/impl.cpp

@@ -223,14 +223,17 @@ void Manifold::Impl::CreateFaces() {
  stable_sort(triPriority.begin(), triPriority.end(),
              [](auto a, auto b) { return a.area2 > b.area2; });

+  Vec<int> interiorHalfedges;


It turns out the cost of reallocation is quite significant. I wish the profiler could just give a warning when there are many allocator calls within a short period of time...
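As an illustration of the kind of change this comment refers to, here is a minimal sketch of hoisting a scratch buffer out of a loop so it is allocated once and reused. It uses `std::vector` rather than the project's `Vec`, and `CountEvenPerGroup` is a hypothetical stand-in for the per-face work, not code from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-group work: gather the even values of each group into a
// scratch buffer and record how many were found.
std::vector<std::size_t> CountEvenPerGroup(
    const std::vector<std::vector<int>>& groups) {
  std::vector<std::size_t> counts;
  counts.reserve(groups.size());

  // Hoisted out of the loop: the buffer grows to the largest size it ever
  // needs and is then reused, instead of being reallocated per iteration.
  std::vector<int> scratch;
  for (const auto& group : groups) {
    scratch.clear();  // removes the elements but keeps the capacity
    for (int v : group) {
      if (v % 2 == 0) scratch.push_back(v);
    }
    counts.push_back(scratch.size());
  }
  assert(counts.size() == groups.size());
  return counts;
}
```

Declaring `scratch` inside the loop body would give the same result, but each iteration would start from an empty allocation and might call the allocator several times while growing.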

pca006132 · 2025-02-19T07:53:15Z

will try to figure out wtf is gcc complaining about later...

elalish · 2025-02-19T09:24:13Z

src/edge_op.cpp

+            auto expected = std::numeric_limits<size_t>::max();
+            if (!reinterpret_cast<std::atomic<size_t>*>(largestEdge.data() +
+                                                        vert)
+                     ->compare_exchange_strong(expected, largest) &&


You can just reinterpret a random vector element as an atomic? I guess I'm used to the atomic free functions of CUDA - is this a more idiomatic approach?

This is not really idiomatic, but it should work. I need to think about the best way to do this.
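For comparison, one well-defined way to get the same "first writer wins" behavior is to store `std::atomic<size_t>` elements directly instead of casting plain elements (C++20's `std::atomic_ref` is another option). The sketch below is an assumption about what such code could look like, not what this PR ended up doing; `kUnset`, `ResetSlots`, and `TryClaim` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Sentinel meaning "nothing recorded yet", mirroring the max() value in the diff.
constexpr std::size_t kUnset = std::numeric_limits<std::size_t>::max();

// Hypothetical helper: mark every slot as unclaimed.
void ResetSlots(std::vector<std::atomic<std::size_t>>& slots) {
  for (auto& s : slots) s.store(kUnset, std::memory_order_relaxed);
}

// Hypothetical helper: try to claim `slot` with `candidate`.
// Returns true if this call stored the value. Otherwise returns false and
// writes the value that was already there into `existing`.
bool TryClaim(std::atomic<std::size_t>& slot, std::size_t candidate,
              std::size_t& existing) {
  assert(candidate != kUnset);  // the sentinel cannot be used as a value
  std::size_t expected = kUnset;
  if (slot.compare_exchange_strong(expected, candidate)) return true;
  existing = expected;  // on failure, `expected` holds the current value
  return false;
}
```

`TryClaim` can be called concurrently from several threads on the same slot; exactly one caller sees `true`, and the others receive the winning value.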

elalish · 2025-02-19T09:33:26Z

src/edge_op.cpp

+                } else {
+                  endVerts.push_back(halfedge_[current].endVert);
+                  // switch to hashset for vertices with many neighbors
+                  if (endVerts.size() > 32) {


6 is normal - 10 is already pretty rare. Still, I suppose it's really a perf cross-over.

Yeah, this is just some random integer that I think should work.
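The cross-over idea discussed here can be sketched as a small set that does a linear scan while it holds few elements and migrates to a hash set once it passes a threshold. This is only an illustration under assumptions: `SmallIntSet` is a hypothetical class, and the default of 32 is taken from the diff above rather than from any measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical duplicate detector for neighbor vertex indices.
class SmallIntSet {
 public:
  explicit SmallIntSet(std::size_t threshold = 32) : threshold_(threshold) {}

  // Returns true if `v` was not seen before, false if it is a duplicate.
  bool Insert(int v) {
    if (!large_.empty()) return large_.insert(v).second;
    if (std::find(small_.begin(), small_.end(), v) != small_.end())
      return false;
    small_.push_back(v);
    if (small_.size() > threshold_) {
      // Too many entries for a linear scan: move everything to the hash set.
      large_.insert(small_.begin(), small_.end());
      assert(large_.size() == small_.size());  // small_ held no duplicates
      small_.clear();
    }
    return true;
  }

  bool UsingHashSet() const { return !large_.empty(); }

 private:
  std::size_t threshold_;
  std::vector<int> small_;          // used while the set is small
  std::unordered_set<int> large_;   // used after the threshold is crossed
};
```

With typical vertex valence around 6, almost every vertex stays on the linear-scan path, so the threshold mainly guards against quadratic behavior on the rare high-valence vertex.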

elalish · 2025-02-19T09:38:11Z

src/edge_op.cpp

@@ -204,66 +206,94 @@ void Manifold::Impl::CleanupTopology() {
  // verts. They must be removed before edge collapse.
  SplitPinchedVerts();

-  Vec<int> entries;
-  FlagStore s;


With your algorithm update, does it still need the while loop? I didn't love having to add that - felt like a cop-out to my algorithm not doing a good enough first pass.

The algorithm is nearly the same as the old one, just parallelized, so I think the loop is still needed.

elalish

Looks great, thanks!

parallelize mesh fixup

f30cfba

pca006132 requested a review from elalish February 19, 2025 06:31

pca006132 added 4 commits February 19, 2025 14:47

avoid quadratic behavior

e9530a2

documentation

ad22a5d

allocation optimization

4269be7

format

9357dc2

fix gcc warning

c565270

pca006132 mentioned this pull request Feb 19, 2025

Improve performance of MeshGL -> Manifold construction #1138

Open

reduce temporary allocations in std::function

16c922b

elalish approved these changes Feb 19, 2025

pca006132 added 5 commits February 19, 2025 18:52

rename

f290d93

format

5d8004b

split DedupeEdges into its own function

f3ac29b

avoid undefined behavior

24b04d1

Merge branch 'master' into faster-fixup

0369548

elalish approved these changes Feb 20, 2025

pca006132 merged commit f178cd1 into master Feb 21, 2025
27 checks passed

pca006132 deleted the faster-fixup branch February 21, 2025 00:21
