RSDK-12719: Lower the number of arm positions we cache. #5518

dgottlieb · 2025-11-21T15:11:04Z

Starting a conversation via a PR. Apologies if this isn't the best place.

We got a report of crashes after upgrading to v101/v102. It was a pretty clear OOM on the first access of motion planning on a lite6 arm (read: building the cache). This was on one of the nanos, I think, with 8GB of RAM that's shared with the GPU. I believe a CV model was also loaded.

The reason for the increased consumption was the finer-grained jog creating more cache entries. Here are numbers for memory used by the cache (after forcing a GC) across different versions (Alloc being actively used memory):

v0.102.0
Alloc = 2861 MiB	TotalAlloc = 10051 MiB	Sys = 5543 MiB	NumGC = 23

v0.101.0
Alloc = 2379 MiB	TotalAlloc = 7350 MiB	Sys = 4714 MiB	NumGC = 20

v0.100.0
Alloc = 253 MiB	TotalAlloc = 1756 MiB	Sys = 619 MiB	NumGC = 19

On main, which removed the unused pose, memory was cut by ~35% from v101/v102, but it is still greatly increased from v100:

Main:
Alloc = 1648 MiB	TotalAlloc = 5170 MiB	Sys = 3785 MiB	NumGC = 13

This patch brings memory back in line with v100:

arm6JogRatios   = []float64{90, 32, 8, 8, 4, 2}
Alloc = 227 MiB	TotalAlloc = 1376 MiB	Sys = 562 MiB	NumGC = 15
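
The figures above have the same fields as Go's `runtime.MemStats`. A minimal sketch of how a line in that format could be collected follows; the `memSnapshot` type, the `snapshot` helper, and the 64 MiB stand-in for the cache are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"runtime"
)

// memSnapshot holds the four figures quoted in the PR description.
// Hypothetical helper type for illustration only.
type memSnapshot struct {
	AllocMiB      uint64 // heap bytes currently in use, in MiB
	TotalAllocMiB uint64 // cumulative heap bytes allocated, in MiB
	SysMiB        uint64 // bytes obtained from the OS, in MiB
	NumGC         uint32 // completed GC cycles
}

// snapshot forces a collection and then reads the runtime statistics,
// so Alloc reflects live memory rather than garbage awaiting collection.
func snapshot() memSnapshot {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	const mib = 1024 * 1024
	return memSnapshot{
		AllocMiB:      m.Alloc / mib,
		TotalAllocMiB: m.TotalAlloc / mib,
		SysMiB:        m.Sys / mib,
		NumGC:         m.NumGC,
	}
}

// String formats the snapshot like the lines in the PR description.
func (s memSnapshot) String() string {
	return fmt.Sprintf("Alloc = %d MiB\tTotalAlloc = %d MiB\tSys = %d MiB\tNumGC = %d",
		s.AllocMiB, s.TotalAllocMiB, s.SysMiB, s.NumGC)
}

func main() {
	// Stand-in for the real cache: keep roughly 64 MiB alive so Alloc moves.
	cache := make([][]byte, 64)
	for i := range cache {
		cache[i] = make([]byte, 1<<20)
	}
	fmt.Println(snapshot())
	runtime.KeepAlive(cache) // keep the buffers live until after the snapshot
}
```

Calling `snapshot()` once before and once after building the cache would give a before/after pair comparable to the version-by-version numbers above.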

dgottlieb · 2025-11-21T15:12:43Z

motionplan/armplanning/smart_seed.go


 var (
-	arm6JogRatios   = []float64{360, 32, 8, 8, 4, 2}
+	arm6JogRatios   = []float64{90, 16, 8, 8, 4, 2}


An "enhanced" version of this patch would query how much memory the machine has, so that we're more conservative on low-memory rpis while getting better precision on bigger boxes.

Happy to move this PR in that direction, or any other direction. Just wanted to get product feedback and a path forward for users unable to upgrade.

erh · 2025-11-23

Who can't upgrade?
This will make a lot of things worse and isn't the direction I think we should go.
We can just not cache at all if the system has less than x amount of RAM, perhaps.

dgottlieb · 2025-11-23

Nick H here: https://viaminc.slack.com/archives/C091QGG4MQ8/p1763500966092209

erh · 2025-11-23

lame - is it easy enough to see available memory?

github-actions · 2025-11-21T15:20:26Z

Availability

Scene #	viamrobotics:main	dgottlieb:RSDK-12719	Percent Improvement	Health
1	100%	100%	0%	➖
2	100%	100%	0%	➖
3	100%	100%	0%	➖
4	100%	100%	0%	➖
5	100%	100%	0%	➖
6	100%	100%	0%	➖
7	100%	100%	0%	➖
8	100%	100%	0%	➖
9	100%	100%	0%	➖
10	100%	100%	0%	➖

Quality

Scene #	viamrobotics:main	dgottlieb:RSDK-12719	Percent Improvement	Probability of Improvement	Health
1	1.31±0.00	1.31±0.00	0%	63%	➖
2	0.90±0.00	0.90±0.00	-0%	50%	➖
3	5.86±1.18	5.14±0.00	12%	73%	➖
4	3.13±0.40	3.23±0.42	-3%	43%	➖
5	10.26±3.30	9.47±3.34	8%	57%	➖
6	9.67±3.46	10.51±4.32	-9%	44%	➖
7	6.71±2.36	4.89±2.29	27%	71%	➖
8	0.90±0.00	0.90±0.00	-0%	50%	➖
9	4.15±0.00	4.20±0.14	-1%	37%	➖
10	12.84±0.41	12.84±0.41	-0%	50%	➖

Performance

Scene #	viamrobotics:main	dgottlieb:RSDK-12719	Percent Improvement	Probability of Improvement	Health
1	0.03±0.01	0.03±0.01	3%	54%	➖
2	0.04±0.01	0.03±0.01	20%	82%	➖
3	0.09±0.03	0.03±0.00	67%	97%	✅
4	1.19±0.04	1.17±0.05	2%	62%	➖
5	1.70±0.38	1.62±0.35	5%	56%	➖
6	1.81±0.57	1.93±0.61	-7%	44%	➖
7	2.18±0.75	2.07±0.75	5%	54%	➖
8	0.03±0.00	0.03±0.00	11%	78%	➖
9	2.04±0.14	1.94±0.07	5%	73%	➖
10	3.36±0.54	3.34±0.56	0%	51%	➖

The above data was generated by running scenes defined in the motion-testing repository.
The SHA1 for viamrobotics:main is: b39abd91b2d7c568703e39f08cf352167db9a609
The SHA1 for dgottlieb:RSDK-12719 is: b39abd91b2d7c568703e39f08cf352167db9a609

10 samples were taken for each scene

RSDK-12719: Lower the number of arm positions we cache.

787d0de

viambot added the safe to test This pull request is marked safe to test from a trusted zone label Nov 21, 2025


dgottlieb requested a review from erh November 21, 2025 15:13
