Skip to content

Conversation

@michaellee1019
Copy link
Member

@michaellee1019 michaellee1019 commented Oct 31, 2025

My last PR introduced data racing issue in tests that wasn't seen during development. This PR suppresses the status spinners while tests are running. There are separate tests that exist in progress_manager_test.go to test the spinner functionality, so I think its fine to not test as the higher level.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Oct 31, 2025
pm.currentSpinner.Success(" " + prefix + message + elapsed)
pm.currentSpinner = nil
// Give the spinner goroutine time to finish to avoid race conditions
time.Sleep(10 * time.Millisecond)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know what the root cause is? I feel this isn't a fix but rather just a guess at making something less likely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I did not mean to commit this. It was troubleshooting originally. I'll remove this and the other two below.

Copy link
Member

@stuqdog stuqdog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks reasonable to me, modulo Dan's comment.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 3, 2025
map[string]any{
moduleFlagPath: manifestPath, generalFlagPartID: "part-123",
moduleBuildFlagNoBuild: true, moduleFlagLocal: true,
generalFlagNoProgress: true, // Disable progress spinner to avoid race conditions in tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the progress spinner is considered harmful. Did we identify what the race is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed that part of your comment earlier! It looks like the race condition is internal to the pterm library, specifically in the Success() call. I tried adding a mutex wrapper around all calls to the library, for example the following method. It still resulted in a data race:

func (s *synchronizedSpinner) Success(message ...any) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.spinner == nil {
		return
	}
	s.spinner.Success(message...)
}

I found an open GH issue related to this the problem pterm/pterm#482.

When running the CLI outside of tests, there isn't any issues (I tested it a lot over the past 2 weeks) while running viam module reload.

Will think about it. There isn't really good libs for this in golang.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, all cleaned up and made sure tests passed in --race mode.

Copy link
Member

@dgottlieb dgottlieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm bought in on the approach. Now I'm just curious why there's a bunch of other code changes as part of the PR.

t.Run("addsServiceWhenMissing", func(t *testing.T) {
part, _ := vc.getRobotPart("id")
_, err := addShellService(cCtx, vc, logging.NewTestLogger(t), part.Part, false)
// Create isolated setup for this subtest to avoid shared state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy that the underlying library is the cause and that disabling it serves our goals here.

What's with these other changes? Were there other races not reported by the failure I saw that this fixes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was too aggressive with trying out different mechanisms to avoid the race. My bad, I will cleanup my PR and delete everything except setting the --no-progress flag.

)

err = reloadModuleActionInner(cCtx, vc, parseStructFromCtx[reloadModuleArgs](cCtx), logger, false)
// Create isolated logger for this subtest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is done elsewhere -- but do we know what the consequence is of not creating a separate logger? And just using the top-level test logger?

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 4, 2025
@dgottlieb dgottlieb merged commit 8119fc9 into viamrobotics:main Nov 5, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test This pull request is marked safe to test from a trusted zone

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants