redefining for loop variable semantics #56010

rsc · 2022-10-03T14:19:32Z

rsc
Oct 3, 2022
Maintainer

Update 2023-06-06: Go 1.21 is expected to support GOEXPERIMENT=loopvar as a way of trying out these new semantics. See #57969 and https://go.dev/wiki/LoopvarExperiment.

We have been looking at what to do about the for loop variable problem (#20733), gathering data about what a change would mean and how we might deploy it. This discussion aims to gather early feedback about this idea, to understand concerns and aspects we have not yet considered. Thanks for keeping this discussion respectful and productive!

To recap #20733 briefly, the problem is that loops like this one don’t do what they look like they do:

var all []*Item
for _, item := range items {
	all = append(all, &item)
}

That is, this code has a bug. After this loop executes, all contains len(items) identical pointers, each pointing at the same Item, holding the last value iterated over. This happens because the item variable is per-loop, not per-iteration: &item is the same on every iteration, and item is overwritten on each iteration. The usual fix is to write this instead:

var all []*Item
for _, item := range items {
	item := item
	all = append(all, &item)
}

This bug also often happens in code with closures that capture the address of item implicitly, like:

var prints []func()
for _, v := range []int{1, 2, 3} {
	prints = append(prints, func() { fmt.Println(v) })
}
for _, print := range prints {
	print()
}

This code prints 3, 3, 3, because all the closures print the same v, and at the end of the loop, v is set to 3. Note that there is no explicit &v to signal a potential problem. Again the fix is the same: add v := v.

Goroutines are also often involved, although as these examples show, they need not be. See also the Go FAQ entry.

We have talked for a long time about redefining these semantics, to make loop variables per-iteration instead of per-loop. That is, the change would effectively be to add an implicit “x := x” at the start of every loop body for each iteration variable x, just like people do manually today. Making this change would remove the bugs from the programs above.

In the Go 2 transitions document we gave the general rule that language redefinitions like what I just described are not permitted. I believe that is the right general rule, but I have come to also believe that the for loop variable case is strong enough to motivate a one-time exception to that rule. Loop variables being per-loop instead of per-iteration is the only design decision I know of in Go that makes programs incorrect more often than it makes them correct. Since it is the only such design decision, I do not see any plausible candidates for additional exceptions.

To make the breakage completely user controlled, the way the rollout would work is to change the semantics based on the go line in each package’s go.mod file, the same line we already use for enabling language features (you can only use generics in packages whose go.mod says “go 1.18” or later). Just this once, we would use the line for changing semantics instead of for adding a feature or removing a feature.

If we hypothetically made the change in go 1.30, then modules that say “go 1.30” or later get the per-iteration variables, while modules with earlier versions get the per-loop variables:

In a given code base, the change would be “gradual” in the sense that each module can update to the new semantics independently, avoiding a bifurcation of the ecosystem.

The specific semantics of the redefinition would be that both range loops and three-clause for loops get per-iteration variables. So in addition to the program above being fixed, this one would be fixed too:

var prints []func()
for i := 1; i <= 3; i++ {
	prints = append(prints, func() { fmt.Println(i) })
}
for _, print := range prints {
	print()
}

In the 3-clause form, the start of the iteration body copies the per-loop i into a per-iteration i, and then the end of the body (or any continue statement) copies the current value of the per-iteration i back to the per-loop i. Unless a variable is captured like in the above example, nothing changes about how the loop executes.

Adjusting the 3-clause form may seem strange to C programmers, but the same capture problems that happen in range loops also happen in three-clause for loops. Changing both forms eliminates that bug from the entire language, not just one place, and it keeps the loops consistent in their variable semantics. That consistency means that if you change a loop from using range to using a 3-clause form or vice versa, you only have to think about whether the iteration visits the same items, not whether a subtle change in variable semantics will break your code. It is also worth noting that JavaScript is using per-iteration semantics for 3-clause for loops using let, with no problems.

I think the semantics are a smaller issue than the idea of making this one-time gradual breaking change. I’ve posted this discussion to gather early feedback on the idea of making a change here at all, because that’s something we’ve previously treated as off the table.

I’ve outlined the reasons I believe this case merits an exception below. I’m hoping this discussion can surface concerns, good ideas, and other feedback about the idea of making the change at all (not as much the semantics).

I know that C# 5 made this change as well, but I’ve been unable to find any retrospectives about how it was rolled out or how it went. If anyone knows more about how the C# transition went or has links to that information, please post that too. Thanks!

The case for making the change:

A decade of experience shows the cost of the current semantics

I talked at Gophercon once about how we need agreement about the existence of a problem before we move on to solutions. When we examined this issue in the run up to Go 1, it did not seem like enough of a problem. The general consensus was that it was annoying but not worth changing.

Since then, I suspect every Go programmer in the world has made this mistake in one program or another. I certainly have done it repeatedly over the past decade, despite being the one who argued for the current semantics and then implemented them. (Sorry!)

The current cures for this problem are worse than the disease.

I ran a program to process the git logs of the top 14k modules, from about 12k git repos and looked for commits with diff hunks that were entirely “x := x” lines being added. I found about 600 such commits. On close inspection, approximately half of the changes were unnecessary, done probably either at the insistence of inaccurate static analysis, confusion about the semantics, or an abundance of caution. Perhaps the most striking was this pair of changes from different projects:

 	for _, informer := range c.informerMap {
+		informer := informer
 		go informer.Run(stopCh)
 	}

 	for _, a := range alarms {
+		a := a
 		go a.Monitor(b)
 	}

One of these two changes is unnecessary and the other is a real bug fix, but you can’t tell which is which without more context. (In one, the loop variable is an interface value, and copying it has no effect; in the other, the loop variable is a struct, and the method takes a pointer receiver, so copying it ensures that the receiver is a different pointer on each iteration.)

And then there are changes like this one, which is unnecessary regardless of context (there is no opportunity for hidden address-taking):

 	for _, scheme := range artifact.Schemes {
+		scheme := scheme
 		Runtime.artifactByScheme[scheme.ID] = id
 		Runtime.schemesByID[scheme.ID] = scheme
 	}

This kind of confusion and ambiguity is the exact opposite of the readability we are aiming for in Go.

People are clearly having enough trouble with the current semantics that they choose overly conservative tools and adding “x := x” lines by rote in situations not flagged by tools, preferring that to debugging actual problems. This is an entirely rational choice, but it is also an indictment of the current semantics.

We’ve also seen production problems caused in part by these semantics, both inside Google and at other companies (for example, this problem at Let’s Encrypt). It seems likely to me that, world-wide, the current semantics have easily cost many millions of dollars in wasted developer time and production outages.

Old code is unaffected, compiling exactly as before

The go lines in go.mod give us a way to guarantee that all old code is unaffected, even in a build that also contains new code. Only when you change your go.mod line do the packages in that module get the new semantics, and you control that. In general this one reason is not sufficient, as laid out in the Go 2 transitions document. But it is a key property that contributes to the overall rationale, with all the other reasons added in.

Changing the semantics is usually a no-op, and when it’s not, it fixes buggy code far more often than it breaks correct code

We built a toolchain with the change and tested a subset of Google’s Go tests and analyzed the resulting failures. The rate of new test failures was approximately 1 in 2,000, but nearly all were previously undiagnosed actual bugs. The rate of spurious test failures (correct code actually broken by the change) was 1 in 50,000.

To start, there were only 58 failures out of approximately 100,000 tests executed, covering approximately 1.3M for loops. Of the failures, 36 (62%) were tests not testing what they looked like they tested because of bad interactions with t.Parallel: the new semantics made the tests actually run correctly, and then the tests failed because they found actual latent bugs in the code under test. The next most common mistake was appending &v on each iteration to a slice, which makes a slice of N identical pointers. The rest were other kinds of bugs canceling out to make tests pass incorrectly. We found only 2 instances out of the 58 where code correctly depended on per-loop semantics and was actually broken by the change. One involved a handler registered using once.Do that needed access to the current iteration’s values on each invocation. The other involved low-level code running in a context when allocation is disallowed, and the variable escaped the loop (but not the function), so that the old semantics did not allocate while the new semantics did. Both were easily adjusted.

Of course, there is always the possibility that Google’s tests may not be representative of the overall ecosystem’s tests in various ways, and perhaps this is one of them. But there is no indication from this analysis of any common idiom at all where per-loop semantics are required. The git log analysis points in the same direction: parts of the ecosystem are adopting tools with very high false positive rates and doing what the tools say, with no apparent problems.

There is also the possibility that while there’s no semantic change, existing loops would, when updated to the new Go version, allocate one variable per iteration instead of once per loop. This problem would show up in memory profiles and is far easier to track down than the silent corruption we get when things go wrong with today’s semantics. Benchmarking of the public “bent” bench suite showed no statistically significant performance difference over all, so we expect most programs to be unaffected.

Good tooling can help users identify exactly the loops that need the most scrutiny during the transition

Our experience analyzing the failures in Google’s Go tests shows that we can use compiler instrumentation (adjusted -m output) to identify loops that may be compiling differently, because the compiler thinks the loop variables escape. Almost all the time, this identifies a very small number of loops, and one of those loops is right next to the failure. That experience can be wrapped up into a good tool for directing any debugging sessions.

Another possibility is a compilation mode where the compiled code consults an array of bits to decide during execution whether each loop gets old or new semantics. Package testing could provide a mode that implements binary search on that array to identify exactly which loops cause a test to fail. So if a test fails, you run the “loop finding mode” and then it tells you: “applying the semantic change to these specific loops causes the failure”. All the others are fine.

Static analysis is not a viable alternative

Whether a particular loop is “buggy” due to the current behavior depends on whether the address of an iteration value is taken and then that pointer is used after the next iteration begins. It is impossible in general for analyzers to see where the pointer lands and what will happen to it. In particular, analyzers cannot see clearly through interface method calls or indirect function calls. Different tools have made different approximations. Vet recognizes a few definitely bad patterns, and we are adding a new one checking for mistakes using t.Parallel in Go 1.20. To avoid false positives, it also has many false negatives. Other checkers in the ecosystem err in the other direction. The commit log analysis showed some checkers were producing over 90% false positive rates in real code bases. (That is, when the checker was added to the code base, the “corrections” submitted at the same time were not fixing actual problems over 90% of the time in some commits.)

There is no perfect way to catch these bugs statically. Changing the semantics, on the other hand, eliminates them all.

Changing loop syntax entirely would cause unnecessary churn

We have talked in the past about introducing a different syntax for loops (for example, #24282), and then giving the new syntax the new semantics while deprecating the current syntax. Ultimately this would cause a very significant amount of churn disproportionate to the benefit: the vast majority of existing loops are correct and do not need any fixes. It seems like an extreme response to force an edit of every for loop that exists today while invalidating all existing documentation and then having two different for loops that Go programmers need to understand, especially compared to changing the semantics to match what people overwhelmingly expect when they write the code.

My goal for this discussion is to gather early feedback on the idea of making a change here at all, because that’s something we’ve previously treated as off the table, as well as any feedback on expected impact and what would help users most in a roll-out strategy. Thanks!

jaredpar · 2022-10-03T15:34:47Z

jaredpar
Oct 3, 2022

I know that C# 5 made this change as well, but I’ve been unable to find any retrospectives about how it was rolled out or how it went. If anyone knows more about how the C# transition went or has links to that information, please post that too. Thanks!

I work on the C# team and can offer perspective here.

The C# 5 rollout unconditionally changed the foreach loop variable to be per iteration. At the time of C# 5 there was no equivalent to Go's putting go 1.30 in a go.mod file so the only choice was break unconditionally or live with the behavior. The loop variable lifetime became a bit of a sore point pretty much the moment the language introduced lambda expressions for all the reasons you describe. As the language grew to leverage lambdas more through features like LINQ, async, Task Parallel Libraries, etc ... the problem got worse. It got so prevalent that the C# team decided the unconditional break was justified. It would be much easier to explain the change to the, hopefully, small number of customers impacted vs. continuing to explain the tricksy behavior to new customers.

This change was not taken lightly. It had been discussed internally for several years, blogs were written about it, lots of analysis of customer code, upper management buy off, etc ... In end though the change was rather anticlimactic. Yes it did break a small number of customers but it was smaller than expected. For the customers impacted they responded positively to our justifications and accepted the proposed code fixes to move forward.

I'm one of the main people who does customer feedback triage as well as someone who helps customers migrating to newer versions of the compiler that stumble onto unexpected behavior changes. That gives me a good sense of what pain points exist for tooling migration. This was a small blip when it was first introduced but quickly faded. Even as recently as a few years ago I was helping large code bases upgrade from C# 4. While they do hit other breaking changes we've had, they rarely hit this one. I'm honestly struggling to remember the last time I worked with a customer hitting this.

It's been ~10 years since this change was taken to the language and a lot has changed in that time. Projects have a property <LangVersion> that serves much the same purpose as the Go version in the go.mod file. These days when we introduce a significant breaking change we tie the behavior to <LangVersion> when possible. That helps separates the concept for customers of:

Acquiring a new toolset. This comes when you upgrade Visual Studio or the .NET SDK. We want these to be friction free actions so customers get latest bug / security fixes. This never changes <LangVersion> so breaks don't happen here.
Moving to a new language version. This is an explicit action the customer takes to move to a new .NET / C# version. It is understood this has some cost associated with it to account for breaking changes.

This separation has been very successful for us and allowed us to make changes that would not have been possible in the past. If we were doing this change today we'd almost certainly tie the break to a <LangVersion> update

5 replies

rsc Oct 3, 2022
Maintainer Author

Thank you very much @jaredpar for sharing this perspective!

mdempsky Oct 3, 2022

@jaredpar Thank you very much! That was very informative and insightful.

Do you have any further perspective to share on the decision to change foreach but not for? Was it considered an option to change for at the time? Has there been any discussion since about changing for? (Particularly since ES6 decided to use per-iteration bindings for 3-clause for(let loops.)

Internally within the Go compiler team, this issue has recurringly been discussed as: (1) Go's for/range loop semantics are unfortunate and error-prone; (2) it would also be unfortunate to change for/range loops without changing (non-range) for loops too; but (3) changing for loops is much more controversial still.

The compiler team has been leaning towards changing both, mostly motivated to make sure users can safely switch between for and for/range without having to think about variable binding semantics. But that's not yet a unanimously held position.

magical Oct 3, 2022

changing [non-range] for loops is much more controversial still.

For what it's worth, i'm satisfied with the answers you and Russ provided elsethread. As long as the solution can be narrowly tailored to fix the closure-in-a-loop problem without otherwise changing for loop semantics (which it sounds like it can) then i'm happy.

jaredpar Oct 3, 2022

Do you have any further perspective to share on the decision to change foreach but not for? Was it considered an option to change for at the time

It was considered but rejected.

One issue is that foreach was primarily a C# syntax at the time where for is a more general language syntax. The for loop in C# is really no different than for in C/C++ and a variety of other languages. Changing the semantics of foreach was asking developers to adjust how they thought about C#. But changing the semantics of for was asking developers to think differently about C# than they did for most of the other languages that they used. That's a dicey proposition.

The other thought was that in for loops it was much clearer what the lifetime was. There are explicit declaration with initializations, the conditional check and increment. Taken together the feeling was it was more obvious that the lifetime was for the entire loop. Anecdotally that matches the data I've seen in terms of bug reports over the years.

Has there been any discussion since about changing for? (Particularly since ES6 decided to use per-iteration bindings for 3-clause for(let loops.)

We haven't had any serious discussions about it that I remember.

A bit of context that may not be as obvious from the outside. At that time the C# team was extremely sensitive to breaking changes. Given we couldn't leverage <LangVersion> any toolset upgrade meant all customers saw all breaking changes. Even small breaks could result in significant push back from customers who just wanted to upgrade the tools. As such the team went to great lengths to avoid them whenever possible. Changing foreach was one of the biggest breaking change the language had really taken on. Going after for at the same time likely would've been a bridge too far. If we were approach the same discussion today I suspect there would be a stronger desire to change both.

zigo101 Oct 5, 2022

Are there any valid serious use cases preferring the current semantics? If there are none, the C# way is better than the go.mod way in my opinion. Most code readers don't care about the go version set in go.mod at all. The fact that the same piece of code behaves differently is a bad user experience and a potential security risk for many projects.

thepudds · 2022-10-03T16:55:17Z

thepudds
Oct 3, 2022
Collaborator

IIRC, I think the meaning of an absent go.mod or absentgo directive in a go.mod has changed one or two times. Circa Go 1.13, I think a missing go directive was interpreted as the current release. On the other hand, the current doc says:

As of the Go 1.17 release, if the go directive is missing, go 1.16 is assumed.

A somewhat minor detail, but as part of this discussion, it might be nice to explicitly state either that assumed version won’t change again, or that if this range variable change was hypothetically in go 1.30, then a missing go.mod or missing go directive would never be interpreted as > 1.29, or something along those lines.

1 reply

rsc Oct 3, 2022
Maintainer Author

Yes, we kept the implied version for an absent go line / go.mod file bumping forward for as long as the releases were completely compatible (or close enough). That stopped at Go 1.17, as you note. Since then, all releases assume that code without a go line gets Go 1.16 semantics. That is set in stone now and will never change.

VinGarcia · 2022-10-03T17:27:15Z

VinGarcia
Oct 3, 2022

We actually had an in-depth discussion about precisely this problem this morning in the Go Brasil telegram group (we are now joking you guys are spying on us hehe).

One thing to notice in this discussion is that even after having this problem explained multiple times by different people multiple developers were still having trouble understanding exactly what caused it and how to avoid it, and even the people that understood the problem in one context often failed to notice it would affect other types of for loops.

So we could argue that the current semantics make the learning curve for Go steeper.

PS: I have also had problems with this multiple times, once in production, thus, I am very in favor of this change even considering the breaking aspect of it.

2 replies

benhoyt Oct 3, 2022

This exactly matches my experience. It's relatively easy to understand the first example (taking the same address each time), but somewhat trickier to understand in the closure/goroutine case. And even when you do understand it, one forgets (apparently even Russ forgets!). In addition, issues with this often don't show up right away, and then when debugging an issue, I find it always takes a while to realize that it's "that old loop variable issue again".

puellanivis Oct 6, 2022

I also find there to be a frustrating split of “workarounds” for closures. Some people recommend redefinition, and others assignment through a function parameter:

for _, thing := range things {
  thing := thing // explanation for people who don’t understand the pattern
  go func() {
    do(thing)
  }()
}

vs.

for _, thing := range things {
  go func(thing Thing) {
    do(thing)
  }(thing)
}

I’ve tried to explain that the first option, while seemingly non-functional, is the better option: it uses type inference so a change in type name does not require a change in this loop; it shadows the per-loop thing variable meaning you cannot accidentally reference the per-loop value; definition and assignment precede use. While the latter additionally requires: going to the end of the loop to verify that the assumption that per-loop thing is assigned as the argument for the parameter thing in the anonymous function (that is, definition precedes usage, but usage precedes assignment); there is a tendency of people to dislike shadowing variable names, which leads them to using a different name for the parameter from the per-loop variable, which leaves the per-loop variable exposed and referenceable.

In the end, both patterns “solve” the problem, but neither of them in a wholly satisfactory way. The per-iteration redefinition provides the most concise and robust code, but really calls for documentation every time because maybe this is the first time a person has come across this pattern, and might think it superfluous.

All in all, I’m pretty happy to see this semantic change. Work around for more efficient only-once allocation being a welcome burden for not having to explain this extremely common bug every time it pops up. The only-once allocation being far more clear as a result as well, which doesn’t end up looking unnecessary:

var thing Thing // this is _obviously_ not per-iteration allocation, and requires no explanation
for _, thing = range things {
  do(thing)
}

magical · 2022-10-03T17:47:23Z

magical
Oct 3, 2022

I'm totally on board with changing for range loop semantics.

The case for changing the 3-clause for loop is less clear to me. I'm worried about code which intentionally modifies the loop variable in the loop body. E.g.

for i := 0; i < len(a); i++ {
   if a[i] == "" {
      a = append(a[:i], a[i+1:]...)
      i--
   }
}

Maybe not the best example -- but the point is that such code, while rare, almost certainly exists and could be silently broken by this change.

Note that this argument does not apply to for range loops because, unlike the 3-clause variant, the programmer does not have direct access to the loop variable.

I suppose the fix in this case is to move the variable declaration out of the for loop, though that seems almost as inscrutable a change as adding i := i.

-for i := 0; i < len(a); i++ {
+i := 0
+for ; i < len(a); i++ {

In general, 3-clause for loops are what you reach for when doing something "tricky". The exception, of course, is when you want a simple loop over a range of integers, in which case the 3-clause form is the only option. If there were a dedicated for i := range 0..n form (#180, #36308, #44394, #24282) then i think the 3-clause variant would become vanishingly rare.

15 replies

mdempsky Oct 4, 2022

@DeedleFake This discussion is only about changing loops that explicitly declare variables. Loops that do not declare variables, like your example, are unaffected

DeedleFake Oct 4, 2022

Oh, duh. After all, without a declaration, how would the compiler know which variable to copy and update on each iteration?

lesam Oct 4, 2022

I think the argument in #56010 (reply in thread) is a stronger reason to prefer leaving 3-argument for as-is, than the correctness arguments here.

It seems that the 3-argument for loop construction looking identical to many other languages, but being in fact slightly different, would be something difficult to teach and difficult for new learners to retain (since it generally 'just works' regardless).

hherman1 Oct 4, 2022

the 3-argument for loop construction looking identical to many other languages, but being in fact slightly different, would be something difficult to teach and difficult for new learners to retain (since it generally 'just works' regardless).

So would the 3 argument for loop looking similar to the range loop and working slightly differently

puellanivis Oct 6, 2022

The proposal that I read (unless rsc updated it between your comment and me reading it), is clear that in the 3 argument for loop, the per-iteration variable’s value would be copied into a per-loop variable prior to operation of the 3rd argument.

Thus, your presented code should still work fine:

for i := 0; i < len(a); i++ {
   if a[i] == "" {
      a = append(a[:i], a[i+1:]...)
      i-- // the value of the per-iteration loop set here will be copied into the per-loop `i` prior to the `i++`
   }
}

bhcleek · 2022-10-03T18:02:08Z

bhcleek
Oct 3, 2022

vendored dependencies

How would vendored dependencies be handled?

My concern is that since go mod vendor no longer vendors the go.mod file of dependencies, that the go directive of the main module would determine how the loop variable is handled, resulting in inconsistent behavior within the dependency itself depending on the go directive in the main module.

3 replies

bcmills Oct 3, 2022

For modules at go 1.17 or higher, go mod vendor preserves the go versions for dependencies (see https://go.dev/doc/go1.17#vendor / #36876).

bhcleek Oct 3, 2022

Thanks, @bcmills. I thought that was the case, but couldn't find it. I realize now that the reason I couldn't find it in modules.txt is that I happened to use a dependency that isn't a Go module.

How would loop variables be treated in those kinds of cases?

rsc Oct 3, 2022
Maintainer Author

Un-Go-versioned dependencies are treated as being written for Go 1.16 already for certain module analyses. The same would apply here.

Merovius · 2022-10-03T18:39:16Z

Merovius
Oct 3, 2022

If we do this, I would like us to be extra-loud about it. That is, if go mod edit is used to push the go line past 1.30, print an explicit warning about this change. Or something. Maybe based on running some static analysis; even if it isn't generally good, ISTM that a one-time "here is a checklist of all the loops you might want to have a look at" is more tolerant to false-positives than a general purpose linter.

I'd also like a tool that removes all my x := x lines, but that's more of a luxury 😆

2 replies

puellanivis Oct 6, 2022

Any message that does not break the build process is going to be overlooked by at least some people and all automation. There’s kind of the idea in Go that if something is worth warning about, then it’s worth breaking the build/automation.

That would mean we would want to stop any go mod edit that tries to advance past 1.30 in the prescribed instances.

ikiris Oct 11, 2022

This is what tests are for, not obscure warning messages to the final user that they then must go spelunking about because there might be a problem somewhere.

ardan-bkennedy · 2022-10-03T18:56:10Z

ardan-bkennedy
Oct 3, 2022

For what it's worth, I really like this change. I can easily teach to this and it strengthens the value semantic aspects of that loop. The ideas of how to use go.mod seem reasonable and valid.

0 replies

dlsniper · 2022-10-03T19:09:36Z

dlsniper
Oct 3, 2022

This is a good change and would make using table tests a lot easier to handle due to the per-instance rather than per-loop expectations.

As a further enhancement, perhaps better discussed in a different topic, would it be possible for the Go compiler to flag when such upgrades will change the way the code works and inform the users about it?

In your example:

The first compile is done with Go 1.29 and everything works as expected.
The second compile will use Go 1.30.
During this second compilation, the user receives messages similar to: Since Go 1.30, the loop on file:line will now produce different a different effect. See documentation-link to learn more about this change.

What do you think?

Edit:
The main problem this solves is related to people not following Go developments. When they upgrade to a newer Go version, and the compiler suddenly decides to work in a different way, at least from their point of view, it results in confusion. Having a way to determine what the compiler does differently between versions would be of great help and encourage people to safely upgrade their code. Maybe a different tool would solve it?

2 replies

rsc Oct 3, 2022
Maintainer Author

We can definitely provide tools for people to analyze their code to understand likely sources of changes, and we would absolutely do that.

I'm not sure it makes sense to track "was this code ever compiled with Go 1.30 before?" and print information during an ordinary build. That question can't be answered reliably (what if the upgrade happened on a different machine and was checked into the repo and you just ran git pull?).

dlsniper Oct 3, 2022

I'm not sure it makes sense to track "was this code ever compiled with Go 1.30 before?"

That's true. Thinking a bit more about this, maybe a (standalone?) Go tool that takes the compiler, with all its flags and, for a given codebase, runs all the Go versions between "current" and "target" and produces the following output:
Upgrading from current (Go 1.15) to target (Go 1.18) will result in the following changes:

Change 1 - file:line caused by since Go 1.17
Change 2 - file:line caused by since Go 1.16
Change 3 - file:line caused by since Go 1.18
...

The tool could keep track of more than just compiler changes. Deprecations, library behavior changes for functions, etc, could all be part of this.

Then the developers could run this tool and provide a clear upgrade path. This would make it easy to determine if/what work is required for developers to upgrade a codebase.

Edit: Integration with editors/other tools would then help people discover these issues easier.

cespare · 2022-10-03T19:17:15Z

cespare
Oct 3, 2022
Collaborator

There are two aspects of this that I'd like more details on, either as part of this discussion or in a future proposal:

I'd like more specifics about code that would go from correct to broken by this change. The post contains a text description of two of these:

One involved a handler registered using once.Do that needed access to the current iteration’s values on each invocation. The other involved low-level code running in a context when allocation is disallowed, [...]

I'm not sure I understand the first one. Could you give some pseudo-code? In general, I'm interested in any examples anyone can give of real (or even hypothetical) code that depends on the current semantics. (I don't think I've ever written code like that.)
My main concern is about the performance implication of this change. This is briefly mentioned:

Benchmarking of the public “bent” bench suite showed no statistically significant performance difference over all [...]

but it would be useful to read a comprehensive discussion from the compiler folks about this. What kinds of loops compile to exactly the same code? What kinds of loops will change in a way that probably won't matter? What kinds of performance-critical code might see regressions and require adjustment? (What kind of adjustment? Just moving a variable outside the loop?) What kinds of compiler optimizations might be implemented (either at the same time or in the future) to ameliorate those issues?

6 replies

rsc Oct 3, 2022
Maintainer Author

The code looked like, roughly:

var once sync.Once
for _, tt := range testCases {
    once.Do(func() {
        http.HandleFunc("/handler", func(w http.ResponseWriter, r *http.Request) {
            w.Write(tt.Body)
        })
    })
    
    result := get("/handler")
    if result != tt.Body {
        ...
    }
}

It took us a while to decode this too, and in fact the original was not as clear. (Not all code in the world is well written!)

If you compile your package with -gcflags=-m and see a moved to heap: i message about your loop variable, those will be allocated once per iteration instead of once per loop. We don't have a diagnostic for the case where a variable escapes from the loop but not the function. For example:
```
 % cat /tmp/x.go
 package p
 
 var global *int
 
 func f(x []string) {
 	var last *string
 	for i, v := range x {
 		if len(v)%3 == 0 {
 			global = &i
 			last = &v
 		}
 	}
 	println(*last)
 }
 % go build -gcflags=-m /tmp/x.go
 # command-line-arguments
 /tmp/x.go:5:6: can inline f
 /tmp/x.go:5:8: x does not escape
 /tmp/x.go:7:6: moved to heap: i
 % 
```
In this program, in the new semantics both i and v would need to be heap allocated once per iteration: i because its address escapes to a global, and v because its address escapes outside the loop context (to last). Today i is allocated once per loop, while v is not allocated at all.

As part of the process (in advance of whatever release made the change) we would provide a tool that gives a more accurate accounting of the places where allocation behavior (and semantics, same places) would change.

dr2chase Oct 3, 2022
Maintainer

The once.Do example would have been better written by hoisting the shared variable declarations outside the loop and doing the stubbing once, there. I thought it was very confusing code.

With the current implementation, if there is no apparent capture, the loop is not changed. I say "apparent" because the detector has to run before escape analysis and thus it is over-conservative; it looks for mention within closures and address-taking (address-taking also occurs when passing a V to a *V method or constructing a method value). I can see how in the future we'll have a discussion about "technical debt in the compiler" because its internal representation of for/for-range will have old-style capture, but doing it this way ensures usual-case no-overhead and makes it easier to detect where the change occurs.

If there is apparent capture in for-range, but escape analysis determines there is no escape, then it is very nearly the same loop.

If there is escape, then there will tend to be a new allocation for each iteration, and if that is bad, hoist the declaration prior to the loop and use "=" in the range, as in:

var i int
var v Value
for i, v = range whateverYieldsValues {
  ...
}

but escape analysis works well.

3-clause for loops with capture are much rarer (97% less common than range loops, over non-test Google code, and they didn't show up in tests at all), but the same treatment of escaping variables applies. The transformation we copied from JavaScript introduces a little branchy overhead that might not come out in optimization (but we might target that optimization in the future), but (1) if there is no increment clause, it can be omitted (that's already implemented) and (2) the same declaration-hoisting change that works for for-range also eliminates the 3-clause per-iteration variable, and thus eliminates the transformation and thus eliminates the branchy code.

So the workaround is not hard in either case.

There's also the possibility of a tool that could, if this change causes failure and you can't easily figure out where, pinpoint exactly the function and loop where it goes wrong, if we could figure out the right packaging for the tool. This would use the GOSSAHASH compiler debugging technique combined with automated binary search (that program is already written), specialized to just this problem. It would probably be designed to work with "go test".

aklepatc Oct 15, 2022

var once sync.Once
for _, tt := range testCases {
    once.Do(func() {
        http.HandleFunc("/handler", func(w http.ResponseWriter, r *http.Request) {
            w.Write(tt.Body)
        })
    })
    
    result := get("/handler")
    if result != tt.Body {
        ...
    }
}

@rsc could you please explain this? w.Write(tt.Body) implies that tt.Body is []byte. Then equality check if result != tt.Body { should not compile? Thank you.

Sorry for late (and possibly already answered) question.

Merovius Oct 15, 2022

@aklepatc I assume that's an error introduced by the reproduction from memory. You can either replace != with !bytes.Equal or w.Write(tt.Body) with io.WriteString(w, tt.Body) to fix it.

The explanation of that code is that when tt.Body is used in the once.Do, it closes by reference - tt is always an implicit pointer to the single loop variable tt. If you run this sequentially, get thus returns the value of tt.Body in that run of the loop, not the first one where the handler was registered. Thus, the test works. But it is far too clever for its own good (or, more likely, accidentally correct).

A cleaner way to do that would be

var body string
http.HandleFunc("/handler", func(w http.ResponseWriter, r *http.Request) {
    io.WriteString(w, body)
})
for _, tt := range testCases {
    body = tt.Body
    result := get("/handler")
    if result != tt.Body {
        ...
    }
}

aklepatc Oct 15, 2022

Ty @Merovius ! Bytes equality check was the only thing that didn't make sense about this code snippet. The rest was reasonably clear.

hyangah · 2022-10-03T22:57:35Z

hyangah
Oct 3, 2022
Maintainer

I am glad to see this.

Currently the go happily compiles a module even when its go.mod specifies a higher version as long as the module does not use a new language feature introduced in the higher version. For example, currently (as of today), gopls has go. 1.18 in its go.mod, I can still compile it with go 1.16 and even go 1.15. It is also true that I can import a module as a dependency even when the module has higher go version in its go.mod and I am stuck with an old version of go - as long as the dependency does not use a new language feature.

However, for this transition, this won't be the case. Am I understanding it correct?

More specifically, in the example that hypothetically assume the change is made in go 1.30, I think any attempt to compile or import the work module (with go 1.30 in its go.mod) with go 1.29 or older should report errors. Otherwise, code in main.go can be compiled differently.

And non-module based build systems (e.g. bazel) will also need a plan to move forward.

5 replies

timothy-king Oct 3, 2022

More specifically, in the example that hypothetically assume the change is made in go 1.30, I think any attempt to compile or import the work module (with go 1.30 in its go.mod) with go 1.29 or older should report errors.

I think we will want authors of modules to be able to upgrade their module to 1.30 without breaking their users (my understanding of forward compatibility). I do not feel we should consider this a breaking API change as it is an internal implementation detail. @rsc discussed forward compatible changes in #55092.

It could be reasonable for tooling to start warning about "may escape for/for range vars" once you have a mixed version workspace. The basis would be that it is a readability challenge to switch back and forth while navigating code.

And non-module based build systems (e.g. bazel) will also need a plan to move forward.

To facilitate upgrading from python 2 to 3, bazel has a python_version field for pybinary and srcs_version for pylibrary. Something similar can be added to go_library to facilitate large scale bazel migrations.

kaey Oct 4, 2022

This is discussed in #55092

No matter how the effective toolchain is specified, it would become a build error to attempt to use an old toolchain with new code.

rsc Oct 4, 2022
Maintainer Author

Yes, I believe shipping #55092 first would make it safer to do this change.

jhenstridge Oct 6, 2022

@rsc: reading through that proposal, it sounds like it would only upgrade the toolchain to the Go version of the main module of the project.

If the installed toolchain is Go 1.X, the main module's go.mod says go 1.Y, and a dependency module's go.mod says go 1.Z with X < Y < Z, would this work correctly if the loop variable changes are introduced in Go 1.Z?

brainy Oct 11, 2022

This question puzzles me as well, and I can't think of a way to make it work without upgrading the toolchain from 1.X to 1.Z (which may not be possible). Go 1.X can't know about the semantic change in 1.Z, so it couldn't even emit a warning other than a generic version-mismatch.

ncruces · 2022-10-04T09:36:29Z

ncruces
Oct 4, 2022

The proposal for the 3-clause form, with implied copies at the start and end of the iteration, seems inherently racy in the presence of goroutines that outlive the iteration. Is there any remedy for this, or do we live with the raciness of possibly changing the variable before the iteration proceeds?

9 replies

puellanivis Oct 6, 2022

It's contrived, but this code is well defined today but becomes racy under the proposed semantics:

😬 I don’t feel well with calling the proposed code “non-racy” just because it is not triggering the race-condition detection. I would in fact strongly assert that this is absolutely already racy code despite not being caught by the race detector.

That one can design a precarious piece of code that is both racy but does not trigger the race detector is indisputable, but one kind of has to expect that subtle changes can always easily “unexpectedly” break such precarious code.

ncruces Oct 6, 2022

@puellanivis, maybe you missed this detail, but the original is non-racy because the iteration only runs once (i < 1).

Increment is on a goroutine started by the original goroutine, so initialization happens-before increment. As those are the only two accesses, it's fine.

This change adds a third access at the end of the iteration, which races with the child goroutine.

puellanivis Oct 6, 2022

@puellanivis, maybe you missed this detail, but the original is non-racy because the iteration only runs once (i < 1).

No, I did not miss that the code does not trigger the race detector because it’s secretly basically single-threaded. However, changing that 1 to a 2 and you now trigger the race detector, adding any access to x in the main body of the loop, or the 2nd or 3rd statements of the for loop, all also triggers the race detector.

This is specifically why I called the code “precarious”. It is inherently racey even though it does not trigger the race detector, which it does only because it has been specially designed to evade the race detector.

ncruces Oct 6, 2022

It doesn't trigger the race detector because there is no data race, not for a flaw in the race detector.
I think the distinction is important, because a read of your comments, to me, implies otherwise.

That the example is contrived is explicitly stated, and acknowledged in the reactions.
I similarly acknowledged that I failed find convincing examples of races introduced unwittingly by this change of behaviour.

PS: adding an access to x in the main body of the loop before starting the goroutine is not a data race either (starting a goroutine happens-before it running).

Merovius Oct 6, 2022

@puellanivis FWIW I agree with @ncruces and @mdempsky. The code is not racy. That it is easy to change it into racy code does not change that. Most race-free code can easily be modified into being racy.

It's not good code and it's certainly precarious code, which is why @mdempsky said it's contrived. But it currently has well-defined semantics and does not contain a race and it would contain a race under the proposed change.

However, when entertaining this change we are already accepting that it would break compatibility and trying to quantify that breakage. I hope we all agree that this example is sufficiently contrived not to measurably change "how breaking" changing loop-variables would be.

erincandescent · 2022-10-04T11:59:34Z

erincandescent
Oct 4, 2022

If theres a desire to de-risk this against breakage to existing code, perhaps one option would be to make capturing a loop variable a compilation error for a version or two?

<v1.X: Current behaviour
>=v1.X, <v1.Y: Compilation error
>=v1.Y: New behaviour

1 reply

rsc Oct 4, 2022
Maintainer Author

Not all loop captures are bugs. The vast majority of loop captures today are fine. The "compilation error" phase would break all of today's correct code, causing unnecessary churn.

mhagger · 2022-10-04T13:29:55Z

mhagger
Oct 4, 2022

I like this change and think it would eliminate one of the biggest footguns in Go.

Is it possible to mechanically rewrite old code so that it compiles, under the new semantics, to have exactly the pre-change behavior? Granted, this tool would need to be conservative in the cases where a static analysis tool can't be sure whether the change in for-loop semantics will change the program behavior.

For example,

var all []*Item
for _, item := range items {
	all = append(all, &item)
}

is probably broken code. But suppose that we wanted to compile this under the proposed new for-loop semantics while getting the old behavior. That might look like

var all []*Item
var item Item
for _, item = range items {
	all = append(all, &item)
}

(assuming that item is not redeclared later in the function; otherwise a new scope might have to be introduced or a unique variable name invented). Or, more to the point,

var item Item
for range items {
	all = append(all, &item)
}

If such a rewrite tool existed, then a conservative workflow for transitioning a project to the new for-loop semantics could be:

Git clone repository
Run "rewrite" tool that converts go.mod to 1.30 and also rewrites all affected loops in such a way that they are guaranteed to retain the old semantics.
git diff to inspect the automated code changes. Revert the changes that are unwanted (probably all of them!), either because they were false positives or because the rewriting of the code has made pre-existing bugs obvious.
git commit the upgrade to 1.30 plus any changes (probably none!) that are actually desired.

Even better would be for the rewrite tool to delete any now-provably-unnecessary "foo := foo" lines, so that old code can benefit from this simplification.

Such a tool would make it easy for people to find the places in their code that might be affected by the new semantics, Though I also see some danger that many people might just commit the rewritten code, thus perpetuating any pre-existing bugs that might otherwise have been fixed by the transition to the new semantics. Therefore, another option might be for the "rewrite" tool to insert commented-out code, or just add comments that highlight the places where the behavior might change under the new for-loop semantics.

2 replies

rsc Oct 4, 2022
Maintainer Author

That tool can be written and probably will be. Given how exceedingly rare the old behavior is the correct one, I am not 100% sure that using such a tool is the right way to approach a migration. The "git diff" is going to be error-prone and tedious. If you have good tests, testing and looking into failures is probably a better approach. But the tool will be important to have nonetheless.

timothy-king Oct 4, 2022

To have a good implementation of such a tool it is important to have somewhat good escape analysis. You do not want all conversions to an interface or method calls to a pointer receiver to cause a "may escape" warning. If "(adjusted -m output)" happens like Russ mentioned, we have some preliminary evidence such a tool is likely to be reasonable for many people but not churn free though. (Take a look at the "Changing the semantics is usually a no-op" section for the evidence available.)

Therefore, another option might be for the "rewrite" tool to insert commented-out code

I had not yet considered creating a TODO list via comments yet. Such comments could searchable if the text is somewhat unique. It can be applied pre-transition. This would not be churn free, but it could be tackled over time, given as an onboarding project, additional tooling, etc. It may be a nice alternative to hosting the declaration before the loop, which could be forgotten (and is more likely to keep old bugs). Thanks for the idea.

mattn · 2022-10-04T13:31:01Z

mattn
Oct 4, 2022
Collaborator

Not sure, but I am worry about that this change will work correctly when using "go generate" between mixed versions. For example, if we generate Go codes for an older version with this change in effect. Of course, this is the generator's responsibility to consider older versions, but we believe it must be considered.

1 reply

rsc Oct 4, 2022
Maintainer Author

A fair point, and yet another reason for the general rule that we don't make breaking changes. Generators would need to emit code that works with either semantics, but given the very low rate of code that is correct today and incorrect with the changed semantics, I suspect the vast majority of generators are fine already.

kalexmills · 2022-10-04T13:46:32Z

kalexmills
Oct 4, 2022

It could be helpful to provide a link to this related prior work around determining the scope of the problem on GitHub.

https://github.com/github-vet

2 replies

rsc Oct 4, 2022
Maintainer Author

Thanks. I don't remember seeing that project before. I'm curious what analyzer it is using. I clicked on three issues at random from the first page of issues in rangeloop-pointer-findings, and only one of them is a real bug:

theanalyst/rlyeh: main.go; 23 LoC github-vet/rangeloop-pointer-findings#18446
real bug
LLLSic/fuwujisuan: demo/entity/storage.go; 5 LoC github-vet/rangeloop-pointer-findings#18439
not a bug, assuming filter doesn't keep &v
google/knative-gcp: pkg/reconciler/intevents/pullsubscription/reconciler.go; 5 LoC github-vet/rangeloop-pointer-findings#18430
not a bug, assuming IsControlledBy doesn't keep &dep

kalexmills Oct 4, 2022

Yep! The goal of this project was to find all the instances with false positives and rely on crowdsourcing to filter through the false positives. That didn't pan out.

IIRC, the analyzer is based on looppointer with some additional customizations. It's been a while since I worked on it. The initial pass didn't use the type-checker. I started bringing in type-checking before being pulled away by a new job.

thejerf · 2022-10-11T14:31:46Z

thejerf
Oct 11, 2022

I am in favor of this; an experience report from C# is hard to trump.

But I want to relay some experience from helping people on /r/golang. Many people are bringing in concepts from dynamically typed languages like Python when they come to Go. I have found it very productive to highlight to them the importance of allocation in Go. It is easy to explain than when you say var or :=, you are allocating something. := already kind of obscures this because it must allocate at least one value but it won't reallocate existing values, but in general I've had good success teaching people to understand Go by understanding the importance of allocation in Go and that a variable represents a particular location, rather than being a post-it note that can be moved from value to value like in Pytho. It generally helps them understand this issue, why maps work they way they do with values vs. pointers, why := in a for loop is allocating each time, etc., and just generally cleans up a lot of little misunderstandings they have.

Go's nice syntax glosses on the act of allocation also in practice prevent people from coming to this understanding. It's theoretically documented in the language spec, of course, but if you don't know to look out for it, you won't see it, and these people don't know to look out for it.

This change will weaken the message I've been giving people about how one var or := generally corresponds to one such allocation, because now one := in the source code will represent an allocation per loop. Not a fatal problem, but a more complicated message.

I would suggest that pairing a blog post announcing this with a discussion about the importance of allocation in Go, how that differs from dynamic languages, and how it provides a unified understanding of many things considered "quirks" in Go would be a useful resource I'd link many people to over time. It might even make a good hook for the whole post.

I'm willing to sketch such a thing out if there is interest.

0 replies

laoshaw · 2022-10-11T14:48:54Z

laoshaw
Oct 11, 2022

python has similar "issue", just document it and put it into a small set of 'idiomatic rules', no need for break changes to me.

1 reply

Merovius Oct 11, 2022

It is documented, the guidelines exist, the bugs are still introduced. We know for certain, based on empirical evidence, that this is not a solution.

hydrogen18 · 2022-10-11T15:08:27Z

hydrogen18
Oct 11, 2022

I really do not see this as a useful change. These changes always have the best intentions, but the reality is that the language works just fine now. This well intended change slowly creep in over time, until you wind up with the C++ language yet again. If someone can't understand a relatively simple design decision like this one, they are not going to understand how to properly use channels and other language features of Go.

Burying a change to the semantics of the language in go.mod is absolutely bonkers . It's supposed to be controlling modules, not the semantics of the language. If you really want to do this, then go.mod needs to be deprecated and replaced with go.config.

1 reply

Merovius Oct 11, 2022

Burying a change to the semantics of the language in go.mod is absolutely bonkers. It's supposed to be controlling modules, not the semantics of the language.

We already use go.mod to control the usage of other language features, like type parameters. That's exactly the intention behind the go directive in go.mod, which was added a long time ago.

cheesycod · 2022-10-11T17:19:57Z

cheesycod
Oct 11, 2022

Honestly, this should just be a //go:newfor comment or a prefor statement instead of a language change

OR we can have a import that enables the semantics per file

I for one despise breaking changes to the language that arent opt in (i like opt in so they can be reviewed at my own convenience without losing other new features of the language). Per file would be better if possible

3 replies

Merovius Oct 11, 2022

this should just be a //go:newfor comment

This suggestion is addressed in the Go 2 transition document:

There is no obvious way to ever remove any of these special imports, so they will tend to accumulate over time. Python and Perl avoid the accumulation problem by intentionally making a backward incompatible change. After moving to Python 3 or Perl 6, the accumulated feature requests can be discarded. Since Go is trying to avoid a large backward incompatible change, there would be no clear way to ever remove these imports.

a prefor statement

If I understand you correctly, this is addressed in the "Changing loop syntax entirely would cause unnecessary churn" section of the top post.

If you have new data or new, so far not considered arguments, they would be welcome.

timothy-king Oct 11, 2022

Per file controls is somewhat doable via build tags or a similar mechanism. That does not really solve what you are discussing though. My main concern about per file control is that this is not super compatible with the rest of the tooling and build ecosystem. (Kinda minor technical point but it does matter.) Per package controls would be quite a bit smoother to build today. That is a smaller unit that per module. But it is not as fine grained as per file or per loop.

FWIW there are discussions of having a tool that would annotate/rewrite existing loops that may escape. This could be done before or during an update. Doing this step would be opt-in. This would let folks that want to review over time do so. I realize this is not quite what you are requesting, but it may cover some concerns.

peterebden Oct 11, 2022

If the author knows to use special annotations, they at least probably already know how to avoid this. It's a better outcome to make it be less surprising for developers less familiar with Go.

skybrian · 2022-10-11T21:08:06Z

skybrian
Oct 11, 2022

It seems like a good idea, but what about making some small syntactical change to for loops at the same time, such that the old syntax doesn't compile and the new syntax works after changing go.mod?

Otherwise, when reading for loops in Go code (which might be in a context where we don't have quick access to go.mod - consider a code review or someone asking for help), we won't know which semantics apply: the old way or the new way?

If there's a change in syntax (which can be made automatically at the same time as changing go.mod) then the old syntax will look increasingly dated and people will be aware that they're looking at code that's doing it the old way. Or at least, that something weird is going on, if they're only familiar with the new syntax.

I don't have any specific suggestion for improving the syntax, though; the difficulty would be coming up with something people like.

1 reply

Merovius Oct 11, 2022

what about making some small syntactical change to for loops at the same time, such that the old syntax doesn't compile and the new syntax works after changing go.mod?

Please see the section "Changing loop syntax entirely would cause unnecessary churn" in the top post.

marcusatzimpler · 2022-10-11T21:22:08Z

marcusatzimpler
Oct 11, 2022

I 100% support this.

0 replies

htemelski · 2022-10-12T07:05:33Z

htemelski
Oct 12, 2022

I am generally positive about this proposition.
It makes perfect sense for the for ... := range ... loop to define its parameters per-iteration.
But I'm not so sure about the 3 clause loop. for i := 0; i < x; i++ sometimes you have to modify i within the loop, which as far as I understand would be impossible if i is per-iteration, and it would lead to the following code

i := 0
for ; i < x; i++ {

which is far from elegant

In the cases that the go team decides to keeps, the two types of loops consistent I oppose this proposition

2 replies

Merovius Oct 12, 2022

From the top post:

In the 3-clause form, the start of the iteration body copies the per-loop i into a per-iteration i, and then the end of the body (or any continue statement) copies the current value of the per-iteration i back to the per-loop i. Unless a variable is captured like in the above example, nothing changes about how the loop executes.

The loop variables are copied back-and-forth so nothing changes. You can still modify i within the 3-clause loop and the modification will be copied to the variable for the next iteration.

htemelski Oct 12, 2022

Oh, I've missed that.
In that case, I see no problems with the proposition.

LukasGelbmann · 2022-10-12T18:59:55Z

LukasGelbmann
Oct 12, 2022

I'm only a casual user of Go so please take my comment with an appropriate grain of salt.

I'd welcome a change in semantics in the range loop, but caution against a change in the three-clause form.

I think there's an important difference between the range loop and the 3-clause loop: a non-expert reader of the range loop can't see "how often declaration happens", while in the 3-clause loop declaration clearly seems to happen just once.

Take a range loop like this:

for _, item := range items {
    all = append(all, &item)
}

This loop can easily be read by non-experts either as (pseudocode!):

var item Item
while hasNext(items) {
    item = next(items)
    all = append(all, &item)
}

Or alternatively:

while hasNext(items) {
    item := next(items)
    all = append(all, &item)
}

The syntax offers only weak hints to decide which interpretation is the right one. The reader of the range loop needs to know which interpretation is right before they can tell whether there is a bug.

On the other hand, take a 3-clause loop like this:

for i := 1; i <= 3; i++ {
    prints = append(prints, func() { fmt.Println(i) })
}

This syntax, to me, implies that:

i is declared only once (because i only gets the value 1 once).
All references to i are referring to the same variable.

So to me, this loop reads unambiguously as (pseudocode):

i := 1
while i <= 3 {
    prints = append(prints, func() { fmt.Println(i) })
    i++
}

This means the error in the 3-clause loop (under current semantics) is clear upon close inspection, if one knows how closures work in Go. Changing the semantics and adding an implicit new declaration of a separate i subverts expectations set up by the syntax, in my opinion.

Conretely, I'd caution against a change in the three-clause loop for 3 reasons:

The new semantics of the 3-clause loop are counterintuitive. Correct code looks like a bug under the new semantics, and bugs could look like correct code.
The new semantics of the 3-clause loop could lead novices and intermediate users to build a wrong mental model of the language.

When a novice reads code like the 3-clause loop above, and they see that it works, what are they going to think? I think it's easy to build a mental model where a closure captures the current value of a referenced variable (i). Which is completely wrong and leads to problems down the line.
The benefits of changing the 3-clause loop are smaller than the benefits of changing the range loop.

I don't have any data on this and may be wrong, but I think changing the 3-clause loop would fix fewer existing bugs, and also prevent fewer future bugs, than changing the range loop.

0 replies

skullchap · 2022-10-13T04:22:53Z

skullchap
Oct 13, 2022

add warning in linter for existing behavior when trying to use address of loop variable
allow such construct

for _, pItem := range &items {
	all = append(all, pItem)
}

0 replies

serkakres · 2022-10-13T09:14:32Z

serkakres
Oct 13, 2022

Is it too bad having 2 keywords for iterating? for loop uses same variable in all iterations, foreach loop redeclares variable in each iteration.
I prefer a new foreach loop for redeclaring loop variable in each iteration. So there will be no need to change current for loop semantics.

var all []*Item
foreach _, item := range items {
	all = append(all, &item)
}

17 replies

scott-cotton Oct 14, 2022

Thanks for clarifying. I do disagree with the how the arguments associated with that branch of the dichotomy apply to the example of for i,v,p := range slice to conclude 'more churn'. Even more so when considering
the net churn of combination of for I,v,p := range slice applied first with opt-in application to clean up &v then per-iteration semantics.

W.r.t. undo proliferation of ways to express loops causing confusion, I do not feel that for i,v,p := range slice or something like it would be undo proliferation. Certainly, the proliferation of functional+generic looping constructs mixed with the current loop syntax seems worse w.r.t. this criterion.

Merovius Oct 14, 2022

Certainly, the proliferation of functional+generic looping constructs mixed with the current loop syntax seems worse w.r.t. this criterion.

I don't know what you mean by "functional". The generic iterator design does indeed add a new looping construct and thus does add the same level of overhead, yes. And it has to pay for that cost by demonstrating a commensurate benefit. Note that I also pointed out that your suggestion does not actually help solving the problem, as it still requires a programmer to know to use it - and if they know, they can already use x := x. So your suggestion has very little benefit.

scott-cotton Oct 14, 2022

Note that I also pointed out that your suggestion does not actually help solving the problem, as it still requires a programmer to know to use it - and if they know, they can already use x := x. So your suggestion has very little benefit.

It would help me solve the problem, as outlined previously, by providing a convenient construct which encourages avoiding &v within for _, v := range items and can be used as a candidate replacement for &v where that may appear in code, all without anywhere near the backward incompatibility impact of changing semantics. &v was one of the items cited in the code analysis.

I disagree that the problem is requiring an understanding of allocations and scope -- I don't get that impression from the top doc either. Personally, I like that Go encourages understanding of scope and would discourage x := x for what I find to be obvious readability deficiencies. If I had to characterise the relationship between knowledge requirement and the stated problem, I'd say that Go as it is today fails to make understanding the scope+allocations around loop variables clear in the case of &v and closures. Similarly, I think Go is a bit too verbose for capture by value in nested closures.

Both of these problems can be alleviated without changing semantics. Alleviating these problems would in turn reduce the backward incompatible impact of changing the loop semantics, should that occur.

Merovius Oct 14, 2022

It would help me solve the problem, as outlined previously, by providing a convenient construct which encourages avoiding &v within for _, v := range items

You can already do that with &items[i]. So, again: If you know there's a problem, you can already solve it. The problem is that people still make those mistakes. That's the problem we are trying to fix.

timothy-king Oct 14, 2022

@scott-cotton "or incur calculating a slice of pointers." was what I was alluding to. This is obviously not always appropriate, and &item[i] is available for the general case. My main point in suggesting a function that constructs a slice of pointers is that the specific example in the top post has an alternative appealing library solution. While the "Item&" issue is interesting, I hope the conversation will not get too hung up on this specific example.

Personally I actually worry a lot less about the "&v" cases. Those make you write an "&" to give you a hint that an address was just taken and to think about storage duration. The hard to locally reason about cases are closures, interface capture, and method receivers. Those just use v and are influenced by other functions and maybe even functions in different packages.

FWIW an unfortunate characteristic of have some range statement that produces pointers like for index, item, ptr := range items is that items must be a slice, an array or a ptr to an array. It cannot be a map, a channel or a string.

ncruces · 2022-10-14T09:22:44Z

ncruces
Oct 14, 2022

The argument from the top post is that this breaking change fixes more bugs than it causes, even for existing code. And this is an assertion based on having looked at code containing 1 million loops, and only finding 2 instances of current semantics being correctly depended upon, with Ian arguing that "for sure in one case, [probably] in the other case, that the code worked by accident."

At this point, I think it would be very helpful if the people who disagree, to post actual counter examples of this from real code bases. That is: situations where the current semantics are being correctly depended upon (edit: and which the new semantics would break).

Otherwise, we seem to be discussing how a certain rule prevents us from fixing a real bug.

18 replies

Merovius Oct 14, 2022

Do you have a real program which depends on the current semantics? That is the question of this thread. Your original answer was "a program which relies on the spec guaranteeing a copy not happening would", which is simply wrong. The spec makes no such guarantee.

You can discuss that if you find it interesting, but i suggest doing it on a different forum.

zephyrtronium Oct 14, 2022

To me, this implies that variables are not re-declared in each iteration, because the spec refers to assignments (which are '=' not ':=') and continues the pattern of distinguishing declaration and assignment/use in describing what happens on each iteration.

You seem to be talking now about the number of declarations, whereas your original statement was about the number of copies. I think that is a source of confusion here. The spec currently guarantees that a program executes as if there is one copy from a ranged collection to the second range variable per execution of the loop, and that guarantee will not change with the proposal. The spec guarantees that a range loop declares its variables exactly once, which is exactly the thing proposed to be changed. So, your argument now reads as "a program that currently depends on there being one declaration per loop would be an example of a program that currently depends on there being one declaration per loop."

timothy-king Oct 14, 2022

To maybe get back to the original question in the thread, I'll give an example of the sort of thing being requested:

func sum(l []int) int {
  m := make(map[any]int)
  for i, v := range l {
    m[&i] = m[&i] + v
  }
  for _, v := range m {
    return v
  }
  return 0
}

This is an implementation that sums a slice of integers that is correct under the current semantics that would be broken by the new semantics. However this is a contrived example.

Do folks have examples from real code (historical or current) that would transition from correct to broken?

aklepatc Oct 14, 2022

This is very impressive, IMHO. I didn't think a meaningful (however contrived) example like this was even possible.

Merovius Oct 14, 2022

There's also another contrived example here and a real example, which I really liked here. However, even the real example just worked accidentally, not because the author sat down and considered the implications of loop variable allocation and decided to do it this way. And yes, that was the point of this thread, AIUI, trying to find a case where an author fully intentionally and correctly relied on this behavior.

timothy-king · 2022-10-14T18:19:04Z

timothy-king
Oct 14, 2022

One idea would be to release a GOEXPERIMENT flag to turn on compiling using the new semantics. This would allow for folks to run their own unit tests, integration tests, canaries, performance monitoring, etc. under the new semantics. This could be a source of additional data from the community if we think we need more data to make a decision. Later this could be helpful before making a transition for performance sensitive projects.

There is some risk projects would start to rely on this, but this may be relatively small if it is appropriately marked as going away.

2 replies

thepudds Oct 14, 2022
Collaborator

I think @timothy-king makes a good point that real data from more people would be worthwhile.

In case anyone missed it, there is already a CL available for an early prototype from @dr2chase that is very easy to try out:

#56010 (reply in thread)

(And sorry, posting "in case you missed it" comments in a long discussion usually just makes the discussion even longer and even harder to follow, but maybe worth it this one time 😅, including because those comments from @dr2chase are currently stuck inside the GitHub "hidden items" wormhole).

dr2chase Oct 14, 2022
Maintainer

One thing to watch for with the prototype is that it is a work in progress, and it may at some point acquire Go-version awareness. Because of inlining, if/when this happens, the compiler will need to know that this for loop is from version 1.19 and that for loop is from version 1.hasTheChange and compile them differently. IF/WHEN I do this, most likely I'll pretend that whatever the current dev version is has the change, though this is guaranteed to be a lie well into next year (cannot happen before 1.22, assuming proposal approved).

Real data would be interesting, and do note that negative results are valuable here -- if we only hear from people with problems, we'll only have numerator, not denominator. (Negative results -- for the benchmarks in "bent", some CLs out from what's checked in now so it includes a couple more, no notable performance problems, no differences in test failures.)

Maydayor · 2022-10-19T12:48:27Z

Maydayor
Oct 19, 2022

Support it absolutely. But can we have a hint when using different go version?

0 replies

tadhunt · 2022-10-19T16:27:09Z

tadhunt
Oct 19, 2022

I just unexpectedly ran into this problem today and figured I'd chime in with my example and my 2¢. Semantically, my code is registering callbacks to receive notifications on a bunch of different key/value updates from a database.

type Subscription struct {
	  key string
	  fn func(ctx context.Context, value string, ok bool) error
}

subscriptions := []Subscription{
	  Subscription{key1, key1handler},
	  Subscription{key2, key2handler},
	  ...
}

for _, subscription := range subscriptions {
	  database.Subscribe(subscription.key, func(k, v string) {
	  	  // ... do some common stuff ...
	  	  subscription.fn(k, v)
	  })
}

When I ran it, counter-intuitively, key2handler was invoked for changes to the value of both key1 and key2. Took me a while to trace the issue to the one described in this discussion. If I hadn't previously been aware I certainly would have gone down the rathole of thinking I found a compiler issue and wasted a ton of time tracking down the real root cause (that I didn't really understand loop variable semantics).

Not sure if this adds much to the discussion but it's a real world example where the change would have eliminated at least a couple of hours of debugging.

Count me in as in favor of this proposal!

0 replies

rsc · 2022-10-20T04:25:21Z

rsc
Oct 20, 2022
Maintainer Author

This discussion has been very helpful. Thanks to everyone. I think everything people want to say has been said, as evidenced by recent repetition of older comments, so I'm going to close this discussion. Thanks again!

0 replies

redefining for loop variable semantics #56010

rsc Oct 3, 2022 Maintainer

A decade of experience shows the cost of the current semantics

Old code is unaffected, compiling exactly as before

Changing the semantics is usually a no-op, and when it’s not, it fixes buggy code far more often than it breaks correct code

Good tooling can help users identify exactly the loops that need the most scrutiny during the transition

Static analysis is not a viable alternative

Changing loop syntax entirely would cause unnecessary churn

Replies: 50 comments · 241 replies

rsc Oct 3, 2022 Maintainer Author

thepudds Oct 3, 2022 Collaborator

rsc Oct 3, 2022 Maintainer Author

vendored dependencies

rsc Oct 3, 2022 Maintainer Author

rsc Oct 3, 2022 Maintainer Author

cespare Oct 3, 2022 Collaborator

rsc Oct 3, 2022 Maintainer Author

dr2chase Oct 3, 2022 Maintainer

hyangah Oct 3, 2022 Maintainer

rsc Oct 4, 2022 Maintainer Author

rsc Oct 4, 2022 Maintainer Author

rsc Oct 4, 2022 Maintainer Author

mattn Oct 4, 2022 Collaborator

rsc Oct 4, 2022 Maintainer Author

rsc
Oct 3, 2022
Maintainer

Replies: 50 comments 241 replies

rsc Oct 3, 2022
Maintainer Author

thepudds
Oct 3, 2022
Collaborator

rsc Oct 3, 2022
Maintainer Author

rsc Oct 3, 2022
Maintainer Author

rsc Oct 3, 2022
Maintainer Author

cespare
Oct 3, 2022
Collaborator

rsc Oct 3, 2022
Maintainer Author

dr2chase Oct 3, 2022
Maintainer

hyangah
Oct 3, 2022
Maintainer

rsc Oct 4, 2022
Maintainer Author

rsc Oct 4, 2022
Maintainer Author

rsc Oct 4, 2022
Maintainer Author

mattn
Oct 4, 2022
Collaborator

rsc Oct 4, 2022
Maintainer Author