Commit 8489d8d

Adds "Legacy Syntax Editions" to the GitHub code repository.
PiperOrigin-RevId: 574248480
1 parent cbef12f commit 8489d8d

2 files changed: +145 -0 lines changed

docs/design/editions/README.md

+1
@@ -35,3 +35,4 @@ The following topics are in this repository:
 * [Edition Evolution](edition-evolution.md)
 * [Edition Naming](edition-naming.md)
 * [Editions Feature Visibility](editions-feature-visibility.md)
+* [Legacy Syntax Editions](legacy-syntax-editions.md)
@@ -0,0 +1,144 @@
# Legacy Syntax Editions

**Author:** [@mkruskal-google](https://github.com/mkruskal-google)

**Approved:** 2023-09-08

Should proto2/proto3 be treated as editions?

## Background

[Edition Zero Features](edition-zero-features.md) lays out our plan for edition
2023, which will unify proto2 and proto3. Since early in the design process,
we've discussed the possibility of making proto2 and proto3 "special" editions,
but never laid out what exactly it would look like or determined if it was
necessary.

We recently redesigned editions to be represented as enums
([Edition Naming](edition-naming.md)), and also how edition defaults are
propagated to generators and runtimes
([Editions: Life of a FeatureSet](editions-life-of-a-featureset.md)). With these
changes, there could be an opportunity to special-case proto2 and proto3 in a
beneficial way.

## Problem Description

The original plan was to keep editions and syntax orthogonal, but implemented
naively that means we'd be supporting two very different codebases. This has
serious maintenance costs, especially when it comes to test coverage. We could
expect to have sub-optimal test coverage of editions initially, which would
gradually become poor coverage of syntax later. Since we need to support both
syntax and editions long-term, this isn't ideal.

In the implementation of editions in C++, we decided to unify a lot of the
infrastructure to avoid this issue. We define global feature sets for proto2 and
proto3, and try to use those internally instead of checking syntax directly.
Pushing the syntax/editions branch earlier in the stack gives us a lot of
indirect test coverage for editions much sooner.

A separate issue is how Prototiller will support the conversion of syntax to
edition 2023. For features it knows about, we can hardcode defaults into the
transforms. However, third-party feature owners will have no way of signaling
what the old proto2/proto3 behavior was, so Prototiller won't be able to provide
any transformations by default. They'd need to provide custom Prototiller
transforms hardcoding all of their features.

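As a rough illustration of the unification described in this section, the sketch below resolves a feature set once per file, from either the legacy syntax or the edition, so that later code never branches on syntax. It is a minimal Python sketch, not the actual C++ implementation: `FeatureSet`, `PROTO2_FEATURES`, `PROTO3_FEATURES`, `EDITION_2023_FEATURES`, `resolve_file_features`, and `has_presence` are hypothetical names, and only field presence is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureSet:
    """Hypothetical, heavily reduced stand-in for a resolved feature set."""
    field_presence: str  # "EXPLICIT" or "IMPLICIT"


# Global feature sets for the legacy syntaxes (illustrative subset only).
PROTO2_FEATURES = FeatureSet(field_presence="EXPLICIT")
PROTO3_FEATURES = FeatureSet(field_presence="IMPLICIT")
EDITION_2023_FEATURES = FeatureSet(field_presence="EXPLICIT")


def resolve_file_features(syntax: str, edition: Optional[str] = None) -> FeatureSet:
    """The only place in this sketch that branches on syntax vs. editions."""
    if syntax == "proto2":
        return PROTO2_FEATURES
    if syntax == "proto3":
        return PROTO3_FEATURES
    if syntax == "editions" and edition == "2023":
        return EDITION_2023_FEATURES
    raise ValueError(f"unsupported syntax/edition: {syntax!r}/{edition!r}")


def has_presence(features: FeatureSet) -> bool:
    """Downstream logic consults features only, never the syntax."""
    return features.field_presence == "EXPLICIT"
```

Because `has_presence` takes only a feature set, any test that exercises it with a proto2 or proto3 file also exercises the code path that editions files will use.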
## Recommended Solution

We recommend adding two new special editions to our current set:

```
enum Edition {
  EDITION_UNKNOWN = 0;
  EDITION_PROTO2 = 998;
  EDITION_PROTO3 = 999;
  EDITION_2023 = 1000;
}
```

These will be treated the same as any other edition, except in our parser, which
will reject `edition = "proto2"` and `edition = "proto3"` in proto files. The
real benefit here is that this allows features to specify what their
proto2/proto3 defaults are, making it easier for Prototiller to handle
migration. It also allows generators and runtimes to unify their internals more
completely, treating proto2/proto3 files exactly the same as editions.

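A minimal Python sketch of how this could fit together, assuming a per-feature defaults table keyed by edition where the latest entry at or before the requested edition wins. The enum values mirror the definition above. Everything else (`parse_edition_string`, `FIELD_PRESENCE_DEFAULTS`, `default_for`, `migration_override`, and the error messages) is hypothetical and is not the protoc or Prototiller API.

```python
from enum import IntEnum


class Edition(IntEnum):
    EDITION_UNKNOWN = 0
    EDITION_PROTO2 = 998
    EDITION_PROTO3 = 999
    EDITION_2023 = 1000


# Names accepted after `edition = ` in a .proto file (hypothetical table).
_PARSEABLE_EDITIONS = {"2023": Edition.EDITION_2023}


def parse_edition_string(name: str) -> Edition:
    """Rejects the legacy editions, which are only reachable via `syntax`."""
    if name in ("proto2", "proto3"):
        raise ValueError(f'edition = "{name}" is not allowed in proto files')
    try:
        return _PARSEABLE_EDITIONS[name]
    except KeyError:
        raise ValueError(f"unknown edition: {name!r}") from None


# Hypothetical defaults for one feature: (edition, default value).
FIELD_PRESENCE_DEFAULTS = [
    (Edition.EDITION_PROTO2, "EXPLICIT"),
    (Edition.EDITION_PROTO3, "IMPLICIT"),
    (Edition.EDITION_2023, "EXPLICIT"),
]


def default_for(defaults, edition: Edition) -> str:
    """Returns the default from the latest entry at or before `edition`."""
    value = None
    for introduced, candidate in sorted(defaults):
        if introduced <= edition:
            value = candidate
    if value is None:
        raise ValueError(f"no default defined for {edition.name}")
    return value


def migration_override(defaults, source: Edition, target: Edition):
    """Value a migration tool would have to set explicitly, or None."""
    old, new = default_for(defaults, source), default_for(defaults, target)
    return old if old != new else None
```

Under these assumptions, migrating a proto3 file to edition 2023 needs an explicit `IMPLICIT` presence override, while a proto2 file needs none for this feature. A third-party feature owner would only have to supply a table like `FIELD_PRESENCE_DEFAULTS` for the same calculation to work.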
### Serialized Descriptors

As we now know, there are a lot of serialized `descriptor.proto` descriptor sets
out there that need to continue working for O(months). In order to avoid
blocking edition zero for that long, we may need fallbacks in protoc for the
case where feature resolution *fails*. If the file is proto2/proto3, failure
should result in a fallback to the existing hardcoded defaults. We can remove
these later once we're willing to break stale `descriptor.proto` snapshots that
predate the changes in this doc.

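The fallback could look roughly like the following Python sketch. The names `HARDCODED_LEGACY_DEFAULTS`, `FeatureResolutionError`, and `resolve_with_fallback` are hypothetical, the resolver is passed in as a plain callable, and only two features are listed for illustration.

```python
from typing import Callable, Dict

# Existing hardcoded defaults for legacy syntax (illustrative subset only).
HARDCODED_LEGACY_DEFAULTS: Dict[str, Dict[str, str]] = {
    "proto2": {"field_presence": "EXPLICIT", "enum_type": "CLOSED"},
    "proto3": {"field_presence": "IMPLICIT", "enum_type": "OPEN"},
}


class FeatureResolutionError(Exception):
    """Hypothetical error raised when feature defaults cannot be resolved."""


def resolve_with_fallback(
    syntax: str, resolve: Callable[[str], Dict[str, str]]
) -> Dict[str, str]:
    """Tries normal feature resolution, falling back only for legacy syntax."""
    try:
        return resolve(syntax)
    except FeatureResolutionError:
        if syntax in HARDCODED_LEGACY_DEFAULTS:
            # For example, a stale descriptor.proto snapshot without defaults.
            return dict(HARDCODED_LEGACY_DEFAULTS[syntax])
        raise  # Editions files have no hardcoded defaults to fall back to.
```

Keeping the fallback in a single wrapper like this would make it easy to delete once stale snapshots no longer need to work.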
### Bootstrapping

In order to get feature resolution running in proto2 and proto3, we need to be
able to support bootstrapped protos. For these builds, we can't use any
reflection without deadlocking, which means feature defaults can't be compiled
at runtime. We would have had to solve this problem anyway when it came time
to migrate these protos to editions, but this proposal forces our hand early.
Luckily, "Editions: Life of a FeatureSet" already set us up for this scenario,
and we have Blaze rules for embedding these defaults into code. For C++
specifically, the embedded defaults will need to be checked in alongside the
other bootstrapped protos. Other languages will be able to do this more
dynamically via genrules.

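To illustrate the idea of embedding defaults into code, here is a small Python sketch of a build-time step that renders already-computed defaults as source text, so that nothing has to be compiled through reflection at runtime. It only stands in for what a build rule or genrule might do. The function name `render_embedded_defaults`, the `EMBEDDED_DEFAULTS` constant, and the dictionary layout are assumptions made for this example.

```python
import pprint
from typing import Dict


def render_embedded_defaults(defaults: Dict[str, Dict[str, str]]) -> str:
    """Build-time step: renders the defaults as Python source text."""
    return (
        "# Generated at build time. Do not edit.\n"
        f"EMBEDDED_DEFAULTS = {pprint.pformat(defaults)}\n"
    )


# Illustrative values only; real defaults would come from the build rule.
source = render_embedded_defaults({
    "EDITION_PROTO2": {"field_presence": "EXPLICIT"},
    "EDITION_PROTO3": {"field_presence": "IMPLICIT"},
})
```

The resulting `source` string would be written to a generated file (or, for a bootstrapped build, checked in), and the runtime would import the constant directly.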
### Feature Inference

While we can calculate defaults using the same logic as in editions, actually
inferring "features" from proto2/proto3 needs some custom code. For example:

* The `required` keyword sets `LEGACY_REQUIRED` presence
* The `optional` keyword in proto3 sets `EXPLICIT` presence
* The `group` keyword implies `DELIMITED` encoding
* The `packed` option flips between `PACKED` and `EXPANDED` encoding

This logic needs to be written in code, and will need to be duplicated in every
language we support. Any language-specific feature transformations will also
need to be included in that language. To make this as portable as possible, we
will define dedicated functions for this inference. Each type of descriptor will
have its own set of transformations that should be applied to its features for
legacy editions.

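As a hedged sketch of what such an inference function might look like for fields, the Python below maps legacy keywords and options to feature overrides. The `Field` class and `infer_legacy_field_features` are hypothetical stand-ins, not the signatures from the design. The feature names follow the `FeatureSet` fields in `descriptor.proto`, and the sketch assumes repeated field encoding is driven by an explicit `packed` option.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Field:
    """Hypothetical, reduced stand-in for a field descriptor."""
    label: str = "optional"          # "optional", "required", or "repeated"
    type: str = "int32"              # e.g. "int32", "string", "group"
    proto3_optional: bool = False    # `optional` keyword used in proto3
    packed: Optional[bool] = None    # explicit `packed` option, if any


def infer_legacy_field_features(field: Field, syntax: str) -> Dict[str, str]:
    """Returns feature overrides implied by legacy keywords and options."""
    overrides: Dict[str, str] = {}
    if field.label == "required":
        overrides["field_presence"] = "LEGACY_REQUIRED"
    elif syntax == "proto3" and field.proto3_optional:
        overrides["field_presence"] = "EXPLICIT"
    if field.type == "group":
        overrides["message_encoding"] = "DELIMITED"
    if field.packed is not None:
        overrides["repeated_field_encoding"] = (
            "PACKED" if field.packed else "EXPANDED"
        )
    return overrides
```

The returned overrides would be merged on top of the `EDITION_PROTO2` or `EDITION_PROTO3` defaults. Other descriptor types (messages, enums, files) would get their own function in the same style.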
#### Pros

* Makes it clearer that proto2/proto3 are "like" editions

* Gives Prototiller a little more information in the transformation from
  proto2/proto3 to editions (not necessarily 2023)

* Allows proto2/proto3 defaults to be specified in a single location

* Makes unification of syntax/edition code easier to implement in runtimes

* Allows cross-language proto2/proto3 testing with the conformance framework
  mentioned in "Editions: Life of a FeatureSet"

#### Cons

* Adds special-case legacy editions, which may be somewhat confusing

* We will need to port feature inference logic across all languages, though
  this is arguably cheaper than maintaining branched proto2/proto3 code in all
  languages

## Considered Alternatives

### Do Nothing

If we do nothing, there will be no built-in unification of syntax and editions.
Runtimes could choose any point to split the logic.

#### Pros

* Requires no changes to editions code

#### Cons

* Likely results in lower test coverage
* May hide issues until we start rolling out edition 2023
* Prototiller would have to hardcode proto2/proto3 defaults of features it
  knows, and couldn't even try to migrate runtimes it doesn't know
