diff --git a/eeps/eep-0072.md b/eeps/eep-0072.md deleted file mode 100644 index 6868e0e..0000000 --- a/eeps/eep-0072.md +++ /dev/null @@ -1,251 +0,0 @@ - Author: Jesse Gumm - Status: Accepted - Type: Standards Track - Created: 08-Oct-2024 - Post-History: -**** -EEP 72: Reserved words and Variables as record names, and enhancement to definition syntax ----- - -Abstract -======== - -This EEP loosens some of the restrictions around record names to make it no -longer necessary to quote them when they are named with reserved words (`#if` -vs `#'if'`) or words with capitalized first characters (terms that currently -would be treated as variables, for example `#Hello` vs `#'Hello'`). - -This EEP also proposes to add a new record-like syntax to the record -definitions (also adopting the above syntactical changes), so that the -following record definitions would be valid and identical: - -```erlang --record('div', {a :: integer(), b :: integer()}). --record #div{a :: integer(), b :: integer()}. -``` - -The latter one is proposed new syntax. The following would also be valid and -identical since parentheses are optional in attributes, and since atoms may be -quoted even when not mandatory: - -```erlang --record 'div', {a :: integer(), b :: integer()}. --record #'div'{a :: integer(), b :: integer()}. --record(#'div'{a :: integer(), b :: integer()}). --record(#div{a :: integer(), b :: integer()}). -``` - -Usage Syntax Motivation -======================= - -Record names are atoms. As such, the current Erlang syntax requires the record -names to be consistent with the rest of the language's use of atoms. - -All atoms in Erlang can be denoted with single quotes. Some examples: - -For example: - -```erlang -'foo'. -'FOO'. -'foo-bar'. 
-``` - -But, conveniently, simple atoms (all alphanumeric, underscores (`_`) or -at symbols (`@`) with the first character being a lowercase letter and not one -of the 20+ reserved words), in all contexts can be invoked without the -necessary wrapping quotes. Some examples: - -```erlang -foo. -foo_Bar. -'foo-bar'. % still quoted since the term has a non-atomic character in it. -``` - -Conveniently, this also means that records named with simple atoms can be -invoked and used without having to quote the atoms. For example: - -```erlang --record(foo, {a, b}). --record(bar, {c}). - -go() -> - X = #foo{a = 1, b = 2}, - Y = X#foo{a = something_else}, - Z = #bar{c = Y#foo.a}, - ... -``` - -Unfortunately, that also means that records named with anything that doesn't -fit the "simple atom" pattern must be wrapped in quotes in definition and -usage. For example: - -```erlang --record('div', {a, b}). --record('SET', {c}). - -go() -> - X = #'div'{a = 1, b = 2}, - Y = X#'div'{a = something_else}, - Z = #'SET'{c = Y#'div'.a}, - ... -``` - -While this approach is consistent with atom usage in the language, for reserved -words and capitalized atoms, this makes the record syntax *feel* inconsistent if you have a need for -naming a record with a reserved word (or term with a capital first letter). In -this case, it almost guarantees a user won't use a record named 'if', -'receive', 'fun', etc even though there may very well be a valid use case for -such a name. The most common use case that comes to mind from the Nitrogen Web -Framework. Since HTML has a `div` tag, Nitrogen (which represents HTML tags -using Erlang records) should naturally have a `#div` record, however, due to -'div' being a reserved word (the integer division operator), the record `#panel` -is used instead to save the programmer from having to invoke `#'div'`, -which feels unnatural and awkward. 
- -Further, applications such as ASN.1 and Corba both have naming conventions that -rely heavily on uppercase record names and as such, they currently must be -quoted as well. You can see this in modules in Erlang's -[`asn1`](https://github.com/erlang/otp/blob/OTP-27.1.1/lib/asn1/src/asn1_records.hrl#L35-L39) -application. (The previous link points to some record definitions in `asn1`, -but you can see the usage scattered across a number of modules in the `asn1` -application). - -Usage Syntax Specification -========================== - -This EEP simplifies the above example by - -1. Allowing reserved words and variables to be used without quotes for record - names, and -2. Simplifying the definition such that the syntax between record definition - and record usage becomes more consistent. - -With the changes from this EEP, the above code becomes: - -```erlang --record('div', {a, b}). --record('SET', {c}). - -go() -> - X = #div{a = 1, b = 2}, - Y = X#div{a = something_else}, - Z = #SET{c = Y#div.a}, - ... -``` - -Definition Syntax Motivation -============================ - -While the updated example in the usage syntax specification makes the *using* -of records cleaner, there remains one more inconsistency that can also be -relatively easily solved. That is the record definition still also needing to -quote record name, as the example above demonstrates (repeated here for -convenience): - -```erlang --record('div', {a, b}). - -go() -> - X = #div{a = 1, b = 2}, - Y = X#div{a = something_else}, - Z = Y#div.a, - ... -``` - -So whereas the record definition needs to be thought of as `'div'`, the record -usage no longer requires the quoted term 'div', which could certainly lead an -Erlang beginner to wonder why 'div' needs to be quoted in the definition while -other atom-looking terms don't. 
- -Definition Syntax Specification -=============================== - -Conveniently, there is a rather easy solution, and that's to -allow the record usage syntax to also be used as the record definition. - -This EEP also then also adds a new record definition syntax, improving the -symmetry between general record usage and record definition. - -The above example can fully then look like the following: - -```erlang --record #div{a, b}. - -go() -> - X = #div{a = 1, b = 2}, - Y = X#div{a = something_else}, - Z = Y#div.a, - ... -``` - -Implementation -============== - -To update the syntax for using records, we can safely augment the parser to -change its already existing record handling of `'#' atom '{' ... '}'` and -`'#'atom '.' atom` into `'#' record_name '{' ... '}'` and -`'#' record_name '.' atom`, and define `record_name` to be `atom`, `var`, or -`reserved_word`. - -To update the record definition syntax, we can simply add a few new -modifications to the `attribute` Nonterminal to allow `'#' record_name` as name -for the `record` attribute, instead of `atom` as for generic attributes. - -Backwards Compatibility -======================= - -As this EEP only adds new syntax, the vast majority existing codebases will -still work, with the possible exception of AST/code analysis tools that are -analyzing code using the new syntax. - -Syntax highlighting and code completion tools may need to be updated to support -the new syntax if your code uses the new syntax rules. - -Broader Concerns and Points of Discussion -========================================= - -While the new definition syntax creates some degree of symmetry around record -usage, perfect symmetry is impossible to achieve, since a record can always -be handled as the atom tagged tuple it actually is. The question is where -to draw the line where the record's true nature shows, and how hard we -should try to hide it. 
These are remaining concerns and inconsistencies: - -Auxiliary Record Functions --------------------------- - -Other functions that work with records like `is_record/2` or `record_info/1` -are not currently covered by any of the syntactical changes in this EEP, and as -such, it remains necessary to quote record names if they are not simple atoms. -For example: `is_record(X, div)` would still be a syntax error. So there is -still not true 100% symmetry. Note that instead of using the -`is_record(X, 'div')` guard, matching on `#div{}` is probably more frequently -used, since it is terser and mostly regarded as more readable. - -Two Definition Syntaxes? ------------------------- - -This EEP introducing a new syntax for record definition could potentially lead -to some to wonder why the language has two rather different syntaxes for -defining records. Since usage of the syntax for getting, setting, matching, etc -(e.g. `#rec{a=x,y=b}`) occurs far more commonly than defining, it only feels -natural that the definition syntax would mirror usage. - -For more symmetry, the syntax in Erlang's type system to define records also -matches the newly proposed define syntax. - -Thus, I feel that sharing the existing usage and type syntax with the -definition system would likely become the default/preferred way, and that the -original syntax remain for backwards compatibility. - -Reference Implementation -======================== - -The reference implementation is provided in a form of pull request on GitHub - - https://github.com/erlang/otp/pull/7873 - -Copyright -========= - -This document has been placed in the public domain. 
diff --git a/eeps/eep-0074-1.png b/eeps/eep-0074-1.png new file mode 100644 index 0000000..2584e2d Binary files /dev/null and b/eeps/eep-0074-1.png differ diff --git a/eeps/eep-0074.md b/eeps/eep-0074.md new file mode 100644 index 0000000..d1f6575 --- /dev/null +++ b/eeps/eep-0074.md @@ -0,0 +1,249 @@ + Author: Roberto Aloi + Status: Draft + Type: Standards Track + Created: 11-Nov-2024 + Erlang-Version: OTP-28 + Post-History: +**** +EEP 74: Erlang Error Index +---- + +Abstract +======== + +The **Erlang Error Index** is a _catalogue_ of errors emitted by +various tools within the Erlang ecosystem, including - but not limited +to - the `erlc` Erlang compiler and the `dialyzer` type checker. + +The catalogue is not limited to tools shipped with Erlang/OTP, but it +can include third-party applications such as the [EqWAlizer][] +type-checker or the [Elvis][] code style reviewer. + +Each error in the catalogue is identified by a **unique code** +and it is accompanied by a description, examples and possible courses +of action. Error codes are _namespaced_ based on the tool that +generates them. Unique codes can be associated to a human-readable +**alias**. + +Unique error codes can be leveraged by IDEs and language servers to +provide better contextual information about errors and make errors +easier to search and reference. A standardized error index creates a +common space for the Community to provide extra examples and +documentation, creating the perfect companion for the Erlang User +Manual and standard documentation. + +Rationale +========= + +The concept of an "Error Index" for a programming language is not a +novel idea. Error catalogues already exist, for example, in the +[Rust][] and [Haskell][] Communities. + +Producing meaningful error messages can sometimes be challenging for +developer tools such as compilers and type checkers due to various +constraints, including limited context and character count. 
+ +By associating a **unique code** with each _diagnostic_ (warning or +error) we relieve tools from having to condense a lot of textual +information into a generic, and sometimes cryptic, single +sentence. Furthermore, as the specific wording of errors and warnings is +improved over time, error codes remain constant, providing a +search-engine friendly way to index and reference diagnostics. + +A good example of this is the _expression updates a literal_ warning +message, introduced in OTP 27. Given the following code: + + -define(DEFAULT, #{timeout => 5000}). + + updated(Value) -> + ?DEFAULT#{timeout => Value}. + +The compiler emits the following warning: + + test.erl:8:11: Warning: expression updates a literal + % 8| ?DEFAULT#{timeout => 1000}. + % | ^ + +The meaning of the warning may not be obvious to everyone. Most +importantly, the compiler provides no information on why the warning is +raised and what a user could do about it. The user will then have to +resort to a search engine, a forum or equivalent to proceed. + +Conversely, we can associate a unique identifier with the diagnostic (say, +`ERL-1234`): + + test.erl:8:11: Warning: expression updates a literal (ERL-1234) + % 8| ?DEFAULT#{timeout => 1000}. + % | ^ + +The code makes it possible to link the message to an external +resource (e.g. a wiki page), which contains all the additional +information about the diagnostic that would not be practical to +present directly to the user. Here is an example of what the entry +could look like for the above code: + +![Erlang Error Index Sample Entry][] + +Unique error codes also have the advantage of being easier to search for in +forums and chats, where the exact error message could vary, but the +error code would be the same. + +Finally, error codes can be used by IDEs (e.g. via language servers) +to match on specific diagnostics and provide contextual help. Both the [Erlang +LS][] and the [ELP][] language servers already use "unofficial" error +codes.
+ +Emitting Diagnostics +-------------------- + +To make integration easier for language servers and IDEs, tools should +emit diagnostics (errors and warnings) in a +standardized format. In the case of the compiler, this could be done +by specifying an extra option (e.g. `--error-format json`). + +A possible JSON format, heavily inspired by the [LSP protocol][], is: + +```json +{ + "uri": "file:///git/erlang/project/app/src/file.erl", + "range": { + "start": { + "line": 5, + "character": 23 + }, + "end": { + "line": 5, + "character": 32 + } + }, + "severity": "warning", + "code": "DIA-1234", + "doc_uri": "https://errors.erlang.org/DIA/DIA-1234", + "source": "dialyzer", + "message": "This is a descriptive error message from Dialyzer" +} +``` + +Where: + +* **uri**: The path of the file the diagnostic refers to, expressed using the [RFC-3986][] format +* **range**: The range at which the message applies, zero-based. The range should be as narrow as possible. For example, when warning +the user that a record is unused, the range of the diagnostic should +only cover the name of the record and not the entire definition. This +minimizes the distraction for the user when, for example, rendered as +a squiggly line, while conveying the same information. +* **severity**: The diagnostic's severity. Allowed values are `error`, `warning`, `information`, `hint`. +* **code**: A unique error code identifying the diagnostic +* **doc_uri**: A URI pointing to more information about the diagnostic +* **source**: A human-readable string describing the source of the diagnostic +* **message**: A short, textual description of the diagnostic. The message should be general enough to make sense in isolation. + +Error Code Format +----------------- + +An error code should be composed of two parts: an alphanumeric +_namespace_ (three letters) and a numeric identifier (four digits), +separated by a dash (`-`).
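
As an illustration only, the code format described above could be validated with a few lines of Erlang. The module name `error_code` and the functions `parse/1` and `format/2` are hypothetical; they are not part of Erlang/OTP or of this proposal. The sketch also assumes that namespaces are restricted to uppercase ASCII letters, a detail the text above leaves open.

```erlang
-module(error_code).
-export([parse/1, format/2]).

%% Hypothetical helper, not part of Erlang/OTP.
%% Parses codes shaped like <<"ERL-0123">>: three letters, a dash,
%% four digits. Assumption: namespace letters are uppercase ASCII.
-spec parse(binary()) -> {ok, {binary(), non_neg_integer()}} | error.
parse(<<Ns:3/binary, $-, Digits:4/binary>>) ->
    case all_in_range(Ns, $A, $Z) andalso all_in_range(Digits, $0, $9) of
        true -> {ok, {Ns, binary_to_integer(Digits)}};
        false -> error
    end;
parse(_) ->
    %% Wrong length or missing dash.
    error.

%% Inverse of parse/1: zero-pads the identifier to four digits.
-spec format(binary(), 0..9999) -> binary().
format(<<_:3/binary>> = Ns, N) when is_integer(N), N >= 0, N =< 9999 ->
    iolist_to_binary(io_lib:format("~s-~4..0B", [Ns, N])).

%% True when every byte of Bin lies in the inclusive range Lo..Hi.
all_in_range(Bin, Lo, Hi) ->
    lists:all(fun(C) -> C >= Lo andalso C =< Hi end, binary_to_list(Bin)).
```

Under these assumptions, `error_code:parse(<<"ERL-0123">>)` yields `{ok, {<<"ERL">>, 123}}`, while a three-digit code such as `<<"ERL-123">>` is rejected. Keeping the namespace and the number as separate values makes it straightforward for a tool to group diagnostics by namespace.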
+ +A potential set of namespaces could look like the following: + +| Namespace | Description | +|-----------|-----------------------------------------------------------------| +| ERL | The Erlang compiler and related tools (linter, parser, scanner) | +| DIA | The Dialyzer type-checker | +| ELV | The Elvis code-style reviewer | +| ELP | The Erlang Language Platform | +| ... | ... | + +A set of potential error codes could look like: + + ERL-0123 + DIA-0009 + ELV-0015 + ELP-0001 + +The exact number of characters/digits for each namespace and code is +open for discussion, as is the question of whether components such as +the parser, the scanner or the `erl_lint` Erlang linter should have +their own namespace. + +Responsibilities +---------------- + +The Erlang/OTP team would be ultimately responsible for maintaining a +list of _official_ namespaces. Each tool maintainer would then be +responsible for allocating specific codes to specific diagnostics. + +Processes +--------- + +The error index can be implemented as a set of Markdown pages. The +approval process for a namespace (or an error code) will follow a +regular flow using a Pull Request, reviewed and approved by the +Erlang/OTP team and, potentially, other interested industrial members. + +Error codes cannot be re-used. If a tool stops emitting an error code, the +_deprecated_ error code is still documented in the index, together +with a deprecation notice. This is to avoid re-using a single code for +multiple purposes. + +To limit the administration burden, the section will contain only +error codes for the tools shipped with Erlang/OTP and the namespaces +for external tools. Individual error codes for each namespace would be +managed by the respective owners. + +Reference Implementation +------------------------ + +The [ELP website][] contains a proof of concept of what an Erlang +Error Index could look like. Ideally, such a website would live under +the `erlang.org` domain, e.g.
using the `https://errors.erlang.org/` URL. + +The website should use _Markdown_ as the primary mechanism to write +content and it should be easily extensible by the Community. + +Copyright +========= + +This document is placed in the public domain or under the CC0-1.0-Universal +license, whichever is more permissive. + +[EqWAlizer]: https://github.com/whatsapp/eqwalizer + "The EqWAlizer Type Checker" + +[Elvis]: https://github.com/inaka/elvis + "The Elvis Style Reviewer" + +[Rust]: https://doc.rust-lang.org/error_codes/error-index.html + "The Rust Error Index" + +[Haskell]: https://errors.haskell.org + "The Haskell Error Index" + +[Erlang Error Index Sample Entry]: eep-0074-1.png + "Erlang Error Index Sample Entry" + +[Erlang LS]: https://github.com/erlang-ls/erlang_ls/blob/a4a12001e36b26343d1e9d57a0de0526d90480f2/apps/els_lsp/src/els_compiler_diagnostics.erl#L237 + "Erlang LS using error codes" + +[ELP]: https://github.com/WhatsApp/erlang-language-platform/blob/99a426772be274f3739116736bb22d4c98c123c4/erlang_service/src/erlang_service.erl#L608 + "ELP using error codes" + +[ELP Website]: https://whatsapp.github.io/erlang-language-platform/docs/erlang-error-index/ + "ELP website" + +[LSP Protocol]: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#diagnostic + +[RFC-3986]: https://datatracker.ietf.org/doc/html/rfc3986 + +[EmacsVar]: <> "Local Variables:" +[EmacsVar]: <> "mode: indented-text" +[EmacsVar]: <> "indent-tabs-mode: nil" +[EmacsVar]: <> "sentence-end-double-space: t" +[EmacsVar]: <> "fill-column: 70" +[EmacsVar]: <> "coding: utf-8" +[EmacsVar]: <> "End:" +[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "