Conversation

@gabriel-fallen
Contributor

Closes #1146

@gabriel-fallen gabriel-fallen added this to the Ownership transfer milestone Dec 2, 2025
@gabriel-fallen gabriel-fallen self-assigned this Dec 2, 2025
@gabriel-fallen gabriel-fallen added the scope: tvm /tvm folder: TON Virtual Machine label Dec 2, 2025
@gabriel-fallen gabriel-fallen changed the title Explain the basics of instruction encoding and notation fix: Explain the basics of instruction encoding and notation Dec 2, 2025

@github-actions github-actions bot left a comment

Thanks for the updates to tvm/instructions.mdx: I’ve left several suggestions to align tone and phrasing with the style guide, so please apply the inline suggestions.

mode: "wide"
---

The notation employed in the table below, as well as general information on the binary encoding of instructions pertaining to opcodes and immediate arguments, is explained in the [Notation](#notation) section following the table.

[HIGH] Throat-clearing introductory sentence before table

Line 7 introduces the section by describing the document structure (“the table below” and “section following the table”) rather than conveying TVM semantics or notation directly. This matches the style guide’s prohibition on throat-clearing and meta-commentary that talks about the doc instead of delivering value. The sentence can be rewritten to state what the notation explains without referencing layout or navigation. This keeps the introduction focused on TVM instruction encoding semantics rather than page structure.

Suggested change
The notation employed in the table below, as well as general information on the binary encoding of instructions pertaining to opcodes and immediate arguments, is explained in the [Notation](#notation) section following the table.
The [notation](#notation) section explains how the instruction table encodes TVM instruction opcodes and immediate arguments in binary.


## Notation

TVM instructions are encoded by sequences of bits, not all the same length, but a multiple of a byte in total. The immediate arguments essentially form a part of the instruction and have no special demarcation in a bitstream. This leads to some instructions sharing the same opcode _prefix_. For instance, the `NOP` instruction has the (full) opcode of `0x00`, meaning the null byte, eight consecutive zero bits. At the same time, the `XCHG_0I` family of instructions starts with `0x0`, four consecutive zero bits, and continues with a four-bit immediate argument ranging from `0x1` to `0xF`.

[HIGH] Use of hedge word “essentially”

Line 15 contains the phrase “The immediate arguments essentially form a part of the instruction,” which introduces the hedge word “essentially.” The extended style guide explicitly calls out hedges and intensifiers like this as [HIGH]-severity issues that should be replaced with concrete, factual statements. Removing “essentially” leaves a precise factual claim about how immediate arguments participate in instruction encoding. This improves clarity and aligns with the rule to avoid unnecessary softening language.

Suggested change
TVM instructions are encoded by sequences of bits, not all the same length, but a multiple of a byte in total. The immediate arguments essentially form a part of the instruction and have no special demarcation in a bitstream. This leads to some instructions sharing the same opcode _prefix_. For instance, the `NOP` instruction has the (full) opcode of `0x00`, meaning the null byte, eight consecutive zero bits. At the same time, the `XCHG_0I` family of instructions starts with `0x0`, four consecutive zero bits, and continues with a four-bit immediate argument ranging from `0x1` to `0xF`.
TVM instructions are encoded by sequences of bits, not all the same length, but a multiple of a byte in total. The immediate arguments form a part of the instruction and have no special demarcation in a bitstream. This leads to some instructions sharing the same opcode _prefix_. For instance, the `NOP` instruction has the (full) opcode of `0x00`, meaning the null byte, eight consecutive zero bits. At the same time, the `XCHG_0I` family of instructions starts with `0x0`, four consecutive zero bits, and continues with a four-bit immediate argument ranging from `0x1` to `0xF`.
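
To make the shared-prefix point concrete, here is a minimal Python sketch that classifies a single byte using only the two encodings named in the paragraph (`NOP` as `0x00`, and `XCHG_0I` as `0x0` followed by a four-bit immediate from `0x1` to `0xF`). The function name `classify_first_byte` and the `"other"` fallback are hypothetical, introduced for illustration; this is not a TVM decoder.

```python
def classify_first_byte(byte: int) -> str:
    """Classify one byte using only the NOP / XCHG_0I encodings described above.

    Hypothetical helper for illustration; every other opcode is reported as "other".
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte (0..255)")

    high = byte >> 4      # first four bits: the shared opcode prefix
    low = byte & 0x0F     # last four bits: rest of the opcode or the immediate

    if high != 0x0:
        return "other"    # outside the scope of this sketch
    if low == 0x0:
        return "NOP"      # full opcode 0x00: eight zero bits
    return f"XCHG_0I i={low}"  # prefix 0x0 plus immediate 0x1..0xF


if __name__ == "__main__":
    for b in (0x00, 0x03, 0x0F, 0x10):
        print(f"0x{b:02X} -> {classify_first_byte(b)}")
```

Both instructions begin with the same four zero bits, so only the following four bits tell them apart; nothing in the bitstream marks where the opcode ends and the immediate begins.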


The `s[i]` notation refers to the `i`-th stack slot counting from the top, with the top being the 0th slot. Particular stack slots are referenced directly as `s0`, `s1` and so forth in TASM, FIFT and documentation, and are encoded simply by index in the binary.
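
As a rough illustration of the `s[i]` indexing, the following Python sketch models the stack as a list whose last element is the top. The list representation and the helper name `slot` are assumptions made for this example only, not how the TVM stores its stack.

```python
def slot(stack: list, i: int):
    """Return s[i]: the i-th slot counting from the top, where s0 is the top.

    Assumes (for illustration) that the top of the stack is the last list element.
    """
    if not 0 <= i < len(stack):
        raise IndexError(f"s{i} does not exist on a stack of depth {len(stack)}")
    return stack[-1 - i]


if __name__ == "__main__":
    stack = [10, 20, 30]   # 30 was pushed last, so it is the top
    print(slot(stack, 0))  # s0
    print(slot(stack, 2))  # s2
```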

The `[32(c+1)] PLDUZ` notation means you need to pick a value for `c`, make the calculation and substitute the result. Say, you pick the value `2`, then you write the instruction as `96 PLDUZ` in FIFT. The reason is that `96` is the actual number of bits to read, while in the bitstream only the value for `c` is stored, while the TVM performs the calculation on its own.

[HIGH] Direct address to the reader with “you”

Line 21 explains [32(c+1)] PLDUZ using direct second-person address (“means you need to pick a value for c … then you write the instruction”), which violates the style rule against “you/your” for the reader and “we/I/our” for the author. The style guide marks this as a [HIGH]-severity issue and recommends neutral or impersonal phrasing instead. Rewriting the explanation to describe the procedure without addressing the reader directly keeps the content instructional while conforming to tone guidelines. This also makes the description more suitable for reference documentation.

Suggested change
The `[32(c+1)] PLDUZ` notation means you need to pick a value for `c`, make the calculation and substitute the result. Say, you pick the value `2`, then you write the instruction as `96 PLDUZ` in FIFT. The reason is that `96` is the actual number of bits to read, while in the bitstream only the value for `c` is stored, while the TVM performs the calculation on its own.
The `[32(c+1)] PLDUZ` notation means a value for `c` is chosen, the calculation is performed, and the result is substituted. For example, with `c = 2`, the instruction is written as `96 PLDUZ` in FIFT. The value `96` is the actual number of bits to read, while the bitstream stores only the value for `c`, and the TVM performs the calculation on its own.
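
The arithmetic behind `[32(c+1)] PLDUZ` can be sketched in a few lines of Python. The helper names `pluduz_operand` and `pluduz_stored_c` are hypothetical, and the sketch deliberately does not check any upper bound on `c`, since the paragraph does not state one.

```python
def pluduz_operand(c: int) -> int:
    """Bit count written in the source form (e.g. `96 PLDUZ`) for a stored value c."""
    if c < 0:
        raise ValueError("c must be non-negative")
    return 32 * (c + 1)


def pluduz_stored_c(bits: int) -> int:
    """Inverse direction: the value of c kept in the bitstream for a given bit count."""
    if bits <= 0 or bits % 32 != 0:
        raise ValueError("bit count must be a positive multiple of 32")
    return bits // 32 - 1


if __name__ == "__main__":
    print(pluduz_operand(2))     # operand written in the source form
    print(pluduz_stored_c(96))   # value stored in the bitstream
```

With `c = 2` the written operand is `32 * (2 + 1) = 96`, matching the `96 PLDUZ` example, while only `2` is stored in the encoded instruction.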

Development

Successfully merging this pull request may close these issues.

[TVM > Instructions] Explain notation
