Skip to content

Are we inserting yk locations at the right places? #109

@vext01

Description

@vext01

Last week I was making changes to yklua and noticed something strange to do with our yk locations. I think we are inserting them in the wrong places!

This may be part of the reason we see strange performance and/or "trace too long", but I haven't investigated further (yet).

Diff to yklua:

diff --git a/src/lcode.c b/src/lcode.c
index 6df4430..f3b402c 100644
--- a/src/lcode.c
+++ b/src/lcode.c
@@ -374,6 +374,7 @@ static void removelastinstruction (FuncState *fs) {
   fs->pc--;
 }

+#include <stdio.h>

 /*
 ** Emit instruction 'i', checking for array sizes and saving also its
@@ -391,10 +392,16 @@ int luaK_code (FuncState *fs, Instruction i) {
   luaM_growvector(fs->ls->L, f->yklocs, pc, f->sizeyklocs, YkLocation,
                   MAX_INT, "yklocs");
   f->yklocs[pc] = yk_location_null();
-  if ((GET_OPCODE(i) == OP_JMP) && (GETARG_sJ(i) < 0))
+  if ((GET_OPCODE(i) == OP_JMP) && (GETARG_sJ(i) < 0)) {
     f->yklocs[pc + GETARG_sJ(i)] = yk_location_new();
-  if (GET_OPCODE(i) == OP_FORLOOP)
+    // +1 so it matches luac output (1-based).
+    fprintf(stderr, "loop start (jmp) at %d (discovered via %d, sJ=%d)\n", pc + GETARG_sJ(i) + 1, pc + 1, GETARG_sJ(i));
+  }
+  if (GET_OPCODE(i) == OP_FORLOOP) {
     f->yklocs[pc - GETARG_Bx(i) - 2] = yk_location_new();
+    // +1 so it matches luac output (1-based).
+    fprintf(stderr, "loop start (forloop) at %d (discovered via %d, Bx=%d)\n", pc - GETARG_Bx(i) - 2 + 1, pc + 1, GETARG_Bx(i));
+  }
 #endif
   savelineinfo(fs, f, fs->ls->lastline);
   return pc;  /* index of new instruction */

^ This just prints:

  • the pc of places we deem loop starts
  • the pc of the bytecode that lead us to conclude that it's a loop start
  • the lua bytecode instruction argument of interest

for loops

for_loop.lua:

for i = 0, 3 do
    print(i)
end

Then if we run luac we see the output of the above diff and theh disassembled bytcodes:

./src/luac -l for_loop.lua
loop start (forloop) at 7 (discovered via 9, Bx=0)

main <for_loop.lua:0,0> (10 instructions at 0x111b31d0)
0+ params, 6 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     LOADI           0 0
        3       [1]     LOADI           1 3
        4       [1]     LOADI           2 1
        5       [1]     FORPREP         0 3     ; exit to 10
        6       [2]     GETTABUP        4 0 0   ; _ENV "print"
        7       [2]     MOVE            5 3
        8       [2]     CALL            4 2 1   ; 1 in 0 out
        9       [1]     FORLOOP         0 4     ; to 6
        10      [3]     RETURN          0 1 1   ; 0 out

The lua disassembler says the loop start is at pc=6, but we put it at pc=7. Note the jump operand is zero.

while loops

while loops are worse.

while_loop.lua:

local i  = 3
while i ~= 0 do
    i = i - 1
    print(i)
end
$ ./src/luac -l while_loop.lua
loop start (jmp) at 3 (discovered via 4, sJ=-1)
loop start (jmp) at 9 (discovered via 10, sJ=-1)

main <while_loop.lua:0,0> (11 instructions at 0x240211d0)
0+ params, 3 slots, 1 upvalue, 1 local, 1 constant, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     LOADI           0 3
        3       [2]     EQI             0 0 1
        4       [2]     JMP             6       ; to 11
        5       [3]     ADDI            0 0 -1
        6       [3]     MMBINI          0 1 7 0 ; __sub
        7       [4]     GETTABUP        1 0 0   ; _ENV "print"
        8       [4]     MOVE            2 0
        9       [4]     CALL            1 2 1   ; 1 in 0 out
        10      [4]     JMP             -8      ; to 3
        11      [5]     RETURN          1 1 1   ; 0 out

Here we detect both JMPs as loop starts, even though only one is a backward jump. By luck we correctly end up marking pc=3 as a loop start, but we also deem pc=9 as a loop start, which isn't right.

The bytecode operand is -1 for both jumps.

A theory I will investigate is that the operands are not fully "initialised" at this point in the system and the -1 (and maybe the 0 for for loops above) is a placeholder to be computed later. The solution may be to move the creation of the yk locations later into (e.g.) close_func() in the parser.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions