-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse bug: sometimes fails on valid code (triggers assertion failure in coder.lua) #508
Comments
Spooky! This assertion failure is happening inside the code generator (coder.lua), not in the parser. Certainly a bug. Thanks for bug report. Based on the line number, it might be related to the managing of upvalues. |
I have additional information. I discovered that invoking the Pallene compiler with the -O0 option avoids hitting this failure with the previously attached example.
Since the error seems to occur before the C compiler gets involved, this makes me believe Pallene has different code paths for different optimization levels, and there is a bug in the optimized code generator. |
Based on your tip about the The purpose of the constant propagation pass is to optimize code that has constant toplevel variables. For example, something like this:
is converted into something equivalent to this:
In the process, the toplevel constant variable is deleted and the internal variable IDs are renumbered accordingly. I expect the bug might be one of these: If this bug is where I think it is, I have good reason to believe that we should be able to reproduce it with a minimal Pallene program. Once we do that, the bug will reveal itself to us. |
Looks like this gets the same problem: local m: module = {}
local N = 42
local function f()
end
local function g()
local x = N
f()
end
return m |
Good news, I think I have figured this one out. Looks like the problem was (c). The f_id_of_upvalue list isn't being updated. I'll see what I can do to fix that. |
Fixes #508. In the long term, we should look for ways to get rid of the f_id_of_upvalue altogether. If we had an analysis pass that computed it, we wouldn't need to maintain and update it across other compilation passes.
Fixes #508. In the long term, we should look for ways to get rid of the f_id_of_upvalue altogether. If we had an analysis pass that computed it, we wouldn't need to maintain and update it across other compilation passes.
Leaving this open for now, until we get confirmation from @ewmailing. |
Sorry I am slow at responding. I did a spot check compile with a few of my files that I think were affected by this and they seem to compile now. So I think this is resolved. When I had a bit more time, I was going to chime in and ask you more about the feature issue about . I was surprised to learn you had some form of constant propagation already separate from support. But I had inspected some of the generated C code from Pallene awhile ago and saw some areas where I think support could ultimately produce more optmized code. I've been planning to ask you about going modifying the parser (once I find more time). |
Nice! Good to know that thing compiles on your end too! Would you mind telling a bit more about what parser things you have in mind? :) Over here the next thing I am working on is making pallene installable via Luarocks. That should finally get rid of that long standing problem where the compiler only works if you call it from the root directory of the repository... I'm also starting to think about introducing SSA to the intermediate representation. It would help implement more powerful optimizations, including smarter constant propagation. About constant propagation: currently, the idea is we only propagate toplevel constants, so in the inner functions they become local vars instead of upvalues. We don't do any propagation inside each function after this. The idea is to let gcc do that. It works well for numeric constants, but not so well for strings and other Lua things. |
Nice to hear about removing the root directly problem. I hope this solution will also allow for embedded use. Sorry, I just realized that Github's text system is dropping things with angle brackets if you don't escape. I may have written So my original thinking (and not knowing that you had this other feature) was that I tend to declare global constants at the top of the file, and use them throughout functions in the rest of the file. Consider this basic hypothetical example:
I was not using So if we just focus on primitive values, I was thinking that the C generator could emit something like:
Then the rest of the emitted C code that uses TAU could remain roughly the same presuming it just directly refers to the variable TAU (or whatever Pallene renamed the TAU variable to). Then the C compiler should be able to optimize the |
The version I'm working with right now is geared towards installing with Luarocks. It makes some assumptions about that, for example the Pallene runtime library is installed to the LUA_PATH and the Pallene modules "require" it at startup. However, the good news is that I think it will be easier to port it to other settings. One of the other things I did is that we no longer will need to pass special "-I" flags when compiling the generated C code.
One thing that might be confusing is that our constant folding only substitutes the uses of the constant. It doesn't touch the definition of the constant, which becomes dead code (#530). That might give the impression the constant folding didn't work, even if it did. By the way, for this particular question it might be easier to look at the
Unfortunately, our constant folding is too dumb and only works if the constant is exactly a constant literal (#531). Right now, a multiplication is too clever for it. The blocker for fixing this is that we will need to enrich the IR internals to allow non-trivial optimizations. I'm more motivated than ever to do that, specially after having to fix this bug. It was something that could have been avoided if we had an optimization pass for the CallStatic... Anyway, once we have the infrastructure, implementing "proper" constant folding will become low-hanging fruit. |
I stumbled on a case where it appears the parser sometimes fails depending on where I place a particular (valid) line of code.
Attached is a reproducible test case. There are two files, parsebug_good.pln and parsebug_bad.pln.
The only difference between the two is where the following line is placed in the code:
local macd_plot = m.NewAccessoryPlot(backTest, "MACD", 1280, 250)
When placed around line 112 (parsebug_good.pln), Pallene compiles this code just fine.
But if the line is moved to around line 122 (parsebug_bad.pln), Pallene produces the following assertion error:
Using --emit-lua on parsebug_bad.lua does not produce any error, so it seems to be just the C generation side.
I don't have a good sense of why this is happening. I originally encountered it by just starting to add some lines to my program. When I started ripping things out to try to isolate it, the problem would go away. Then starting to add more changes back to my program, the problem also went away. So the attached test program lives in some middle realm where I have managed to strip a lot of things out, but have a reliable reproducible good and bad case.
parsebug.tar.gz
The text was updated successfully, but these errors were encountered: