Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the ghc api to get imports #32

Merged
merged 19 commits into from
May 18, 2022
Merged

Use the ghc api to get imports #32

merged 19 commits into from
May 18, 2022

Conversation

googleson78
Copy link
Contributor

@googleson78 googleson78 commented May 12, 2022

This replaces our own hand rolled parser.

Fixes #31

This replaces our own hand rolled parser.
We no longer use it!
-- TODO[GL]: propagate error instead
Left err -> do
GHC.printBagOfErrors dynFlags err
pure Nothing
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should use error here instead, as was previously done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an error would be preferable. If we don't assume bugs in our code, it would indicate something to fix in the source file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@googleson78
Copy link
Contributor Author

This will introduce more work over at #25

Copy link
Member

@facundominguez facundominguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @googleson78! Looks pretty good. I left some comments.


let
-- [GL] The fact that the resulting strings here contain the "-X"s makes me a bit doubtful that this is the right approach,
-- but this is what I found for now.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usesTH is an heuristic that offers the user a convenient way to show that TH is needed, not an exact detection solution. 🤷

-- TODO[GL]: propagate error instead
Left err -> do
GHC.printBagOfErrors dynFlags err
pure Nothing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an error would be preferable. If we don't assume bugs in our code, it would indicate something to fix in the source file.

map GHC.unLoc $
GHC.getOptions dynFlags sb filePath
-- TODO: should both of the files here be filePath?
GHC.getImports dynFlags sb filePath filePath >>= \case
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getImports can throw exceptions. There are functions to get those formatted too, but perhaps it is fine to have them kill the program without further ado.

We would crash later anyways, because the go part can't parse the
message we produce there.

This is justified because if the ghc parser fails something is very
wrong.
It's fine to pass both filePath as both arguments there because
they provide enough info and we aren't that worried about formatting of
error messages.
haskell_toolchain_library(name = "directory")

haskell_toolchain_library(name = "ghc")

# Only necessary to be able to import GHC.Platform in HImportScan.GHC.Settings
# TODO[GL]: we can probably remove this once we only support ghc >=9.2
haskell_toolchain_library(name = "ghc-boot")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is redundant now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it isn't - it seems I introduced another dependency on this package without knowing. Updated the comment.

@facundominguez
Copy link
Member

In some closed source file I get this error,

gazelle: himportscan: Illegal keyword 'type' (use ExplicitNamespaces to enable)

Probably we will need to enable every extension that affects the parsing of the import/export lists (PatternSynonyms comes to mind too).

I tested with and without runGhc too. There is a difference in performance, where the version with runGhc is 1.8 times slower. I think it would be worth reverting to the hand made dflags.

@facundominguez
Copy link
Member

Also, note that when we get the error message, there is no indication of the file that produced the failure. We should have himportscan inform that or the user of gazelle won't have a clue.

@facundominguez
Copy link
Member

I contributed some commits unrelated to my last comments. Let me know if you'd like any of those changed.

@@ -98,7 +99,7 @@ scanImports dynFlags filePath contents = do
-- This is far from ideal, however handling this better would require being able to communicate errors better to go.
Left err -> do
GHC.printBagOfErrors dynFlagsWithExtensions err
error "ghc parsing failed"
throwIO (GHC.mkSrcErr err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this commit fix the issue where users don't know what parsing failed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't. I tested this change before writing about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, after my change which splits stderr and stdout, we get the parse error that ghc reports, e.g.



/home/googleson78/git/gazelle_haskell_modules/example/package-a/app/Main.hs:9:59: error:
    parse error on input `PatTrue'
  |
9 | import PackageA.Other.C(type DataC, DataC(DataC), pattern PatTrue)
  |                                                           ^^^^^^^
himportscan: parse error on input `PatTrue'


gazelle: himportscan exited unsuccessfully exit status 1

Is this good enough, or should we explicitly report the file name separately?

@googleson78
Copy link
Contributor Author

How big is the 1.8 times difference considering all the other processing and generating going on. Is it 1.8 for the whole process, or only for the himportscan part? In other words, should I revert to the fake DynFlags solution?

@facundominguez
Copy link
Member

Is it 1.8 for the whole process, or only for the himportscan part?

It is 1.8 for the whole process.

In other words, should I revert to the fake DynFlags solution?

I think this would be preferable.

This turns out to be much faster (close to 2 times on real world
project).
@googleson78
Copy link
Contributor Author

I'm very surprised, but currently (or in general?) getImports seems to be totally ignoring "parse (lex?) errors" - for example, if I change import to mport in a file, gazelle still runs "successfully" (implying that himportscan does so as well). It just skips the files that have lex/parse errors in them, even though we should currently be throwing in the Left case of getImports. It seems it's never reached.

I need to investigate more.

@googleson78
Copy link
Contributor Author

I have confirmed that it returns a Right, and I have also formed a hypothesis on why:

Even if we have a typo, e.g. port or imort, this does not exclude port and imort from being some CPP macros or TH top level splices. Hence, it makes sense for getImports to not report an error.

However, I also checked the behaviour with the current implementation, and it (sensibly) behaves the same way. So I think that this might not be a regression at all (unlike I was previously assuming) and we can safely "ignore" it.

@googleson78
Copy link
Contributor Author

I added PatternSynonyms, ExplicitNamespaces and MagicHash. Do any others come to mind?

@googleson78 googleson78 force-pushed the 31_ghc-api-imports branch from d941324 to eaefe78 Compare May 18, 2022 09:13
@facundominguez
Copy link
Member

Do any others come to mind?

They don't. But I think they will be easy to add as we discover them.

Thanks for your insights and your patches.

@facundominguez facundominguez merged commit 1578ad5 into main May 18, 2022
@facundominguez facundominguez deleted the 31_ghc-api-imports branch May 18, 2022 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parse imports using the GHC api
2 participants