Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle long long type in TBAA #1241

Closed
PikachuHyA opened this issue Dec 19, 2024 · 8 comments
Closed

handle long long type in TBAA #1241

PikachuHyA opened this issue Dec 19, 2024 · 8 comments

Comments

@PikachuHyA
Copy link
Collaborator

The #1220 adds support for scalar types. However, it does not handle the long long type correctly; instead, long long is treated as long. Specifically, long long is represented as s64i in CIR, which is then mapped to the long type.

@Lancern
Copy link
Member

Lancern commented Dec 19, 2024

The problem here is that CIR integer types do not have a 1:1 correspondence to C/C++ integer types, so you could not tell whether a s64i come from long or long long or std::int64_t. On contrary, CIR floating point types have a 1:1 correspondence to C/C++ floating point types. long double would be lowered to !cir.long_double instead of !cir.double or !cir.fp80 depending on the target. Maybe we should do the same for integer types?

@PikachuHyA
Copy link
Collaborator Author

Maybe we should do the same for integer types?

That sounds like a great idea! I will attempt to add !cir.long_long<underlying>, similar to this:

def CIR_LongLong : CIR_Type<"LongLong", "long_long"> {
  let parameters = (ins "mlir::Type":$underlying);

  let assemblyFormat = [{
    `<` $underlying `>`
  }];
}

@bcardosolopes
Copy link
Member

bcardosolopes commented Dec 19, 2024

This is a bit more tricky than it sounds, it would be odd to have cir.long_long but not do the same for all other integer types (e.g. on some architectures int can be 2 bytes and would suffer from a similar problem).

I do agree FP is doing the more full and nicer path and it'd be nice to have it for integers (makes sense to better map the source info), but is the type indirection worth it? Seems like the only use so far is for handling TBAA - I'm curious if there are solutions where we can solve this without introducing new types?

(cc @dkolsen-pgi @sitio-couto)

@dkolsen-pgi
Copy link
Collaborator

Floating-point types in ClangIR do not have a 1-to-1 mapping to formats. Or they won't once extended floating-point support is added to Clang. double, _Float64, and _Float32x will all be distinct types in C++, but they will all map to the same !cir.double type. I am not thrilled with long double being a special kind of floating-point type that has an underlying type.

Doing something special for just the type long long is not the correct thing. If there is just one integral type that should be handled specially, it is long. long long is 64-bits on all supported platforms. char, short, and int are 8, 16, and 32 bits on almost all supported platforms. It is long that varies between 32 bits (Windows) and 64 bits (MacOS and Linux) on commonly used platforms.

If we need to preserve the original C++ type in ClangIR, for TBAA or some other purpose, then we should do it right and have separate ClangIR types for all arithmetic language types, each with a corresponding underlying format attribute. But that would be a big change and not something to be done lightly.

@bcardosolopes
Copy link
Member

Agreed, same feeling here!

@PikachuHyA an alternative is to optionally attach the clangtype to CIRtypes, like we do for ast nodes on operations, TBAA could query that information to grasp higher level semantics.

@PikachuHyA
Copy link
Collaborator Author

Over the past few weeks, I explored three methods to handle the long long type.

First, I attempted to implement !cir.long_long<underlying>. This approach was rather tricky and introduced some bugs. I believe it is not the right way to handle the long long type.

Second, I investigated two ways to associate Clang types with CIR types. The first involved adding an interface for querying information, similar to ASTRecordDeclInterface. Unfortunately, It is not successful. The second method involved adding an AST attribute to TBAAScalarAttr. However, when lowering scalar types, this required the ASTContext. I had to pass a pointer to ASTContext from CIRGenConsumer::HandleTranslationUnit to lowerCIRTBAAAttr through a lengthy chain of calls. This requirement for ASTContext hindered the cir-translation because ASTContext was unavailable when using this tool.

In the end, I decided to encode the type name in TBAAScalarAttr while retaining the converted type. Here is the new implementation. #1329

@bcardosolopes
Copy link
Member

In the end, I decided to encode the type name in TBAAScalarAttr while retaining the converted type. Here is the new implementation

This sounds good to me, the other solution sounds a bit heavy handed for what we want to achieve at the moment. Thanks!

bcardosolopes pushed a commit that referenced this issue Feb 11, 2025
This patch introduces support for TBAA with scalar types. By encoding
the type name in the CIR, we address the limitation of distinguishing
between different C++ types that map to the same CIR type. For example,
long and long long types.
see #1241
@PikachuHyA
Copy link
Collaborator Author

Closing this issue because we have successfully merged #1329 .

lanza pushed a commit that referenced this issue Mar 18, 2025
This patch introduces support for TBAA with scalar types. By encoding
the type name in the CIR, we address the limitation of distinguishing
between different C++ types that map to the same CIR type. For example,
long and long long types.
see #1241
xlauko pushed a commit to trailofbits/instafix-llvm that referenced this issue Mar 28, 2025
This patch introduces support for TBAA with scalar types. By encoding
the type name in the CIR, we address the limitation of distinguishing
between different C++ types that map to the same CIR type. For example,
long and long long types.
see llvm/clangir#1241
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants