
Simplify lowering of aten.reflection_pad2d to linalg #2772

Open · wants to merge 1 commit into base: main
Conversation

frederik-h (Contributor)

The lowering uses a tensor.extract_slice operation to extract the input slices to be used for the padding, followed by a linalg.generic operation that reverses the order of their elements as required for the reflection.

This PR removes the linalg.generic. Instead, it uses the ability of tensor.extract_slice to extract a reversed slice by specifying negative strides together with offsets that point to the end of the slice in the input tensor. This simplifies the generated IR and potentially makes it more optimization-friendly.
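
For illustration, a rough sketch of the two forms (hypothetical shapes: a left reflection pad of width 2 on a 1x1x4x4 input, so the pad consists of columns 1 and 2 in reversed order):

```mlir
// Old form: extract the slice forward, then reverse it.
%fwd = tensor.extract_slice %input[0, 0, 0, 1] [1, 1, 4, 2] [1, 1, 1, 1]
    : tensor<1x1x4x4xf32> to tensor<1x1x4x2xf32>
// ... followed by a linalg.generic that reverses the innermost dimension ...

// New form: a single reversed extraction. The offset points at the last
// element of the slice and the stride of the padded dimension is -1.
%left = tensor.extract_slice %input[0, 0, 0, 2] [1, 1, 4, 2] [1, 1, 1, -1]
    : tensor<1x1x4x4xf32> to tensor<1x1x4x2xf32>
```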

@newling (Collaborator) left a comment

Nice simplification!

@frederik-h (Contributor, Author)

> Nice simplification!

Thanks!

@qedawkins (Collaborator) left a comment

Just want to get a second opinion first. @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?

@frederik-h (Contributor, Author)

> Just want to get a second opinion first.
> @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?

I found out that they are supported because of this patch: https://reviews.llvm.org/D134147. If it turns out that there is no reason not to use them, as I expect, then I think the tensor dialect documentation and test suite should be extended to cover this possibility.

@qedawkins (Collaborator)

> Just want to get a second opinion first.
> @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?
>
> I found out that they are supported because of this patch: https://reviews.llvm.org/D134147. If it turns out that there is no reason not to use them, as I expect, then I think the tensor dialect documentation and test suite should be extended to cover this possibility.

Not saying we shouldn't support this, just that there are a few places where I could see this going wrong. I'll unblock for now, but (at least from IREE's perspective, which I think is the main consumer of these linalg lowerings) it's not immediately clear to me that this is better than the previous lowering. The end state I'm expecting anyway is to support + use tensor.pad.

I hope my first message didn't come off the wrong way either; I do appreciate the code changes/simplifications here. Thanks for the good work :)

@qedawkins self-requested a review · January 22, 2024 17:22
@qedawkins dismissed their stale review ("unblocking") · January 22, 2024 17:23

@frederik-h (Contributor, Author)

> Just want to get a second opinion first.
> @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?
>
> I found out that they are supported because of this patch: https://reviews.llvm.org/D134147. If it turns out that there is no reason not to use them, as I expect, then I think the tensor dialect documentation and test suite should be extended to cover this possibility.
>
> Not saying we shouldn't support this, just that there are a few places where I could see this going wrong. I'll unblock for now

Thanks, but I will wait a bit for feedback from @MaheshRavishankar before I merge the changes.

> The end state I'm expecting anyway is to support + use tensor.pad.

Something like this would probably be ideal. However, I am wondering whether it would make sense to extend tensor.pad so that it can take tensors for the padding on the different sides of the input, instead of using tensor.extract and complex select operations in the body of tensor.pad.
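
For reference, a minimal sketch of the body-based form with today's tensor.pad (hypothetical shapes: a 1x1x4x4 input padded by 2 on each spatial side); the region yields one scalar per padded position:

```mlir
// Constant padding with today's tensor.pad: the region is invoked for
// every position in the padded area and yields the value to place there.
%cst = arith.constant 0.0 : f32
%padded = tensor.pad %input low[0, 0, 2, 2] high[0, 0, 2, 2] {
^bb0(%n: index, %c: index, %h: index, %w: index):
  tensor.yield %cst : f32
} : tensor<1x1x4x4xf32> to tensor<1x1x8x8xf32>
```

To express reflection in this form, the body would instead have to compute reflected indices from (%h, %w) with arith/select ops and a tensor.extract from %input, which is the per-element logic that passing precomputed padding tensors for each side would avoid.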

> I hope my first message didn't come off the wrong way either; I do appreciate the code changes/simplifications here. Thanks for the good work :)

It didn't! Thanks for your feedback.

@MaheshRavishankar (Contributor)

I think it depends on what torch-mlir wants to model, but in general I think negative strides are a bad idea. They add unnecessary complications to the rest of the stack. I am fairly certain this is a relatively untested path (the patch is from the end of 2022, and I am not sure how well it was plumbed through).
I don't think I have standing to block this, because these are really downstream issues, but I would try to avoid negative strides without making sure they work in an end-to-end stack.

@frederik-h (Contributor, Author)

> I think it depends on what torch-mlir wants to model, but in general I think negative strides are a bad idea. They add unnecessary complications to the rest of the stack.

@stellaraccident What do you think? I just wanted to try this out to see whether it works, and I do not have any strong opinions on which approach is better. I see that this could cause downstream issues if the possibility of negative strides has not been documented previously and there have been no uses so far.

@stellaraccident (Collaborator) commented Feb 8, 2024

I expect that Ivan actually added it to support cases like this (but coming from users, where you have fewer options to avoid it), given what I know of his work.

Stride hacks like this are a common thing to do in linear algebra libraries, but as Mahesh says... Sometimes it is best to stay on more sure footing.

You can ask Ivan (he's on Discord as hc84) and/or test it out and understand the corners.

Since this is an "internal" detail vs something we're trying to model from the user, I would probably take the more sure path and focus my time on getting to the end state.
