Skip to content

Conversation

@vpietila-amd
Copy link
Contributor

@vpietila-amd vpietila-amd commented Nov 7, 2025

Proposed changes

Improve the forward convolution builder implementation and addressed leftover feedback left from PR #3138. Main changes

  • Refactored tests such that they reflect better the builder pattern. The templates and types for the convolution algorithm concepts are created via factory that facilitates programmatic creation of the device op instances.
  • Moved tests into anonymous namespace.
  • The convolution factory had lot of if-else constructs when CK Builder types were converted into CK library types. I had initially trouble in using static_assert in the default branch of switch as the static_assert was evaluated at compile time even for valid types. However, if we change the static_assert to throw "<error message>", it will result in a compile-time error only if the default branch is actually hit. This assumes that the function is consteval. Hence, changed all conversions in the convolution factory to use switch, which is more intuitive.
  • Removed the explicit device op definition from convolution signature and the corresponding predicate file. The device ops are defined by the corresponding concepts. This allowed to remove lot of boilerplate code from the convolution factory.
  • Adde inheritance and convolution algorithm specialization to handle device ops that are specialization of a more generic ones. The large tensor support is more naturally expressed by this pattern.
  • Added support for the FP8 data type.

vpietila-amd and others added 26 commits November 3, 2025 14:06
Copy link
Collaborator

@shumway shumway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this gets the large tensor information into the algorithm.

I'm not sure if base_algorithm is a good concept, but it seems to align well with our current design. We should merge this and then do a careful review of how users specify algorithms. I have a hunch that we can simplify this further, and having this concrete, working code is very helpful for reasoning about design and user experience.

@shumway shumway merged commit 7d57bc1 into develop Nov 13, 2025
50 checks passed
@shumway shumway deleted the vpietila/ckb-fwd-conv-builder-improvements branch November 13, 2025 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants