I was reading the documentation here regarding layout composition. However, I found that this piece of code can reject some valid cases. Here is one example:
#include <cuda.h>
#include <cstdio>
#include <stdlib.h>
#include <cute/tensor.hpp>
#include <type_traits>

using namespace cute;

int main()
{
  auto l1 = make_layout(Shape<_3,Shape<_2,_2>>{}, Stride<_16,Stride<_80,_4>>{});
  auto l2 = make_layout(Shape<_10,_2>{}, Stride<_16,_4>{});
  auto l3 = make_layout(Shape<_3,_4>{}, Stride<_1,_5>{});

  // l1(i) and l2(l3(i)) agree for every index i in l1's domain
  for (int i = 0; i < 12; i++) {
    printf("trying %d res1= %d res2 = %d\n", i, l1(i), l2(l3(i)));
  }

  // prints 1: l3 is compatible with l1
  printf("%d\n", (int)compatible(l3, l1)());

  //auto cc = composition(l2,l3);
}
Here is the output:
trying 0 res1= 0 res2 = 0
trying 1 res1= 16 res2 = 16
trying 2 res1= 32 res2 = 32
trying 3 res1= 80 res2 = 80
trying 4 res1= 96 res2 = 96
trying 5 res1= 112 res2 = 112
trying 6 res1= 4 res2 = 4
trying 7 res1= 20 res2 = 20
trying 8 res1= 36 res2 = 36
trying 9 res1= 84 res2 = 84
trying 10 res1= 100 res2 = 100
trying 11 res1= 116 res2 = 116
1
As you can see, l1 is a composition of l2 and l3: l1(i) equals l2(l3(i)) for every i, and l3 is compatible with l1, so by definition l1 is the composition of l2 and l3. However, note that I commented out
//auto cc = composition(l2,l3);
in the code above. If I enable
auto cc = composition(l2,l3);
the program fails to compile; the library treats this as an invalid case. This doesn't really make much sense, does it?
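To make the claim concrete, here is a minimal sketch (plain C++, no CuTe; eval2 is just an illustrative helper, not part of the library) that traces every index through the column-major index formulas for l3, l2, and l1 by hand. It reproduces the table above, with l2(l3(i)) and l1(i) agreeing at every step:

#include <cstdio>

// Evaluate a rank-2 column-major layout (s0,s1):(d0,d1) at flat index i.
static int eval2(int i, int s0, int s1, int d0, int d1) {
  return (i % s0) * d0 + ((i / s0) % s1) * d1;
}

int main() {
  for (int i = 0; i < 12; i++) {
    // l2(l3(i)) with l3 = (3,4):(1,5) and l2 = (10,2):(16,4)
    int via_l2_l3 = eval2(eval2(i, 3, 4, 1, 5), 10, 2, 16, 4);
    // l1(i) with l1 = (3,(2,2)):(16,(80,4)), the nested mode flattened by hand
    int i0 = i % 3, i1 = (i / 3) % 2, i2 = (i / 6) % 2;
    int via_l1 = i0 * 16 + i1 * 80 + i2 * 4;
    printf("i=%2d  l2(l3(i))=%3d  l1(i)=%3d\n", i, via_l2_l3, via_l1);
  }
}

For reference, here is the same program again with the composition line enabled: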
#include <cuda.h>
#include <cstdio>
#include <stdlib.h>
#include <cute/tensor.hpp>
#include <type_traits>

using namespace cute;

int main()
{
  auto l1 = make_layout(Shape<_3,Shape<_2,_2>>{}, Stride<_16,Stride<_80,_4>>{});
  auto l2 = make_layout(Shape<_10,_2>{}, Stride<_16,_4>{});
  auto l3 = make_layout(Shape<_3,_4>{}, Stride<_1,_5>{});

  for (int i = 0; i < 12; i++) {
    printf("trying %d res1= %d res2 = %d\n", i, l1(i), l2(l3(i)));
  }

  printf("%d\n", (int)compatible(l3, l1)());

  auto cc = composition(l2, l3);   // enabling this line breaks the build
}
Here is the error:
cutlass/include/cute/int_tuple.hpp(404): error: static assertion failed with "Static shape_div failure"
detected during:
instantiation of "auto cute::shape_div(const IntTupleA &, const IntTupleB &) [with IntTupleA=cute::_3, IntTupleB=cute::C<10>]"
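For what it's worth, my reading of the failed instantiation shape_div(_3, C<10>) (an assumption based on the error message, not on the library source) is that the static composition path insists that one extent evenly divide the other, and neither 3 % 10 == 0 nor 10 % 3 == 0 holds here, even though the composition is perfectly well-defined pointwise. A tiny standalone check of that condition:

#include <cstdio>

// Hypothetical restatement of the divisibility condition the error points at:
// shape_div(a, b) appears to require a % b == 0 or b % a == 0.
// Here a = 3 (l3's first mode) and b = 10 (l2's first mode), so neither holds.
int main() {
  int a = 3, b = 10;
  bool ok = (a % b == 0) || (b % a == 0);
  printf("shape_div(%d, %d): %s\n", a, b, ok ? "ok" : "would fail the static assertion");
}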
Good observation. This is known, and more robust and efficient versions of almost all CuTe operations will be released soon, along with a corresponding whitepaper and updated documentation proving/describing the CuTe core.