I understand why F16 is required for linear and slerp, but could passthrough accept quantized layers directly? Currently it's necessary to go via the huge full-precision models and then requantize, which is a big pain point. A rough sketch of the reasoning is below.
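To illustrate the distinction (names here are purely illustrative, not the project's actual API): linear and slerp have to do arithmetic on the weight values, which forces dequantization to F16/F32, whereas passthrough only copies a layer, so in principle the quantized bytes could be forwarded untouched.

```python
# Hypothetical sketch, not the real merge code -- just showing why
# passthrough doesn't inherently need full-precision tensors.
import numpy as np

def linear_merge(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    # Interpolation does real arithmetic on the values,
    # so the tensors must be dequantized (e.g. to F16) first.
    return (1.0 - t) * a + t * b

def passthrough(tensor_bytes: bytes) -> bytes:
    # No arithmetic at all: the layer is copied verbatim,
    # so the original quantized representation could be kept as-is.
    return tensor_bytes
```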