https://github.com/hazan-lab/flash-stu/blob/54f78d4b91cc068ccc23554172beab1277aa8ab1/flash_stu/utils/stu_utils.py#L55
Are you sure that the model is causal?
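
For context, here is a minimal sketch of the kind of check I have in mind. It is not taken from flash-stu: `is_causal`, `causal_conv`, and `circular_conv` are hypothetical names used only for illustration, and the two toy convolutions are stand-ins, not the repository's code. The idea is to change the input from position `t` onward and confirm that outputs before `t` stay the same. A circular convolution (which is what an FFT-based convolution computes when the input is not zero-padded) fails this check, while a properly causal one passes.

```python
import random

def is_causal(f, seq_len=16, trials=20, tol=1e-9, seed=0):
    """Empirical causality check for a sequence map f: list -> list.

    Perturb the input at positions >= t and verify that the outputs at
    positions < t do not change.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.gauss(0, 1) for _ in range(seq_len)]
        t = rng.randrange(1, seq_len)
        v = list(u)
        for i in range(t, seq_len):
            v[i] = rng.gauss(0, 1)  # change only the "future" inputs
        y_u, y_v = f(u), f(v)
        if any(abs(a - b) > tol for a, b in zip(y_u[:t], y_v[:t])):
            return False
    return True

def causal_conv(u, k=(0.5, 0.3, 0.2)):
    # y[t] = sum_j k[j] * u[t - j], using only present and past inputs
    return [
        sum(k[j] * u[t - j] for j in range(len(k)) if t - j >= 0)
        for t in range(len(u))
    ]

def circular_conv(u, k=(0.5, 0.3, 0.2)):
    # Indices wrap around, so early outputs depend on late inputs
    n = len(u)
    return [sum(k[j] * u[(t - j) % n] for j in range(len(k))) for t in range(n)]

print(is_causal(causal_conv))    # True
print(is_causal(circular_conv))  # False
```

Running the same kind of perturbation test against the model's forward pass (with the model wrapped as `f`) would settle the question either way.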