Hello,
I am creating a ViT-Base model with patch size 16, using the vision transformer class from https://github.com/facebookresearch/dino/blob/main/vision_transformer.py, but I get the following error when loading the checkpoint:
size mismatch for pos_embed: copying a param with shape torch.Size([1, 198, 768]) from checkpoint, the shape in current model is torch.Size([1, 197, 768]).
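To show where the 197 on my side comes from, here is a minimal sketch of the token count. It assumes a 224x224 input and one class token. The helper name `expected_pos_embed_tokens` is hypothetical and only for illustration; it is not part of the DINO code.

```python
def expected_pos_embed_tokens(img_size=224, patch_size=16, extra_tokens=1):
    """Positions in pos_embed: one per image patch plus any extra tokens (e.g. the class token)."""
    patches_per_side = img_size // patch_size  # 224 // 16 = 14
    return patches_per_side ** 2 + extra_tokens


# Assumption: 224x224 input, patch size 16, a single class token.
model_tokens = expected_pos_embed_tokens()

# Second dimension of pos_embed reported for the checkpoint in the error above.
checkpoint_tokens = 198

print(model_tokens)                      # 197
print(checkpoint_tokens - model_tokens)  # 1
```

So the checkpoint appears to carry one more position than my model expects, possibly an additional token besides the class token, but that is only a guess.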
Could you tell me what might be causing this difference?
Thanks,
Rohan