Custom image size for pretrained models #565

mxc19912008 · 2021-04-14T17:19:09Z

Hi,

I am trying to use the Transformer in Transformer (TNT) model, whose default image size is 224. Is there a way to use the pretrained models but change the image size to 320 or another size?

Thanks!

rwightman · 2021-04-14T18:26:30Z

Maintainer

Yes, you need to pass a different img_size when you create the model; it will interpolate the position embedding when it loads the pretrained weights. The image size needs to be evenly divisible by the patch size. This should work with vit, vit_deit, and vit_deit_distilled. It has not been implemented for pit, swin, and tnt yet.

However, you cannot just change the input image size on the fly like other models. There is a bit of loss in the interpolation, so it relies on fine-tuning to bring the accuracy back up on the differently sized model. The current train scripts don't have the img_size arg plumbed through (it would break the convnets right now), so if you use the scripts here you need to manually hack the img_size arg in.

>>> import timm
>>> m = timm.create_model('vit_deit_small_patch16_224', img_size=256, pretrained=True)
>>> m.pos_embed.shape
torch.Size([1, 257, 384])
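
The shape above follows from simple arithmetic: a 256-pixel image cut into 16-pixel patches gives a 16 x 16 grid, plus one class token. Here is a minimal sketch of that calculation, assuming a square image and a single class token. The helper name `expected_pos_embed_shape` is hypothetical and not part of timm.

```python
def expected_pos_embed_shape(img_size, patch_size, embed_dim, num_prefix_tokens=1):
    """Return the (1, tokens, embed_dim) shape a ViT position embedding would have.

    Hypothetical helper for illustration only; assumes square images and patches.
    """
    if img_size % patch_size != 0:
        raise ValueError(
            f"img_size={img_size} is not evenly divisible by patch_size={patch_size}"
        )
    grid = img_size // patch_size  # patches per side
    num_tokens = grid * grid + num_prefix_tokens  # patch tokens plus class token
    return (1, num_tokens, embed_dim)


# DeiT-small: patch size 16, embedding width 384, one class token.
print(expected_pos_embed_shape(224, 16, 384))  # (1, 197, 384)
print(expected_pos_embed_shape(256, 16, 384))  # (1, 257, 384)
```

For the distilled variants, which carry an extra distillation token, `num_prefix_tokens=2` would be the matching assumption.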

5 replies

mxc19912008 Apr 15, 2021
Author

Thanks! I was able to change other parameters, such as the patch size, to make the image size larger. That way the parameters of the model with the larger image size stay the same as those in the checkpoint.

JohnG0024 Feb 25, 2022

@rwightman Hi, I'd like to use vit_large_patch16_384, but its default image size is 384 while my images are 896. Should I do the same, like this?

>>> m = timm.create_model('vit_large_patch16_384', img_size=896, pretrained=True)
>>> m.pos_embed.shape
torch.Size([1, 3137, 1024])

Or is there another pretrained model I should use?

rwightman Feb 26, 2022
Maintainer

@JohnG0024 that's the closest one, but a big warning: it's going to be insanely expensive to train! These models do not scale well with resolution.
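
To give a rough sense of why resolution is costly: the token count grows with the square of the image side, and the attention score matrix grows with the square of the token count. The sketch below is back-of-the-envelope only. It assumes patch size 16 and square inputs, and it counts nothing but the size of the attention matrix, ignoring the MLP layers, memory limits, and everything else, so it is not a training-time estimate. The function names are hypothetical.

```python
def num_patches(img_size, patch_size=16):
    """Number of patch tokens for a square image (class token ignored)."""
    return (img_size // patch_size) ** 2


def relative_attention_cost(new_size, base_size, patch_size=16):
    """Ratio of attention-matrix entries (tokens squared) between two resolutions."""
    new_tokens = num_patches(new_size, patch_size)
    base_tokens = num_patches(base_size, patch_size)
    return (new_tokens / base_tokens) ** 2


print(num_patches(384))  # 576
print(num_patches(896))  # 3136
print(round(relative_attention_cost(896, 384), 1))  # 29.6
```

Under these assumptions, going from 384 to 896 multiplies the token count by about 5.4 and the attention matrix size by roughly 30.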

JohnG0024 Feb 26, 2022

@rwightman How expensive is it compared to training without a pre-trained model? Should I try tf_efficientnet_l2_ns instead? Its input shape is (3, 800, 800)

deshwalmahesh Mar 19, 2022

Hi, I am testing the Swin models on downstream tasks such as segmentation. The problem is that the original segmentation approach used UperNet with an image size of 512, but here we have a maximum size of 256. Suppose I want to use Swin as a backbone for DeepLabV3+ and need larger images: what would be the best way to do this? Do I just change the argument to img_size=512, or is there some other solution?
