inception_v3 requires an input size of (299, 299),
while the other models require an input size of (224, 224).
Because some models use adaptive pooling,
they can run on inputs of varying size without throwing errors
(but the results are usually not correct).
You have to resize/crop an image to the right input size
(and then apply the other necessary transformations, e.g., to_tensor and normalization).
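A minimal preprocessing sketch using torchvision, assuming ImageNet-style normalization; the resize size, the mean/std values, and the file name example.jpg below are common defaults chosen for illustration, not values taken from this document:

```python
from PIL import Image
from torchvision import transforms

# Resize/crop to the model's expected input size, then convert to a tensor
# and normalize. For inception_v3, crop to 299 instead of 224.
preprocess = transforms.Compose([
    transforms.Resize(256),            # shorter side -> 256 pixels
    transforms.CenterCrop(224),        # final spatial size (299 for inception_v3)
    transforms.ToTensor(),             # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # assumed ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("example.jpg").convert("RGB")   # hypothetical input file
batch = preprocess(img).unsqueeze(0)             # add batch dim: (1, 3, 224, 224)
```

The resulting batch tensor can then be passed directly to the model's forward call.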