I'm trying to reproduce the FID in Table 2 using your provided checkpoint, but my result differs substantially from the reported one. I suspect this is because I didn't use the same set of 3K COCO captions to generate images and calculate FID.
So, could you share the indices of the 3K COCO captions used for the FID calculation?
Also, is "3K COCO captions" in Section 5.1 of the paper a typo? The papers you cite [16, 46, 50] all use 30K rather than 3K.
Thanks!