feat: add prompt-only image generation datasets #310
Conversation
davidberenstein1957
left a comment
Cool! Feel free to merge after the minor bug fixes and the tests pass.
| """ | ||
| ds = load_dataset("sayakpaul/drawbench", trust_remote_code=True)["train"] | ||
| ds = ds.rename_column("Prompts", "text") | ||
| return ds.select([0]), ds.select([0]), ds |
Perhaps this is leftover from testing? I believe we would want to split the dataset into train, test, and val datasets?
This decision was made deliberately to ensure that the evaluation agent uses all prompts in the benchmark for evaluation. Because these are benchmarking datasets, they are not intended for training models. But maybe I am missing cases in which it is favorable to split the benchmark dataset into train, val, and test.
Makes sense. For consistency, I would add a note explaining why you are doing that. Also, wouldn't it be better to provide an empty dataset?
I second the idea of empty datasets for training and validation here :)
begumcig
left a comment
THEY ARE HERE! You're a champ 🥇🥇🥇🥇. One teeeny request about passing empty datasets for train/val modes. Also left one suggestion about the collate, but only if you also think that makes sense, left it up to you. Everything already looks so good!
def image_generation_collate(data: Any, img_size: int, output_format: str = "int") -> Tuple[List[str], torch.Tensor]:
def image_generation_collate(
I totally get why it feels natural to link prompt datasets with the image_generation_collate. At the same time, I wonder if it might make sense to introduce a really simple, pass-through style collate function specifically for prompt-only datasets. I can imagine us adding more and more datasets in the future, not just for images but for video generation too! So extending the current collate might not be the most sustainable approach long term.
Really good point. I did not think about video models. With them in mind it really makes sense to treat the prompt datasets separately.
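Such a pass-through collate could be very small. A minimal sketch, assuming prompt batches arrive as lists of dicts with a "text" column (the name prompt_collate matches the registry entry later in this PR; the batch structure is an assumption):

```python
from typing import Any, List

def prompt_collate(data: Any) -> List[str]:
    """Pass-through collate for prompt-only datasets: extract the prompt
    strings from a batch, with no image tensor and no img_size argument."""
    return [sample["text"] for sample in data]

batch = [{"text": "a red cube"}, {"text": "two cats playing chess"}]
print(prompt_collate(batch))  # ['a red cube', 'two cats playing chess']
```

Because it never touches pixels, the same function would work unchanged for video-generation prompt benchmarks.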
| """ | ||
| ds = load_dataset("sayakpaul/drawbench", trust_remote_code=True)["train"] | ||
| ds = ds.rename_column("Prompts", "text") | ||
| return ds.select([0]), ds.select([0]), ds |
There was a problem hiding this comment.
I second the idea of empty datasets for training and validation here :)
| "CIFAR10": (setup_cifar10_dataset, "image_classification_collate", {"img_size": 32}), | ||
| "DrawBench": (setup_drawbench_dataset, "image_generation_collate", {"img_size": None}), | ||
| "PartiPrompts": (setup_parti_prompts_dataset, "image_generation_collate", {"img_size": None}), | ||
| "GenAIBench": (setup_genai_bench_dataset, "image_generation_collate", {"img_size": None}), |
I left a more detailed comment below, but if we were to have a separate collate, we also wouldn't have to set the image size for prompt datasets!
This PR has been inactive for 10 days and is now marked as stale.
Branch updated: e4ea39d to 1b8f003
begumcig
left a comment
Amazing job Nils! Super excited to use these datasets already 🏆🏆🏆. It could be useful to add an info message to the user about how they should use the test split, but everything looks super good to me, so already approved!
| "Polyglot": (setup_polyglot_dataset, "question_answering_collate", {}), | ||
| "OpenImage": (setup_open_image_dataset, "image_generation_collate", {"img_size": 1024}), | ||
| "CIFAR10": (setup_cifar10_dataset, "image_classification_collate", {"img_size": 32}), | ||
| "DrawBench": (setup_drawbench_dataset, "prompt_collate", {}), |
| """ | ||
| ds = load_dataset("BaiqiL/GenAI-Bench")["train"] | ||
| ds = ds.rename_column("Prompt", "text") | ||
| return ds.select([0]), ds.select([0]), ds |
After our discussion I see why we cannot pass an empty dataset for the train and validation. Do you think it would make sense to print an info / warning in the setup functions to let people know they should be using the test dataloader?
Yes, that sounds like a good idea. Will log an info message when loading the dataset.
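The shape of that info message could look like the following sketch, using the standard logging module. The helper name setup_prompt_benchmark, the load_fn parameter, and the message wording are illustrative assumptions, not code from this PR:

```python
import logging

logger = logging.getLogger(__name__)

def setup_prompt_benchmark(load_fn):
    """Load a prompt-only benchmark and return (train, val, test) splits.

    Benchmarks are evaluation-only, so train/val are single-sample
    placeholders and the full dataset is returned as the test split.
    """
    ds = load_fn()  # e.g. load_dataset("sayakpaul/drawbench")["train"]
    logger.info(
        "Prompt-only benchmark: train/val are placeholder splits; "
        "use the test dataloader for evaluation."
    )
    return ds[:1], ds[:1], ds

# Toy stand-in for a Hugging Face dataset, to keep the sketch runnable:
train, val, test = setup_prompt_benchmark(lambda: ["a red cube", "a blue sphere"])
print(len(test))  # 2
```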
Description
Currently, we only support image-generation datasets consisting of prompt–image pairs. For many applications (e.g., evaluation agents, distillation), only the prompt is needed, and some benchmarking datasets for image-generation models consist of prompts only. This PR relaxes the requirement to always include images by adjusting the collate function. It also adds three common benchmarking datasets for image-generation models: DrawBench, PartiPrompts, and GenAIBench.
Type of Change
How Has This Been Tested?
Checklist