feat: add prompt only image generation datasets #310

Merged
nifleisch merged 4 commits into main from feat/add-prompt-datasets on Sep 22, 2025

Conversation

@nifleisch
Collaborator

Description

Currently, we only support image-generation datasets consisting of prompt–image pairs. For many applications (e.g., evaluation agents, distillation), only the prompt is needed, and some benchmarking datasets for image-generation models consist of prompts only. This PR relaxes the requirement to always include images by adjusting the collate function. It also adds three common benchmarking datasets for image-generation models: DrawBench, PartiPrompts, and GenAiBench.
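The relaxed collate described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the real `image_generation_collate` also takes an `output_format` argument and returns a `torch.Tensor`, while here images are kept as plain lists so the example is self-contained.

```python
from typing import Any, List, Optional, Tuple


def image_generation_collate(
    data: List[dict], img_size: Optional[int] = None
) -> Tuple[List[str], Optional[List[Any]]]:
    # Collect the prompts, which every sample must provide.
    prompts = [item["text"] for item in data]
    # Prompt-only datasets (DrawBench, PartiPrompts, GenAIBench) carry no
    # images, so return None in place of an image batch.
    if "image" not in data[0]:
        return prompts, None
    return prompts, [item["image"] for item in data]


prompts, images = image_generation_collate(
    [{"text": "a red cube"}, {"text": "a blue sphere"}]
)
```

With a prompt–image dataset the second return value would hold the batched images; for the new prompt-only datasets it is simply `None`.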

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • I added tests for the newly added datasets that pass locally.
  • I tried the old and new image generation datasets together with the optimization agent.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

cursor[bot]

This comment was marked as outdated.

Member

@davidberenstein1957 davidberenstein1957 left a comment


Cool! Feel free to merge after the minor bug fixes and the tests pass.

"""
ds = load_dataset("sayakpaul/drawbench", trust_remote_code=True)["train"]
ds = ds.rename_column("Prompts", "text")
return ds.select([0]), ds.select([0]), ds
Member


Perhaps this is leftover from testing? I believe we would want to split the dataset into train, test, and val datasets?

Collaborator Author


This decision was made deliberately to ensure that the evaluation agent uses all prompts in the benchmark for evaluation. Because these are benchmarking datasets, they are not intended for training models. But maybe I am missing cases in which it is favorable to split the benchmark dataset into train, val, and test.

Member


Makes sense. For consistency, I would add a note explaining why you are doing that. Also, wouldn't it be better to provide an empty dataset?

Member


I second the idea of empty datasets for training and validation here :)

Member

@begumcig begumcig left a comment


THEY ARE HERE! You're a champ 🥇🥇🥇🥇. One teeeny request about passing empty datasets for train/val modes. Also left one suggestion about the collate, but only if you also think that makes sense, left it up to you. Everything already looks so good!

Comment thread src/pruna/data/collate.py Outdated


```diff
-def image_generation_collate(data: Any, img_size: int, output_format: str = "int") -> Tuple[List[str], torch.Tensor]:
+def image_generation_collate(
```
Member


I totally get why it feels natural to link prompt datasets with the image_generation_collate. At the same time, I wonder if it might make sense to introduce a really simple, pass-through style collate function specifically for prompt-only datasets. I can imagine us adding more and more datasets in the future, not just for images but for video generation too! So extending the current collate might not be the most sustainable approach long term.

Collaborator Author


Really good point. I did not think about video models. With them in mind it really makes sense to treat the prompt datasets separately.
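A dedicated pass-through collate as suggested above might look like this. The name `prompt_collate` matches the registry entry added later in the PR, but the body is a sketch of the obvious implementation, not the merged code.

```python
from typing import List


def prompt_collate(data: List[dict]) -> List[str]:
    # Pass the prompt strings through unchanged; no image tensor and no
    # img_size argument are needed for prompt-only datasets.
    return [item["text"] for item in data]


batch = [
    {"text": "a corgi wearing sunglasses"},
    {"text": "an oil painting of a lighthouse"},
]
prompts = prompt_collate(batch)
```

Because it never touches images, the same collate would work unchanged for future video-generation prompt datasets.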

"""
ds = load_dataset("sayakpaul/drawbench", trust_remote_code=True)["train"]
ds = ds.rename_column("Prompts", "text")
return ds.select([0]), ds.select([0]), ds
Member


I second the idea of empty datasets for training and validation here :)

Comment thread src/pruna/data/__init__.py Outdated
"CIFAR10": (setup_cifar10_dataset, "image_classification_collate", {"img_size": 32}),
"DrawBench": (setup_drawbench_dataset, "image_generation_collate", {"img_size": None}),
"PartiPrompts": (setup_parti_prompts_dataset, "image_generation_collate", {"img_size": None}),
"GenAIBench": (setup_genai_bench_dataset, "image_generation_collate", {"img_size": None}),
Member


I left a more detailed comment below, but if we were to have a separate collate, we also wouldn't have to set the image size for prompt datasets!


github-actions bot commented Sep 7, 2025

This PR has been inactive for 10 days and is now marked as stale.

@github-actions github-actions bot added the stale label Sep 7, 2025
@nifleisch nifleisch force-pushed the feat/add-prompt-datasets branch from e4ea39d to 1b8f003 on September 8, 2025 12:06
@nifleisch nifleisch requested a review from begumcig September 8, 2025 12:26
Member

@begumcig begumcig left a comment


Amazing job Nils! Super excited to use these datasets already 🏆🏆🏆. It could be useful to add an info message to the user about how they should use the test split, but everything looks super good to me, so already approved!

"Polyglot": (setup_polyglot_dataset, "question_answering_collate", {}),
"OpenImage": (setup_open_image_dataset, "image_generation_collate", {"img_size": 1024}),
"CIFAR10": (setup_cifar10_dataset, "image_classification_collate", {"img_size": 32}),
"DrawBench": (setup_drawbench_dataset, "prompt_collate", {}),
Member


slay

"""
ds = load_dataset("BaiqiL/GenAI-Bench")["train"]
ds = ds.rename_column("Prompt", "text")
return ds.select([0]), ds.select([0]), ds
Member


After our discussion I see why we cannot pass an empty dataset for the train and validation. Do you think it would make sense to print an info / warning in the setup functions to let people know they should be using the test dataloader?

Copy link
Copy Markdown
Collaborator Author


Yes, that sounds like a good idea. Will log an info message when loading the dataset.
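The agreed-upon info message could be sketched as below. The logger name `pruna.data` and the helper `log_benchmark_notice` are assumptions for illustration; the actual pruna logging setup and where the call lands in the setup functions may differ. The `StringIO` handler is only here to make the example self-contained.

```python
import io
import logging

# Hypothetical logger name; pruna's real logging configuration may differ.
logger = logging.getLogger("pruna.data")
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.INFO)


def log_benchmark_notice(name: str) -> None:
    # Called from the dataset setup function so users know the train/val
    # splits are single-row placeholders and only the test split matters.
    logger.info(
        "%s is a benchmark dataset: the train and val splits are placeholders; "
        "use the test dataloader for evaluation.",
        name,
    )


log_benchmark_notice("DrawBench")
```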

@github-actions github-actions bot removed the stale label Sep 9, 2025

This PR has been inactive for 10 days and is now marked as stale.

@github-actions github-actions bot added the stale label Sep 20, 2025
@nifleisch nifleisch merged commit 3d99160 into main Sep 22, 2025
7 checks passed
@nifleisch nifleisch deleted the feat/add-prompt-datasets branch September 22, 2025 08:56