Image Generation Capability Evaluation Framework

The evaluation framework consists of two core tasks: Generation of New Images and Revision of Existing Images, as shown in the figure below.

Generation of New Images

As the foundational task, the generation of new images evaluates whether the model can accurately produce images from textual input while adhering to ethical and legal standards. The task is assessed along two main dimensions: (1) image content quality and (2) safety and responsibility. Image content quality covers the image's alignment with the instruction, image integrity, and image aesthetics. Safety and responsibility covers whether the generated image is free of bias or discrimination, does not promote illegal or dangerous content, does not violate ethical norms, and does not infringe copyright or breach privacy or portrait rights.
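To make the two dimensions concrete, the following is a minimal sketch of how one evaluation record for a generated image might be represented. The class names, the field names, and the 1-5 rating scale are illustrative assumptions; the framework description does not specify a scoring scale or a data format.

```python
from dataclasses import dataclass, field
from enum import Enum


class SafetyIssue(Enum):
    """Safety and responsibility categories named in the task description."""
    BIAS_OR_DISCRIMINATION = "bias or discrimination"
    ILLEGAL_OR_DANGEROUS = "illegal or dangerous content"
    ETHICAL_VIOLATION = "violation of ethical norms"
    COPYRIGHT_INFRINGEMENT = "copyright infringement"
    PRIVACY_OR_PORTRAIT_RIGHTS = "breach of privacy or portrait rights"


@dataclass
class GenerationEvaluation:
    """One evaluated image (hypothetical record layout).

    The three quality fields use an assumed 1-5 scale, which is not
    part of the framework description.
    """
    alignment_with_instruction: int
    image_integrity: int
    image_aesthetics: int
    safety_issues: set = field(default_factory=set)

    def quality_scores(self) -> dict:
        # Collect the three content quality sub-dimensions by name.
        return {
            "alignment with instruction": self.alignment_with_instruction,
            "image integrity": self.image_integrity,
            "image aesthetics": self.image_aesthetics,
        }

    def is_safe(self) -> bool:
        # An image counts as safe only if no safety issue was flagged.
        return len(self.safety_issues) == 0


record = GenerationEvaluation(
    alignment_with_instruction=4,
    image_integrity=5,
    image_aesthetics=3,
    safety_issues={SafetyIssue.COPYRIGHT_INFRINGEMENT},
)
print(record.quality_scores())
print(record.is_safe())  # False, because one issue was flagged
```

Keeping the safety categories separate from the quality ratings mirrors the split between the two dimensions: in this sketch an image can rate highly on quality and still be flagged as unsafe.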

Revision of Existing Images

As the advanced task, image revision evaluates a model's ability to make accurate and meaningful modifications to existing images according to textual instructions. The assessment criteria are similar to those used for new image generation: the image's alignment with references, image integrity, and image aesthetics. This task tests the model's ability to understand context and apply nuanced edits that remain consistent with both the source content and the prompt.
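The relationship between the two tasks' quality criteria can be sketched as a small lookup table. The dictionary layout, the task keys, and the helper names below are hypothetical; only the criterion names come from the descriptions above.

```python
# Quality criteria per task, as named in the task descriptions.
# The keys "generation" and "revision" are illustrative labels.
QUALITY_CRITERIA = {
    "generation": (
        "alignment with instruction",
        "image integrity",
        "image aesthetics",
    ),
    "revision": (
        "alignment with references",
        "image integrity",
        "image aesthetics",
    ),
}


def shared_criteria(task_a: str, task_b: str) -> list:
    """Criteria that appear in both tasks, in task_a's order."""
    return [c for c in QUALITY_CRITERIA[task_a] if c in QUALITY_CRITERIA[task_b]]


def task_specific_criteria(task: str, other: str) -> list:
    """Criteria that appear in `task` but not in `other`."""
    return [c for c in QUALITY_CRITERIA[task] if c not in QUALITY_CRITERIA[other]]


print(shared_criteria("generation", "revision"))
print(task_specific_criteria("revision", "generation"))
```

Under this reading, the two tasks share image integrity and image aesthetics and differ only in what the output is aligned against: the instruction for generation, and the references for revision.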