Image Generation Capability Evaluation Framework
The evaluation framework consists of two core tasks: Generation of New
Images and Revision of Existing Images, as shown in the figure below.
Generation of New Images
As a foundational task, the generation of new images evaluates whether the
model can accurately produce visuals based on textual input while adhering
to ethical and legal standards. This task emphasizes two main dimensions:
(1) image content quality and (2) safety and responsibility. Image content
quality is assessed by examining the image's alignment with the instruction,
image integrity, and image aesthetics. Safety and responsibility, in turn,
focuses on ensuring that generated images do not contain bias or discrimination,
promote illegal or dangerous content, violate ethical norms, infringe on
copyright, or breach privacy or portrait rights.
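
The two dimensions and their criteria can be written down as a small
checklist structure. The sketch below is only an illustration: the
identifiers (GENERATION_CRITERIA, missing_criteria) and the snake_case
criterion names are assumptions made for this example, not part of the
framework itself. It shows one way to confirm that a rating sheet covers
every criterion listed above.

```python
from __future__ import annotations

# Hypothetical encoding of the two dimensions described above.
# The criterion names paraphrase the text; they are not official identifiers.
GENERATION_CRITERIA: dict[str, list[str]] = {
    "image_content_quality": [
        "alignment_with_instruction",
        "image_integrity",
        "image_aesthetics",
    ],
    "safety_and_responsibility": [
        "bias_or_discrimination",
        "illegal_or_dangerous_content",
        "ethical_norm_violation",
        "copyright_infringement",
        "privacy_or_portrait_rights_breach",
    ],
}


def missing_criteria(ratings: dict[str, dict[str, object]]) -> list[str]:
    """Return 'dimension.criterion' for every criterion that has no rating."""
    missing = []
    for dimension, criteria in GENERATION_CRITERIA.items():
        rated = ratings.get(dimension, {})
        for criterion in criteria:
            if criterion not in rated:
                missing.append(f"{dimension}.{criterion}")
    return missing


# Usage: a sheet that rates only content quality is flagged as incomplete.
sheet = {
    "image_content_quality": {
        "alignment_with_instruction": 4,
        "image_integrity": 5,
        "image_aesthetics": 3,
    }
}
print(missing_criteria(sheet))  # lists the five safety criteria
```

Keeping the criteria in one structure means a rating sheet can be checked
for completeness before any scores are compared across models.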
Revision of Existing Images
As an advanced task, image revision evaluates a model's capability to make
accurate and meaningful modifications to existing images according to
textual instructions. The assessment criteria parallel the image content
quality criteria used in new image generation: the image's alignment with
references, image integrity, and image aesthetics. This task tests the model's ability
to understand context and apply nuanced edits that align with both the
source content and the prompt.
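
A single revision test case can be pictured as a record holding the source
image, the instruction, and one rating per criterion. The sketch below is
hypothetical throughout: the RevisionCase class, its field names, the
numeric ratings, and the unweighted mean used as a summary are all
assumptions for illustration, since the text does not specify a rating
scale or an aggregation rule.

```python
from dataclasses import dataclass
from statistics import mean

# The three revision criteria named in the text (identifier names are assumed).
REVISION_CRITERIA = (
    "alignment_with_references",
    "image_integrity",
    "image_aesthetics",
)


@dataclass
class RevisionCase:
    """One hypothetical revision test case and its per-criterion ratings."""

    source_image: str          # path to the existing image (assumed field)
    instruction: str           # textual edit instruction
    ratings: dict              # criterion name -> numeric rating (assumed scale)

    def summary(self) -> float:
        """Unweighted mean over the three criteria (an assumed aggregation)."""
        missing = set(REVISION_CRITERIA) - set(self.ratings)
        unknown = set(self.ratings) - set(REVISION_CRITERIA)
        if missing or unknown:
            raise ValueError(
                f"missing={sorted(missing)}, unknown={sorted(unknown)}"
            )
        return mean(self.ratings[c] for c in REVISION_CRITERIA)


# Usage with made-up ratings, for illustration only.
case = RevisionCase(
    source_image="example.png",
    instruction="Change the sky to sunset colors.",
    ratings={
        "alignment_with_references": 4,
        "image_integrity": 5,
        "image_aesthetics": 3,
    },
)
print(case.summary())
```

Rejecting incomplete or misnamed ratings up front keeps a partially judged
edit from being summarized as if all three criteria had been assessed.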