Text-to-Image Models - HKU Business School AI Evaluation Lab

Text-to-Image Models

Text-to-image models are machine learning models that create images based on textual descriptions, translating written prompts into visual content. They are commonly used in art generation, design prototyping, and creative content creation across various industries.

Model	Version	Institution
360 Zhihui	360 Zhihui	360
Wanx 2	Wanx-v2	Alibaba
Wenxin Yige 2	Wenxin Yige 2	Baidu
Dreamina	Dreamina	ByteDance
DeepSeek Janus-Pro	DeepSeek Janus-Pro	DeepSeek
SenseMirage 5	SenseMirage V5.0	SenseTime
Hunyuan-DiT	Hunyuan-DiT	Tencent
MiaoBiShengHua	MiaoBiShengHua	Vivo
CogView3-Plus	CogView3-Plus	Zhipu AI
DALL-E 3	DALL-E 3	OpenAI
FLUX.1 Pro	FLUX.1 Pro	Black Forest Labs
Imagen 3	Imagen 3	Alphabet (Google)
Midjourney	Midjourney v6.1	Midjourney
Playground	Playground v2.5	Playground AI
Stable Diffusion 3	Stable Diffusion 3 Large	Stability AI

Leaderboards

Image Generation

Note: This list was last updated in January 2025.