College of Computer Science and Software Engineering, Shenzhen University

Creative Blends of Visual Concepts

ACM Conference on Human Factors in Computing Systems (CHI)

Zhida Sun¹, Zhenyao Zhang¹, Yue Zhang¹, Min Lu¹, Dani Lischinski², Daniel Cohen-Or¹,³, Hui Huang¹*

¹Shenzhen University ²Hebrew University of Jerusalem ³Tel Aviv University

Abstract

Visual blends combine elements from two distinct visual concepts into a single, integrated image, with the goal of conveying ideas through imaginative and often thought-provoking visuals. Communicating abstract concepts through visual blends poses a series of conceptual and technical challenges. To address these challenges, we introduce Creative Blends, an AI-assisted design system that leverages metaphors to visually symbolize abstract concepts by blending disparate objects. Our method harnesses commonsense knowledge bases and large language models to align designers’ conceptual intent with expressive concrete objects. Additionally, we employ generative text-to-image techniques to blend visual elements through their overlapping attributes. A user study (N=24) demonstrated that our approach reduces participants’ cognitive load, fosters creativity, and enhances the metaphorical richness of visual blend ideation. We explore the potential of our method to expand visual blends to include multiple object blending and discuss the insights gained from designing with generative AI.

Figure 1: Creative Blends operates through a multi-stage pipeline. The initial stage involves concept inference to identify relevant objects and their attributes. Subsequently, a similarity-based selection process empowers users to choose suitable object and attribute combinations. The system then explores potential blending schemes and synthesizes corresponding prompts for the T2I model, culminating in iterative image generation based on the selected prompts to support the ideation process.

Figure 2: The user interface of Creative Blends showcases an example of generated results for “global warming”. The interface consists of four distinct modules: the expression input module (a), the prompt exploration module (b, c, d), the visual blend exploration module (e), and the similarity visualization module (f).

Figure 3: The scheme generation prompt is structured into five key modules: (a) system setup, (b) task definition, (c) user input, (d) task execution process, and (e) results format demonstration. The (d) task execution process module outlines the methods and rationales for considering potential blending schemes. The results, processed by the (e) results format demonstration module, are returned in a standardized JSON format, ensuring compatibility with downstream processes.

Figure 4: The final prompt is composed of four distinct modules: (1) objects, (2) attributes, (3) schemes, and (4) considerations. The schemes and metaphorical themes (marked in capital letters with a grey background) are essential elements of the prompt and are dynamically generated by GPT in response to the scheme and metaphor generation prompts.

Figure 5: The user study procedure. Participants completed two tasks, one with Creative Blends and one with the baseline. To maintain fairness, the order of the systems and design tasks was counterbalanced.

Figure 6: The baseline interface integrated Google Search and ChatGPT for both text and visual search queries.

Figure 7: Creative Blends generates diverse visual blends representing abstract concepts based on user-provided expressions. Each topic includes eight examples: four highlighting different levels of object similarity and four demonstrating varying attribute similarity. Similarity increases from left to right. The attributes are extended based on the objects enclosed by the double brackets. Colors within the topics serve to identify concepts and their associated objects and attributes.

Figure 8: Sample ideas generated by Creative Blends. These examples are randomly selected from topics used in previous research or commonly encountered in daily life.

Paper

https://dl.acm.org/doi/10.1145/3706598.3713683

Acknowledgement

We thank all anonymous reviewers for their insightful feedback. We especially thank Elad Richardson for his valuable input. This research was supported in part by ICFCRT (W2441020), Guangdong Basic and Applied Basic Research Foundation (2023B1515120026), Shenzhen Science and Technology Program (KQTD20210811090044003, RCJC20200714114435012, 20231122121504001), NSFC (62472288), ISF (3441/21, 3611/21, 2203/24), and Scientific Development Funds from Shenzhen University.

Bibtex

@inproceedings{10.1145/3706598.3713683,

author = {Sun, Zhida and Zhang, Zhenyao and Zhang, Yue and Lu, Min and Lischinski, Dani and Cohen-Or, Daniel and Huang, Hui},

title = {Creative Blends of Visual Concepts},

year = {2025},

isbn = {9798400713941},

publisher = {Association for Computing Machinery},

address = {New York, NY, USA},

url = {https://doi.org/10.1145/3706598.3713683},

doi = {10.1145/3706598.3713683},

abstract = {Visual blends combine elements from two distinct visual concepts into a single, integrated image, with the goal of conveying ideas through imaginative and often thought-provoking visuals. Communicating abstract concepts through visual blends poses a series of conceptual and technical challenges. To address these challenges, we introduce Creative Blends, an AI-assisted design system that leverages metaphors to visually symbolize abstract concepts by blending disparate objects. Our method harnesses commonsense knowledge bases and large language models to align designers’ conceptual intent with expressive concrete objects. Additionally, we employ generative text-to-image techniques to blend visual elements through their overlapping attributes. A user study (N=24) demonstrated that our approach reduces participants’ cognitive load, fosters creativity, and enhances the metaphorical richness of visual blend ideation. We explore the potential of our method to expand visual blends to include multiple object blending and discuss the insights gained from designing with generative AI.},

booktitle = {Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems},

articleno = {542},

numpages = {17},

keywords = {Visual Blends, Metaphor, Text-to-Image Generation, Creativity},

location = {},

series = {CHI '25}

}
