How to Read a Generative AI Image System: Diffusion models as a techno-social entanglement

Format: Papers, RSD12, RSD12 Papers Pre-release, Topic: Cases & Practice, Topic: Mapping & Modelling

Eryk Salvaggio

Generative media synthesis tools have quickly captured public attention through products such as ChatGPT for text, or DALL-E 2, Midjourney and Stable Diffusion for images. However, the design of these systems is typically obscured through interfaces (buttons labelled “imagine” or “dream”) and through the misleading label of “intelligence,” with commentators likening these systems’ behaviours to human creativity and ingenuity. Until now, explanations of how these systems function have largely been confined to machine learning white papers, which narrowly address internal technical processes. As policymakers, educators and the public grapple with these black boxes, this paper offers a systems-level analysis to clarify the entanglement of these technical systems within a wider context of data collection practices, generative models, user interfaces, generated and source images, and the media and cultural spheres in which they circulate. Ecological impacts and human labour concerns are also acknowledged. The paper maps out a systemic analysis of generative AI using a particular AI image generation system, Stable Diffusion, intended as a model and a means of providing a common language for discussing and addressing these entanglements. Revealing the structures and relationships between the “systems within AI systems” is a means to engage more thoughtfully with ethical controversies and techno-social possibilities.


KEYWORDS: Artificial Intelligence, Generative AI, Diffusion Models, AI Ethics

RSD TOPIC(S): Cases & Practice, Mapping & Modelling, Society & Culture, Sociotechnical Systems




