In the context of the artistic research titled “Typologies of Delusion”, we recently produced two artworks and a series of prototypes presented at the Netherlands Institute of Sound and Vision during their RemixFest, a day dedicated to reusing and actualising media archives.
The artworks consist of two large (65-inch) touchscreens showcasing an extensive collection of AI-generated images, each made with Stable Diffusion XL as a base model and a series of five different LoRAs as refiners. These LoRAs were trained with imagery from different decades of the Dutch newsreel Polygoon Journaal, a well-known collection in the Netherlands, preserved by the Institute.
Hypothesis
The idea behind the artworks, titled “Typology 01: Faces” and “Typology 02: Film Stills”, was to study the potential of generative AI to interrogate historical archives, inviting viewers to discover less explored narratives resting in what I’d like to describe as the “archive’s unconscious”.
While it is debatable whether machines have imagination, a meaningful byproduct of generative AI is that it sparks curiosity in the human mind. The mysterious doings between the inputs and outputs of a neural network are hard to grasp for those of us who spend most of our lives in a three-dimensional world, instead of a world of statistics and mathematical abstractions. What’s more, these doings are challenging to illustrate in non-mathematical language.
And yet, if we think of “AI” as a creative tool, this section of the pipeline, where the statistical “magic” happens, right between inputs and outputs, prompts and images, is precisely what makes the technology one of a kind. That’s why, despite its ineffability, exploring ways to visualise this space is worth the effort.
Typologies of Delusion 01: Faces
The first work of this series features a collection of portraits created using pre-trained AI models informed by Polygoon Journaal. We employed the simplest form of prompt possible to generate images of distinctly diverse faces expressing various emotions.
The design of the prompt we used is based on a three-factor matrix, which corresponds to the three axes along which the visual typology is arranged. In this setting, “x” stands for human roles, “y” for emotions and “z” for decades.
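The three-factor matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the role, emotion and decade values, the prompt wording and the LoRA names below are hypothetical placeholders, not the lists used for the artwork, and pairing each decade with one LoRA is our reading of the setup rather than a documented detail.

```python
from itertools import product

# Hypothetical axis values; the real lists used for the artwork are not given here.
ROLES = ["policeman", "farmer", "nurse"]                   # x axis: human roles
EMOTIONS = ["happy", "sad", "angry"]                       # y axis: emotions
DECADES = ["1920s", "1930s", "1940s", "1950s", "1960s"]    # z axis: decades


def build_prompt_matrix(roles, emotions, decades):
    """Return one entry per (x, y, z) cell of the typology grid."""
    matrix = []
    for (x, role), (y, emotion), (z, decade) in product(
        enumerate(roles), enumerate(emotions), enumerate(decades)
    ):
        matrix.append(
            {
                "position": (x, y, z),
                # Deliberately minimal prompt (hypothetical wording).
                "prompt": f"portrait of a {emotion} {role}",
                # Assumption: one LoRA per decade; the name is a placeholder.
                "lora": f"polygoon_{decade}",
            }
        )
    return matrix


matrix = build_prompt_matrix(ROLES, EMOTIONS, DECADES)
print(len(matrix))   # number of cells in the grid
print(matrix[0])     # first cell: position (0, 0, 0)
```

Each entry keeps its grid position next to the prompt, so an interface could place the resulting image on the three axes without any extra lookup.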
These portraits put to test the versatility of the models we trained, while also inviting the viewer to explore the aesthetics of the Polygoon Journaal through the lens of artificial intelligence.
Click here to access a web adaptation of “Typologies of Delusion 01: Faces”. Please note that this interface has not been fully optimized for web browsers, so the user experience might be a bit clumsy. If you are on a laptop, hold down the mouse button to drag the images and navigate through the collection. You may use the “z” and “x” keys to zoom.
Imperfections as assets
If judged from a photorealist perception of reality, the images in this typology show many flaws or “mistakes”, but these were left in deliberately. It was an artistic choice to keep the prompts as simple as possible in order to let the models reveal their aesthetic choices by imagining unprovided details and, in doing so, shed some light on the imagery created from the reuse of the Polygoon Journaal.
The experimental nature of this artwork is focused on uncovering the subtle nuances that expose the synthetic origin of these images. We view them as an opportunity to reflect on the underlying biases and intricacies embedded within the models we employed.
Furthermore, this typology is a way to speculate on the potential applications of image generation as an artistic technique, which could benefit academic disciplines seeking a deeper grasp of visual archives. Examples that come to mind are the emerging field of deformative criticism in cultural studies, visual history (through the visual expansion of archives) and archive preservation.
Typologies 02: Film Stills
The second work of this series, titled “Typologies 02: Film Stills”, follows the same format as its predecessor. Just like before, a collection of images is presented side by side in a visual typology. But this time the exploration delves even further into the realms of AI simulation, pushing the limits of a visible connection between the generated images and their main reference, the Polygoon imagery.
The pictures in Typologies 02 simulate film stills from imaginary movies generated by an AI workflow that keeps at its core the models trained with Polygoon Journaal, but adds more complexity to the design of its prompts. Unlike in Typologies 01, where prompts were kept in their simplest form, Typologies 02 adds an optimisation module that calls OpenAI’s GPT-3.5 model to reformulate the prompt, making it more likely that Stable Diffusion (our image generation software) will produce cinematic images.
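The optimisation step can be pictured as a small function that passes each initial prompt through a language model before it reaches the image generator. The sketch below is an assumption-laden illustration, not the module used for the artwork: the instruction text, the function names and the example scene are hypothetical, and the language model is replaced by a stub so that the code runs without any external service.

```python
from typing import Callable

# Hypothetical instruction text; the wording used in the real workflow is not given.
INSTRUCTION = (
    "Rewrite the following scene description as a prompt for an image "
    "generator so that the result looks like a still from a film: "
)


def optimise_prompt(scene: str, rewrite: Callable[[str], str]) -> str:
    """Pass a scene through a language model and return the reformulated prompt.

    `rewrite` is any callable that takes the full request text and returns
    the model's answer. If the answer is empty, the original scene is kept.
    """
    candidate = rewrite(INSTRUCTION + scene).strip()
    return candidate or scene


def stub_rewrite(request: str) -> str:
    """Stand-in for the language model: it only appends fixed keywords."""
    scene = request[len(INSTRUCTION):]
    return f"{scene}, film still, cinematic lighting, 35mm"


# Hypothetical example scene.
print(optimise_prompt("a detective arrives at a crime scene", stub_rewrite))
```

Keeping the model call behind a plain callable means the stub can be swapped for a real chat-completion request without touching the rest of the pipeline.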
The vision behind Typologies 02 was to explore the potential of generative AI to imagine dramatic scenes that could be used as a starting point for fictional stories informed by a historical archive of images, in this case, the Polygoon Journaal. For that reason, the initial prompts introduced in the matrix consist of “film clichés”, or dramatic situations that are commonly used in narrative films.
Click here to access a web version of “Typologies of Delusion 02: Film Stills” (please note that this interface has not been fully optimized for web browsers; if you are on a laptop, hold down the mouse button to drag the images and navigate through the collection. You may use the “z” and “x” keys to zoom).
Compared to Typologies 01, this experiment is less insightful about the weights and biases of the Polygoon collection, and more instrumental for the creative process of brainstorming. Storytellers working in fields such as archive presentation, non-fiction and historical fiction could benefit from further developments of tools like this.
Unveiling training flaws
It is important to highlight that the models trained with Polygoon Journaal’s imagery are LoRAs that act as refiners of the main model in use, also known as the checkpoint model, which in this case is SDXL 0.9. This may explain a certain lack of stylistic coherence between the Polygoon imagery and some of the images presented in the visual typologies.
In Typologies 01, a good example is the appearance of policemen wearing American uniforms, instead of Dutch uniforms, as would be expected from an AI model trained with a Dutch newsreel. The most plausible explanation for this mishap is that the training sets of the refined models lacked images of policemen, which forced Stable Diffusion to use its main model (SDXL 0.9) as its only reference.
In Typologies 02 it is possible to see many images where the presence of the main model overrules the refined models. Once again, this can be explained by the models’ rigidity. To make the Polygoon style more present in situations where the prompt is more elaborate than just a few words, more training would be needed.
Some Final Thoughts
The two visual typologies presented at Sound and Vision’s Remix Festival, part of the project “Typologies of Delusion”, are a valuable exploration into the intersection of generative AI and the reuse of historical archives. By experimenting with AI models trained on the Polygoon Journaal (using Stable Diffusion), this artistic research challenges our perception of visual storytelling and opens doors to new possibilities of using generative AI as a creative tool.
There is no question that these experiments are just a taster of the potential of this AI-powered workflow, as either a research or a creative tool. Our models were trained with fewer than 100 images each, and we think that a greater variety of images and better tagging can make the models much more flexible, which may lead to much better results when it comes to reimagining an archive or using it for visual brainstorming.
Moreover, the deliberate inclusion of flaws in generated images in Typologies 01, and the lack of clear resemblance between Polygoon images and generated images in Typologies 02, invite viewers to contemplate the biases and choices inherent within AI systems. Studying such AI aesthetics through artistic production is an invitation to professionals and the general public to think critically about the ethical implications of AI technologies and the importance of understanding the mechanisms that shape their outputs.
All things considered, we firmly believe that in this time of increasing use of generative AI for media creation, these two visual typologies can provide us with a valuable glimpse into a creative process whereby historical archives serve as catalysts, igniting human imagination and inspiring captivating narratives. We hope that the experimental nature of this project serves as a landmark for further investigations into the ethical and creative use of emerging technologies and the ever-evolving relationship between software and human creativity.