Art directing the machine: How can brands and creatives use Gen AI to curate better imagery?
Ben Clark
Lead Designer

Until relatively recently, if you said ‘artificial intelligence’ to most people, they would probably think of something related to the plot of a futuristic, possibly dystopian, blockbuster film – or maybe a really smart chess computer.

That’s all changed in the last couple of years, with companies such as OpenAI and Stability AI releasing tools that have opened up the world of AI to the masses. Most attention has been drawn to text-prompt models, where the user simply types in – in everyday language – what they want to see. The resulting output is, in simple terms, either text-based or – more intriguingly – image-based.

The best-known ‘text-to-image’ models – DALL-E, Midjourney and Stable Diffusion – offer users the opportunity to generate never-seen-before imagery, and can respond to a vast array of input prompts, including subject, style, colour, artistic movement, filters and production techniques.

While the results of the first publicly available text-to-image models – which launched in 2021 – were often a bit odd and otherworldly, and more of a curiosity than a useful tool, the technology has since improved significantly and expanded rapidly. Alongside tools that can generate entirely new images, there are now tools that can expand or remove backgrounds, retouch or remove elements, relight compositions, and even entirely repose and reposition elements.

This expansion of tools means AI is now a serious consideration for all creative industries. No longer a gimmick or just a fun thing to play with, it is something both creatives and the brands they work with must work out how to navigate and harness for their own needs – or risk getting left behind.

Almost all creative industries could be – or are going to be – affected by the continuing rise of AI in some way or another. To explore the considerations involved in working with AI tools, it makes sense to focus on one creative industry in particular – in this case, the not entirely sexy world of stock imagery.

Stock imagery has its roots in journalism and dates back to the 1920s, but the industry has grown rapidly in the last 20 years with the rise of lower-cost ‘microstock’ and ‘midstock’ sites such as Shutterstock, iStock and Getty Images. These sites provide access to hundreds of millions of images on a massive array of subjects, giving creatives and brands the opportunity to feature high-quality imagery in design and marketing materials, without the costs and time required to commission a photoshoot.

Although this ease of access is great, working with stock imagery has always created challenges, particularly around ownability, distinctiveness and specificity. To ensure their images actually sell, stock photographers tend to work to broad concepts, making their images suitable for lots of different use cases.

Whilst you’re definitely going to be able to find – to pick a random example – a well-shot image of a middle-aged couple in a kitchen cooking dinner, you’re probably going to be stumped if you want to make sure they are definitely in a kitchen with green cabinets, cooking spaghetti alle vongole. Even if you could, by chance, find exactly the sort of image you had in mind, there is often little you can do to ensure someone else isn’t already using that image – or won’t in the future.

AI models can – and almost definitely will – change this. Text-to-image prompts will allow brands and creatives to generate images with the exact specifics of what they need – including setting, people and activity. And given AI imagery is always generated entirely from scratch, rather than pulling from a library of assets, the result is always going to be unique.

The technology isn’t in a place where it’s easily usable just yet – faces and hands are still a big challenge for most AI models, and images are often limited in resolution – but given the rapid improvement of the technology it would seem unwise to bet against it. The stock sites would seem to agree – Shutterstock recently introduced an ‘AI Generated’ option in its main search bar.

AI tools could also help creatives and brands create images that more accurately represent and target specific demographics. While some general AI image models do struggle to depict faces accurately, there are some – like this one used by the NYTimes – which have been trained exclusively on faces. This means they can generate entirely new, never-seen-before human faces – and can match an exact specification of ethnicity, age, eye colour and hair colour. Imagine a world where you could not just tailor images to exactly the demographic you want to sell to, but generate variations for numerous different demographics and combine them with customer segmentation tools to create hyper-targeted marketing materials.

All this opportunity does of course come with its pitfalls and challenges. Our lovely clients aren’t to know this, but creatives do occasionally like to play a little game called ‘compare the bizarre image request’. From ‘an elephant using a mobile phone’, through ‘employees literally driving an office’ to ‘skydiving pensioners reading a book’ – we’ve heard it all. These requests are always redirected in a constructive way: as creatives we will always try to help you figure out what you are trying to communicate, and then curate or create a concept that best answers that. It isn’t that the requests are bizarre because the images don’t exist – the images don’t exist because they would be a bizarre thing to look at.

The danger is that if AI tools allow you to create literally anything you can think of, how do creatives and brands ensure that curation and a good understanding of design and communication principles remain? This is why it is important for creatives to get ahead of AI tools. This isn’t just about making sure we still have jobs (which is a different discussion) or gatekeeping creative tools; it is about ensuring we know how to guide our clients to good results using AI.

There is also brand consistency to consider. Stock imagery can already create issues here. If – as a brand – you can access hundreds of thousands of images related to your industry or your customer base, it can sometimes be challenging to effectively curate a set of brand imagery that feels consistent, harmonious and in-line with your brand positioning and design direction.

AI will only exacerbate this challenge. Though images can be more specific, that specificity requires curated input, so every image generated is like a mini photo-brief. Brands need to be even clearer on what an on-brand image looks and feels like, and creatives need to be ready to guide clients on how to art-direct AI. There is also the issue of volume. If you can generate a brand-new image for every single social post, campaign, ad, web page or other piece of marketing material, it will be very appealing to do just that. Everyone loves novelty, but this approach could easily lead to brands having thousands of images in use, increasing the challenges around consistency and harmony exponentially.
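To make the ‘mini photo-brief’ idea concrete, here is a minimal sketch of how a brand might keep AI imagery on-brand: fix the art-direction rules once, and combine them with each one-off brief when building a prompt. All of the field names and brand values below are illustrative assumptions, not a real tool or API.

```python
# Hypothetical sketch: each AI image request treated as a "mini photo-brief"
# that inherits fixed brand art-direction. Values are invented for illustration.

BRAND_STYLE = {
    "lighting": "soft natural daylight",
    "palette": "muted greens and warm neutrals",
    "mood": "relaxed, candid, unposed",
}

def build_prompt(subject: str, setting: str, activity: str, style: dict = BRAND_STYLE) -> str:
    """Combine a one-off brief (subject, setting, activity) with the fixed
    brand style, so every generated image starts from the same rules."""
    brief = f"{subject} in {setting}, {activity}"
    direction = ", ".join(f"{key}: {value}" for key, value in style.items())
    return f"{brief}. Art direction -- {direction}."

prompt = build_prompt(
    subject="a middle-aged couple",
    setting="a kitchen with green cabinets",
    activity="cooking spaghetti alle vongole",
)
print(prompt)
```

The point of the design is that the campaign-specific part of the brief varies freely, while the brand’s look and feel is locked in one place – which is exactly the kind of clarity brands will need before generating imagery at volume.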

Of course, all challenges can also lead to opportunities – but until we start experiencing these challenges on a day-to-day basis it’s hard to predict what AI best practices will look like.

What is certainly clear is that no one has the answers yet. Despite how fun and exciting the current crop of tools are, AI is very much still in its infancy. Having said that, AI has already proven its potential for creative applications and has shown just how disruptive it is likely to be. That’s why we at Collective are embracing AI tools and looking for ways to integrate them into our workflow. We will also be attending the first-ever Creative Generative conference in London this September to explore this exciting topic with others.