Where we’ve come from
I created Humanitas et Machina earlier this year with the objective of exploring how AI might help us form hopeful visions of a future in which we solve some of the biggest challenges humanity faces in the 21st Century. I also wanted to learn about this rapidly evolving technology, its boundaries and its biases, in a broadly controlled environment, as explained on the About page.
The first series consisted of seven stories, including the introductory story, Beginnings, which set the scene for the project. The series covered six significant threats facing humanity: Nuclear War, Mental Health, Climate Change, Pandemic, Asteroid Collision and AI. I selected these somewhat arbitrarily from an even bigger list and intend to explore other topics in a future series.
The generative AI tools used were ChatGPT-4 and Claude for writing and Midjourney to create the supporting images.
Going into this human-machine collaboration, I didn’t know how much of the work the AI would do and how much I would do myself. It turned out that I needed to invest a lot more time than I anticipated, and in the end it was a true collaboration. The approximate balance between me and the AI in the writing process was as follows.
Story concept: 20% AI vs 80% human
Copywriting: 80% AI vs 20% human
Editing: 5% AI vs 95% human
I’ll now share some more detailed insights into the experience of using these tools.
Write me a story about…
I initially kept the prompts asking the AI tools to write these stories as simple as possible, in order to see what they could do with minimal guidance. My first impressions were very positive; I was surprised by how well my briefs were interpreted and found the writing to be broadly engaging, interesting and emotive, even echoing the style of Émile Zola as I had requested.
Having said that, the early stories were not good enough for publication. At first glance they appeared to be well written, but closer inspection revealed a number of issues that needed to be resolved.
The biggest issue was that the stories tended to be somewhat formulaic in terms of plotlines, language and characters. This might not have been obvious if I had only generated one story, but the process of generating a series of stories, including some that didn't make it to publication, revealed patterns that become clear when you put them all together as a collection. I attempted to change the framing of the briefs for different stories in order to spark more variation, but some of the same patterns still appeared again and again. Most bizarrely, I found that Claude seemed to generate very similar plotlines to ChatGPT, even when given relatively open briefs and encouraged to be creative.
It appears that these tools have a way of modelling the formulas of certain types of story and then delivering variations of these models rather than being truly creative. This was most obvious in the naming of stories and characters, where very similar names were used again and again, eventually leading me to change many of them manually. For example, Elijah, Eliza, Elisabetta, and Elias were included as character names in the first few stories.
Another side of this seemingly formulaic approach to generating responses was that both ChatGPT and Claude don't always follow instructions. They give you roughly what you've asked for, but only if convenient. More subtle or complex requirements of a brief would often be overlooked, and any requirement regarding the length of a story in terms of word count would be ignored completely. This ties in with some of my other experiences using these tools, in which I found their ability to understand numbers to be very limited. Similarly, I found that Claude was quite good at critiquing work, but when asked to rewrite the same text to incorporate its own feedback, it would write completely new text that was often worse than the original. It's clear to me that these tools don't have any true understanding of what they are saying and are in fact exceptionally good blagging machines.
In terms of writing style, I found ChatGPT to be the more elegant and capable of the two at creative writing tasks, even if it often tried too hard to use poetic language that in many cases I had to dial down manually. Whether I dialled it down enough is for you to decide. As mentioned, I had specified a writing style reminiscent of Émile Zola, the 19th-century French novelist whose short stories I am fond of. Although neither Claude nor ChatGPT wrote exactly like Zola, I found that ChatGPT did a good job of creating a similar style that felt natural, whereas Claude’s attempts felt awkward and forced. This quality of writing style was the main reason I used ChatGPT as the primary tool.
One of my concerns with generative AI tools is bias, and I did find some bias in the stories to do with character stereotypes. This led me to manually include some basic details about characters, such as ethnicity, gender and age, in the story briefs in an attempt to make the character profiles more diverse and balanced. Unexpectedly, this was not the most obvious area of bias. That turned out to be geopolitical bias, with several stories initially including overt East versus West narratives in which Russia and China were presented as the “axis of evil” to be overcome by the benevolent saints of the West, led by the United States. I initially tried highlighting this and asking ChatGPT to make adjustments, but this was somewhat unsuccessful, so instead I became more prescriptive in the briefs themselves, providing some outline geographical context for each story.
Now we come to the biggest challenge for me, and the biggest disappointment. A big part of the motivation for creating Humanitas et Machina was to create stories of hope that inspire us and shift our focus towards positive visions of the future, in contrast to the constant flow of doom and business as usual that we’re fed through the media. I hoped that the AI might actually propose some solutions for us to aim for that we have until now overlooked.
However, the solutions proposed in early versions of the stories were sadly empty and simplistic. Each story offered a very watery resolution in which the main character did something such as creating a piece of art or giving an inspiring speech, and then the world's problems were magically solved and everyone lived happily ever after.
The AI tools did seem to have a pretty good understanding of the issues raised in each story, such as pandemics, asteroids, climate change and, of course, AI, but they had little to offer in the way of meaningful solutions. Again, I think this reflects the fact that the current tools are not truly intelligent or creative and are simply very good at pretending.
The result was that I had to give a lot of direction in the writing process to form stories that felt complete, coherent and inspiring. This was actually quite interesting for me, as it forced me to think more deeply about these topics, but it also meant the project fell short of its original aim of having the AI propose the solutions. I think in some ways that is a good thing though. After all, one of the risks of AI is that we become overly reliant on it and lose our ability to think for ourselves. So in some ways, these current limitations are as much a blessing as a curse.
Paint me a picture of…
While this project is primarily about the writing, I thought it would also be a good opportunity to try AI imagery and add some visual interest to the stories. Having tested DALL-E 2, Stable Diffusion and Midjourney earlier in the year, I found the image quality in Midjourney to be head and shoulders above that of the others, and so I used it for all the images in this project.
Although Midjourney proved to be a highly impressive tool, it was apparent that it doesn't actually understand the commands given to it, nor does it understand the images it is creating. It seems to be blindly running keywords through an algorithm and hoping for the best, sometimes getting it on point and sometimes missing the mark by some margin. As a result, the process sometimes felt highly uncontrollable. You might not notice this if you're just trying to generate some general images for fun, but when trying to meet a specific brief to create a series of coherent images that feel like they sit together as a family, it becomes very obvious.
I completely gave up on some image briefs, such as trying to create an image of the Schlossberg clock tower in Graz for Echoes of the Heart, which it was unable to do even when I gave it a photograph to work from (as shown below). Similarly, when generating an image of a woman looking up at an asteroid in the sky for Celestial Rebirth, it would create images of a woman with crystal balls floating around her (as shown above). Very strange indeed!
To sum up
I've enjoyed this project so far and found these generative AI tools to be both impressive and full of limitations. It’s been a good learning exercise, and I hope that the stories themselves have been enjoyable to read and provided some sparks of inspiration, even if they haven’t delivered the solutions to the world’s problems.
My experience so far has highlighted that these tools currently need human collaboration for this type of application and that, despite the hype, they're not yet truly intelligent or creative. I'm curious to see how generative AI tools evolve in 2024 and, time permitting, my intention is to produce a second series when new tools are available, so that we can understand the path of evolution and hopefully generate some more powerful stories of hope along the way. Stay subscribed for updates!