The most notable aspect of the film Sunspring involves its creation: an artificial-intelligence bot wrote the screenplay. Naturally, this raises questions about the capabilities of machine learning in artistry. But it quickly becomes clear that the dialogue often sounds like a random series of strange sentences. Can machines really replace writers, or merely assist them? For now, and likely for some time to come, human storytellers will remain the brains behind a screenplay, crafting clever plot twists and breakthrough dialogue. AI would enhance their work by providing insights that increase a story’s emotional pull—for instance, identifying a musical score or visual image that helps engender feelings of hope. This breakthrough technology would supercharge storytellers, helping them thrive in a world of seemingly infinite audience demand.
The Massachusetts Institute of Technology (MIT) Media Lab recently investigated the potential for such machine–human collaboration in video storytelling. Was it possible that machines could identify common emotional arcs in video stories—the typical swings of fortune that have characters struggling through difficult times, triumphing over hardship, falling from grace, or declaring victory over evil? If so, could storytellers use this information to predict how audiences might respond?
Before getting into the research, let’s talk about emotional arcs. Great storytellers—from Sendak to Spielberg to Proust to Pixar—are skilled at exciting our emotions. With an instinctive read of our pulse, they tune their story to provoke joy, sadness, and anger at crucial moments. But even the best storytellers can deliver uneven results. What accounts for this variability? We theorize that a story’s emotional arc largely explains why some movies earn accolades and others fall flat. The idea of emotional arcs isn’t new. Every storytelling master is familiar with them, and some have tried to identify the most common patterns. The most popular arc follows the pattern found in Cinderella. As the story begins, the main character is in a desperate situation. That’s followed by a sudden improvement in fortune—in Cinderella’s case provided by a fairy godmother—before further troubles ensue. No matter what happens, Cinderella-type stories end on a triumphant note, with the hero or heroine living happily ever after.
There’s evidence that a story’s emotional arc can influence audience engagement—how much people comment on a video on social media, for example, or praise it to their friends. In a University of Pennsylvania study, researchers reviewed New York Times articles to see if particular types were more likely to make the publication’s most emailed list. They found that readers most commonly shared stories that elicited a strong emotional response, especially those that encouraged positive feelings. It’s logical to think that moviegoers might respond the same way.
Some researchers have already used machine learning to identify emotional arcs in stories. One method, developed at the University of Vermont, involved having computers scan text—video scripts or book content—to construct arcs.
These models divide a video into short slices and consider all aspects of each one—not just the plot, characters, and dialogue but also more subtle touches, like a close-up of a person’s face or a snippet of music that plays during a car-chase scene. When the content of each slice is considered in total, the story’s emotional arc emerges.
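The slice-by-slice idea can be sketched in a few lines. This is a simplified illustration, not the MIT team's actual pipeline: it assumes a model has already scored each slice's valence (a float in [-1, 1]), and it smooths those raw scores with a moving average so that sustained emotional highs and lows stand out from slice-to-slice noise.

```python
def emotional_arc(slice_scores, window=5):
    """Smooth raw per-slice valence scores into an emotional arc.

    slice_scores: per-slice valence in [-1, 1], as a hypothetical
    vision/audio model might produce. The moving average reveals the
    story's sustained swings rather than momentary noise.
    """
    arc = []
    n = len(slice_scores)
    for i in range(n):
        lo = max(0, i - window // 2)
        hi = min(n, i + window // 2 + 1)
        arc.append(sum(slice_scores[lo:hi]) / (hi - lo))
    return arc

# Toy scores with a Cinderella-like shape:
# desperate start, sudden rise, renewed trouble, triumphant end.
raw = [-0.8, -0.6, -0.7, 0.2, 0.5, 0.6, -0.3, -0.4, 0.7, 0.9]
arc = emotional_arc(raw, window=3)
# The smoothed arc still starts low and ends high.
```

Real systems would derive `slice_scores` from trained models over frames and audio; the smoothing step is the part that turns point estimates into an arc.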
You can see the high and low points of the montage in the graph. The x-axis is time, measured in minutes, and the y-axis is visual valence, or the extent to which images elicit positive or negative emotions at that particular time, as scored by the machine. The higher the score, the more positive the emotion. As with all our analyses, we also created similar graphs for a machine’s responses to audio and to the video as a whole. Here and elsewhere, we concentrate on the visual graphs, since visual data drove our later analyses of emotional engagement.
MIT’s machine-learning models have already reviewed thousands of videos and constructed emotional arcs for each one. To measure their accuracy, we asked volunteers to annotate movie clips with various emotional labels. What’s more, the volunteers had to identify which video element—such as dialogue, music, or images—triggered their response. Most stories could be classified into a relatively small number of groups, just as Vonnegut and other storytellers suspected. Exhibit 2 shows that the arcs that emerge with the videos in the Vimeo data set are clustered into five families.
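The grouping of arcs into a handful of families is, at heart, a clustering problem. As a hedged sketch only (the study's actual features and algorithm are not described here), the snippet below runs a minimal k-means over arcs that have been resampled to a common length, so that each cluster centroid is the "average arc" of its family.

```python
def cluster_arcs(arcs, k=2, iters=20):
    """Group fixed-length emotional arcs into k families by shape.

    Plain k-means with squared Euclidean distance. Naive init (first k
    arcs serve as starting centroids) keeps the sketch deterministic;
    production code would use smarter seeding such as k-means++.
    """
    centroids = [list(a) for a in arcs[:k]]
    labels = [0] * len(arcs)
    for _ in range(iters):
        # Assign each arc to its nearest centroid.
        for i, arc in enumerate(arcs):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(arc, centroids[c])),
            )
        # Recompute each centroid as the mean arc of its family.
        for c in range(k):
            members = [arcs[i] for i in range(len(arcs)) if labels[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data: two rising arcs and two falling arcs
arcs = [[0.0, 0.5, 1.0], [0.1, 0.6, 0.9],
        [1.0, 0.5, 0.0], [0.9, 0.4, 0.1]]
labels, centroids = cluster_arcs(arcs, k=2)
# Rising arcs share one label; falling arcs share the other.
```

With richer data and k=5, this kind of procedure is what would yield the five arc families the study reports.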
Seeing how stories take shape is interesting, but it’s more important to understand how we can use these findings. Does a story’s arc, or the family of arcs to which it belongs, determine how audiences will respond to a video? Do stories with certain arcs predictably stimulate greater engagement? Many questions remain to be answered in this quest for the perfect story.
(Adapted from the article ‘AI and storytelling: Machines as co-creators’ by Jonathan Dunn, Geoffrey Sands, Eric Chu, Deb Roy, and Russel Stevens for McKinsey)