
Stories, storytellers and statistics: A computational approach to the humanities


Tim Tangherlini, a scholar of Danish folklore at UCLA, has used big-data analysis to discover surprising connections among stories and storytellers. His interest in "computational folkloristics" has become a recurring theme in his work.

When Tim Tangherlini, a scholar of Danish folklore at UCLA, was doing graduate work at UC Berkeley, he conceived of a project that focused on the automated analysis of Nordic folklore.

His dissertation relied on statistical methods that made it possible to recognize patterns in these folk stories. His approach, which analyzed 1,500 stories told by 125 storytellers, was revolutionary at the time in that it treated folklore as a collection of related stories rather than as isolated, disconnected ones.

And it was revolutionary in another sense — few humanities scholars in the 1980s were thinking about computational approaches to humanistic material. It wasn’t that Tangherlini’s approach was poorly received, but that it wasn’t received at all.

“In my dissertation, I was really interested in what I called ‘trends’ in tradition, trying to figure out which topics were trending among which segments of the population,” Tangherlini said, adding with a smile, “I guess I should have copyrighted the term.” But at the time, he said, “I was more concerned about ever finding a job using statistics to discover patterns in large, poorly understood corpora of everyday interactions and stories. How things change!”

A webpage from Tangherlini's Danish Folklore Data Nexus

But he never lost sight of the idea that big-data analytics could be a valuable tool. He put the project aside while exploring other aspects of the material to gain a better understanding of its depth.

Fifteen years later, computing tools had finally caught up to the demands of the project. So in the early 2000s, he began leveraging statistical and algorithmic methods to study a large quantity of cultural texts.

Tangherlini’s continued research in this arena has enabled a large collection of folklore to be seamlessly organized into a database and categorized by topic, storyteller and location.  

He calls this new approach “computational folkloristics,” and it has been a recurring theme in his work ever since.

A childhood interest

Tangherlini’s fascination with stories and storytelling was inspired by his grandmother’s sister, a raconteur who often entertained him with the Danish fairy tales she knew from her childhood in late-19th-century Denmark.

In high school and his early college years, he thought about majoring in geography and becoming a cartographer. Much to his dismay, however, Harvard University, where he enrolled as an undergraduate, did not have a geography department, and so he chose to focus on computer science/applied math.

His aspiration to become a mathematician didn’t last long. “I realized very quickly that applied math was really hard,” he said, chuckling. Instead, he discovered he enjoyed a very different subject altogether. “I found some solace in classes in folklore. I saw it as the history and literature of the common man.”

Since graduating from Harvard, Tangherlini has published on folklore, literature, film and critical geography. His main areas of interest include folk narrative, legend and popular culture.

His interest in geography spans two vastly different areas: the Nordic region and Korea. Since his mother is a Dane, Tangherlini grew up speaking Danish and pursued this interest in graduate school where he delved into medieval studies. After his second year at UC Berkeley, Tangherlini’s inquisitive mind prompted him to apply to a program that sent him to Korea — a program in which one of the main qualifying criteria was, oddly enough, that the applicant know nothing about Asia. While in Korea, Tangherlini studied shamanism and the political aspects of Korean folk culture at the National Folklore Museum.

An interdisciplinary perspective

Today, as a humanist working on interdisciplinary projects at UCLA, Tangherlini exchanges approaches, data and problems with computer scientists and mathematicians on campus. The folklore scholar said he appreciates the support he’s received from the university as a whole and from the Office of the Vice Chancellor for Research, the Institute for Digital Research and Education, and the Institute for Pure and Applied Mathematics in particular.

“The whole research infrastructure has been pushing this type of ‘Let’s take advantage of the power that is in the intellectual possibilities here on campus,’” he said.

Once he started thinking from an interdisciplinary perspective, he was able to solve problems across various disciplines and find answers he otherwise could not have found.

The accomplishment he is proudest of, “Danish Folktales, Legends, and Other Stories” (University of Washington Press, 2013), is a vast collection of Danish folklore accompanied by a data portal, the Danish Folklore Nexus, which gives users access to the richest collection of Danish folklore available in English thus far.

In recent years, enhancing this software has been his main focus. Tapping into its power, individuals can see all the places associated with a particular story, enhanced by historical maps and aerial views. Stories are available both in English and in the original Danish, with annotations, bibliographic references and even images of the actual manuscripts.

Tangherlini says one of his happiest moments was when a student walked into class bleary-eyed, saying, “I hate you. I opened your website, and all I’ve been doing is reading stories.” This is exactly what the scholar wants — for users to get lost in the collection and make new discoveries just as he did.

Computational methods of analyzing folklore always raise new questions. “It’s a publication-generating machine,” he said, laughing.

Surprising discoveries

The robust software has allowed Tangherlini to make many important connections. As he began to look at places associated with certain types of stories, statistically significant evidence pointed to a “hotspot” in Denmark where stories about witches had echoed for centuries.

He found that most of the stories failed to mention that a Catholic monastery, later associated with witchcraft, was located where the last witch-burning took place. That explained why these kinds of stories were highly concentrated in that area, he said.

The software he has developed, which he conceptualizes as the “Folklore Macroscope,” has also allowed him to explore the storytellers’ engagement with their local environment and to discover significant patterns. For example, his findings showed that the places male storytellers mentioned in their stories tended to define an axis between the market towns where they sold goods, while the places women mentioned tended to stay close to the farm and neighborhood, reflecting the gender roles in place at the time.

Modern storytelling on the Web

Today, Tangherlini is studying storytelling in the Internet age. He is looking at how contemporary stories circulate online and in the media and how those stories may influence people’s decision-making. Working alongside an electrical engineering professor and a professor from the Fielding School of Public Health at UCLA, Tangherlini is analyzing several forums where mothers are engaged in a debate about vaccinations. By aggregating their stories and opinions on vaccinations using computational methods, Tangherlini hopes to discover how differences of opinion circulate and influence conversations and behaviors.

“I want to see if we can get this to scale and address issues that are of great concern to different communities in the 21st century,” he said.

To see the original story, go to the Institute for Digital Research and Education website.
