Dungeons & Dragons: Leading the Conversation through Intents and Theory-of-Mind

Researchers dive into the study of teacher-student natural language interactions to achieve a shared goal in the fantasy world game, Dungeons & Dragons.

by Brian Alvarado

July 14, 2023

Credit: The Wild Beyond the Witchlight, Dungeons & Dragons

Dungeons & Dragons (D&D) is a role-playing game set in a medieval fantasy world where players come together to accomplish specific tasks. Storytelling and imagination are central to the game, allowing players to go on adventures and complete different goals.

Playing D&D goes beyond creativity and strategic thinking. It has given players the opportunity to have a sense of belonging in a community. An article by Underdog Games brings up the point: “The benefits of playing Dungeons and Dragons go deeper than just fun. D&D provides an opportunity for players to develop social-emotional skills, build confidence, and learn to express themselves.”

A group of researchers at USC’s Information Sciences Institute (ISI), led by Jay Pujara, Pei Zhou, and Xiang Ren, is working to improve AI’s human-like skills in the game.

“One interesting challenge is the idea that an AI agent has the capacity to play a game with you, even if many human skills are involved such as common-sense knowledge and how the world works,” said Pujara, research lead.

One of the biggest factors in this research is knowledge, and how much of it can be built into these AI agents to make them as useful as possible as a Dungeon Master.

“Fusing common sense knowledge and knowledge on how the world works are some of the goals that we have for this work. We used numerous papers on dialogue agents as a guide to better help us understand the intents and conversations between these agents.”

The Dungeon Master

D&D comprises a Dungeon Master (DM) and a unique set of characters that each have a specific role within the game. The DM is the main narrator of the game, responsible for presenting the characters with unique situations through dialogue that must be interpreted. The characters, in turn, have the objective of interpreting the dialogue and working together to complete the quests and survive.

The researchers wanted to dig deeper into how DMs and players respond to and interact with one another, which sparked the central research question: Does incorporating intent and theory of mind (ToM) make computational models better communicators?

“We proposed this task (i.e., teacher-student interaction) where the scenario is goal-driven. We used Dungeons & Dragons as an environment to ground this task because we wanted to investigate whether we can train this teacher model that will guide students using their minds,” explained Zhou, a research assistant at ISI.

“Because when we communicate, the real goal is that each person should ideally converge on the same set of ideas. In D&D, we want a consistent world between us through communication, and to do that successfully we need to have a model of what is in your head,” Pujara added.

For the dialogue to be interpreted correctly by both the student and the teacher, the researchers proposed “G4C: Generating Guidance in Goal-Driven and Grounded Communication,” a task consisting of intent, guidance, and utterance.

Intent is what the DM aims to achieve in guiding players toward a specific goal; for example, the DM wants the players to detect the presence of something. Guidance is the hint the DM gives to lead a player to take a specific action; for example, the DM hints that there are movements in the bushes, so a player looks at the bushes. An utterance is the act of expression itself, worded so that the intent matches the anticipated player action.
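
To picture how these three pieces fit together, one example from such a task can be thought of as a small record. The sketch below is only an illustration: the field names (`intent`, `guidance`, `anticipated_action`) and the simple `fulfills_intent` check are assumptions made for this article, not the researchers’ data format or model.

```python
from dataclasses import dataclass


@dataclass
class GuidanceExample:
    """One hypothetical G4C-style record (field names are assumptions)."""
    intent: str              # what the DM wants the players to do
    guidance: str            # the DM utterance that hints at it
    anticipated_action: str  # the player action the DM hopes to see


def fulfills_intent(example: GuidanceExample, player_action: str) -> bool:
    """Toy check: did the player do what the DM anticipated?

    A case-insensitive exact match is used purely for illustration.
    """
    expected = example.anticipated_action.strip().lower()
    return player_action.strip().lower() == expected


example = GuidanceExample(
    intent="players notice something hiding nearby",
    guidance="You hear a faint rustling in the bushes to your left.",
    anticipated_action="look at the bushes",
)
print(fulfills_intent(example, "Look at the bushes"))
```

In a real game, a player could respond in countless ways, so an exact string match would be far too strict; it stands in here for whatever judgment decides that the player did what the DM intended.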

“Players will take an action such as checking their surroundings based on what the DM had prompted them to do. Now, we raised the question ‘Can we train an AI model given an intent to generate an utterance in conversation so that players can fulfill that intent?’” Zhou added.

This is where a lot of the creativity and storytelling comes into play in D&D.

“If I just tell you to check on those bushes then it wouldn’t make a good game, so the storytelling comes into play to create an evocative scene and connect it to the action that I want the player to make,” Pujara said.

Theory of Mind (ToM)

Theory of mind (ToM) is the ability to understand that people act in ways motivated by their desires and beliefs. It allows us to understand that not everyone shares the same thoughts and feelings. Specifically, this social cognitive skill allows one to understand others’ mental states: beliefs, intent, desires, emotions, and knowledge.

D&D presents scenarios where the AI not only needs to understand the goal, but also needs to plan how its actions will lead to accomplishing that goal.

Pujara said, “One important part of this research is the architecture: we are trying to figure out what implicit knowledge is in a dialogue. We added the theory-of-mind to effectively figure out what one is thinking about.”

The general purpose of ToM is to understand what is going on in another individual’s head. Looking ahead, this is an important element to consider when trying to accomplish difficult tasks (e.g., helping an individual with a perceptual disability navigate around the home).

This ability is important in all human communication, and it is what allows these agents to read between the lines and understand the implications of tasks within the game.

The Challenges

D&D is undoubtedly a complicated game that requires memorization and mastery of the rules. Thus, coming across challenges in the data was inevitable in the study.

Pujara said, “One of the biggest challenges is that you first need to expose your system to the D&D data. We collected a great number of transcripts of people playing online. However, we all had to dissect that since there was no data that highlighted the intent and ToM.”

The biggest focus of the data collection was gathering the key phrases and sentences that make a player complete an action in the game. Pujara mentioned that this work was inspired by reinforcement learning, an approach used in robotics in which a system learns behavior in an environment to obtain a reward (e.g., moving a robotic arm to a particular point and figuring out the path from start to end).

“We sort of did the same thing for D&D but worked backward. We knew what the person did and figured out what the DM said to get them to do it,” Pujara added.
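
The idea of working backward from a player’s action to the DM line that prompted it can be sketched in a few lines of code. This is a simplified illustration under stated assumptions: the transcript format (a list of speaker and text pairs), the `"DM"` speaker label, and the function name `pair_actions_with_guidance` are all hypothetical, and the researchers’ actual processing of the transcripts is likely more involved.

```python
from typing import List, Optional, Tuple

Turn = Tuple[str, str]  # (speaker, text); a hypothetical transcript format


def pair_actions_with_guidance(turns: List[Turn]) -> List[Tuple[str, str]]:
    """Pair each player turn with the most recent DM turn before it."""
    pairs: List[Tuple[str, str]] = []
    last_dm_text: Optional[str] = None
    for speaker, text in turns:
        if speaker == "DM":
            # Remember the latest thing the DM said.
            last_dm_text = text
        elif last_dm_text is not None:
            # A player acted; link the action back to the DM's last line.
            pairs.append((last_dm_text, text))
    return pairs


transcript = [
    ("DM", "You hear rustling in the bushes."),
    ("Player", "I look at the bushes."),
    ("DM", "A goblin leaps out!"),
    ("Player", "I draw my sword."),
]
for dm_line, action in pair_actions_with_guidance(transcript):
    print(f"{dm_line!r} -> {action!r}")
```

Each resulting pair links something the DM said to what a player then did, which is the kind of raw material needed before anyone can label the intent behind the DM’s words.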

“We asked ourselves: What are the possibilities? How do you get a good evaluation? What makes a good D&D session? All that was hard because nobody gives you the data perfectly,” Pujara said.

These challenges are seen across the entire, rapidly transforming landscape of AI. Pujara noted that one future direction for the research is figuring out how to build different models that possess diverse skill sets, such as emotional and logistical knowledge.

Star Dungeon Master Model

The researchers spent a good amount of time brainstorming about what makes a good Dungeon Master. In general, a good DM is familiar with the rules and references of the game and sets the right level of difficulty.

“We have a metric called the star DM model which is actually going to have different capabilities of a human – fluency, groundedness, and guidance.”

Fluency means making the DM sound humanlike and convincing. Groundedness means making sense within the setting: the DM cannot talk about things that are unrelated to the medieval fantasy world. Guidance means helping others reach their goals and complete the tasks as originally intended.
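
One way to picture an evaluation along these three criteria is as a small scorecard for each DM utterance. The sketch below is purely illustrative: the 1-to-5 scale, the `DMRating` name, and the unweighted average are assumptions, not the scoring scheme used in the study.

```python
from dataclasses import dataclass


@dataclass
class DMRating:
    """Hypothetical rating of one DM utterance; the 1-5 scale is an assumption."""
    fluency: int       # does it sound humanlike and convincing?
    groundedness: int  # does it fit the fantasy world and context?
    guidance: int      # does it help players reach the intended goal?

    def overall(self) -> float:
        """Unweighted mean of the three criteria (an illustrative choice)."""
        return (self.fluency + self.groundedness + self.guidance) / 3


rating = DMRating(fluency=5, groundedness=4, guidance=3)
print(rating.overall())  # 4.0
```

Keeping the three scores separate, rather than collapsing them immediately into one number, makes it possible to see that an utterance can be fluent and on-topic while still failing to guide the player.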

“It wasn’t as simple as a thumbs up/thumbs down scenario. We took everything together to checklist what makes a good DM. Maybe in the future, we can present a 20-hour campaign of D&D to gather more data to make an even better DM,” Pujara added.

Building a Dungeons & Dragons Dungeon Master is an exciting opportunity for researchers, as it is a fun way to solve hard AI problems and explore the spectrum of capabilities needed to function in the world.

ACL 2023

The paper, “I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons,” is being presented at the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23). Its authors are Pei Zhou, a Research Assistant at USC Viterbi’s Information Sciences Institute (ISI); Andrew Zhu, a Computer Engineering graduate from the University of Pennsylvania; Jennifer Hu, a Research Fellow at the Harvard Kempner Institute; Jay Pujara, a Research Lead at ISI; Xiang Ren, a Research Lead at ISI; Chris Callison-Burch, Associate Professor of Computer and Information Science at the University of Pennsylvania; Yejin Choi, Professor of Computer Science at the University of Washington; and Prithviraj Ammanabrolu, a Researcher at the Allen Institute for Artificial Intelligence.

ACL is the premier scientific and professional society for people working on computational problems involving human language, a field often referred to as either computational linguistics or natural language processing.

The event takes place in Toronto, Canada from July 9th to July 14th, 2023.

Published on July 14th, 2023

Last updated on July 14th, 2023