Teaching a Computer to Win Human Friends and Influence People

By Gordy Slack

Marilyn Walker wants to give her clients a rhetorical edge, that je ne sais quoi that can make the difference between closing a sale and letting it slip away, between convincing a patient to take his meds and leaving him hopeless and unmotivated. There are lots of consultants who promise to help cultivate such skills, who will teach their clients how firmly to shake a hand, how long (or whether) to look a customer in the eye, how enthusiastically to say “please” and when to say “Just do it!” But Walker has a notoriously awkward breed of client, one mostly neglected by motivational speakers: virtual people.

For the past ten years, most recently in her new professorship at UC Santa Cruz’s Baskin School of Engineering, Walker has been looking hard at how humans express themselves, not just at what they say but also at how they say it, in an effort to develop algorithms that will enable computer animations to employ those same techniques to express themselves more richly and compellingly.

Walker’s field, known roughly as computational linguistics, is a hybrid of computer science, linguistics and social psychology. The work requires her to understand what goes on when real people talk to each other, and then to replicate that in digital personae. Gaining that knowledge is tricky, because so much of communication is unconscious. How you phrase a sentence, whether you repeat a word, whether your sentence ends on an up-note or trails downward, all affect how your interlocutor will interpret what you say. And while simply paying attention to what is said is complex enough, it is only a fraction of the picture, says Walker. She is now collaborating with UC Davis professor Michael Neff, who is adding body language to the equation. Depending on what kind of communication is occurring, body position can be at least as important as the way an utterance is phrased, says Walker. And then there’s the all-important effect of facial expression, which is also key to communicating the emotional content of a conversation.

What happens when you add all three ingredients together? Well, no one really knows. Walker and her colleagues are currently studying whether you can amplify, or mitigate, the effect of one kind of utterance by adding contradictory posture and facial movements. Whereas the words a person uses may be like a steering wheel, allowing the speaker to guide the conversation, facial expressions and body language can work like turn signals, and even accelerators and brakes. These more nuanced modes of communicating have been underexploited in digital interlocutors, says Walker, partly because a lot of that kind of communication goes on below the waterline of consciousness.

Walker borrows from a popular theory of personality in social psychology known as the Big Five, which looks at each person’s place on the spectrum of five different traits: openness, conscientiousness, extroversion, agreeableness, and neuroticism (the inverse of emotional stability). Many studies have documented that these traits are manifested through all three expressive modes: voice, face, and body. The ability to combine different factors from different modes broadly expands the palette an artist or programmer has to work with in designing an automated interlocutor. Suddenly you can express passive aggression (by combining a tentative speech style with aggressive body language), or ambivalence, or anger driven by concern.
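
To make that combinatorics concrete, here is a minimal sketch in Python. It is purely illustrative, not drawn from Walker’s software: it simply shows how a character’s trait settings might be stored separately for each expressive mode, so that “passive aggression” can be composed from tentative speech and aggressive body language.

```python
# Illustrative sketch (not Walker's actual code): a character whose Big Five
# trait settings can differ across the three expressive modes.
from dataclasses import dataclass, field

TRAITS = ("openness", "conscientiousness", "extroversion", "agreeableness", "neuroticism")
MODES = ("voice", "face", "body")

@dataclass
class CharacterStyle:
    # Trait value per (mode, trait) pair, each on a 0.0 (low) to 1.0 (high) scale.
    settings: dict = field(default_factory=dict)

    def set(self, mode: str, trait: str, value: float) -> None:
        assert mode in MODES and trait in TRAITS
        self.settings[(mode, trait)] = value

    def get(self, mode: str, trait: str, default: float = 0.5) -> float:
        return self.settings.get((mode, trait), default)

# "Passive aggression": tentative (low-extroversion) speech paired with
# aggressive (low-agreeableness) body language.
passive_aggressive = CharacterStyle()
passive_aggressive.set("voice", "extroversion", 0.2)
passive_aggressive.set("body", "agreeableness", 0.1)

print(passive_aggressive.get("voice", "extroversion"))  # 0.2
print(passive_aggressive.get("body", "agreeableness"))  # 0.1
```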

Walker is conducting experiments with animated figures that can express any of these traits in voice, facial expression, or body language. In an online study, subjects watch and rate animated characters expressing different combinations of the Big Five traits. In one run-through, a butler-like character named Alfred may look directly at you and tell you that he thinks a restaurant is "reasonably priced with good food." In another case, he may say something similar, but avert his eyes, or dart them back and forth. In a different test, body posture is added. Subjects evaluate Alfred’s personality as he gives brief reviews while folding his arms and leaning back, or while putting his hands on his hips.
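
One way to picture the design space of such a study is to cross each trait with the mode in which it is expressed and the level at which it is expressed. The short sketch below is hypothetical and only illustrates that combinatorial layout, not the actual stimulus set used in the experiments.

```python
# Hypothetical sketch of the combinatorics behind such a study: crossing the
# Big Five traits with the modes and levels used to express them yields the
# set of single-mode conditions a character like "Alfred" could be rendered in.
from itertools import product

traits = ["openness", "conscientiousness", "extroversion", "agreeableness", "neuroticism"]
modes = ["voice", "face", "body"]
levels = ["low", "high"]

conditions = list(product(traits, modes, levels))
print(len(conditions))    # 30 single-mode conditions
print(conditions[:3])
```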

Walker’s long-term aim is to compile a core program that will allow developers of all kinds of applications to imbue their animated characters with “dynamic adaptation,” or the ability to adapt automatically to different personality types and employ a wide range of reactions while having a conversation. A character will evaluate whom it is talking to and, depending on the objective of the conversation, adopt a conversational style, employing some or all modes of expression, to best suit that objective and the needs of the human it is talking to.
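
What might dynamic adaptation look like in code? The sketch below is a hypothetical, greatly simplified rule set, not Walker’s system; the objectives, style labels, and thresholds are invented for illustration only.

```python
# Hypothetical sketch of "dynamic adaptation": picking an expression style
# from an estimate of the user's personality and the conversation's goal.
# The rules below are invented for illustration, not taken from Walker's work.

def choose_style(user_traits: dict, objective: str) -> dict:
    """user_traits: estimated Big Five scores for the human, each 0.0-1.0."""
    style = {"voice": "neutral", "face": "neutral", "body": "neutral"}
    if objective == "medical_history":
        # Instill trust and encourage candor: calm voice, warm face, open posture.
        style.update(voice="calm", face="warm", body="open")
    elif objective == "motivate_exercise":
        # Mirror an extroverted user; tone it down for an introverted one.
        if user_traits.get("extroversion", 0.5) > 0.6:
            style.update(voice="energetic", body="expansive")
        else:
            style.update(voice="encouraging", body="relaxed")
    return style

print(choose_style({"extroversion": 0.8}, "motivate_exercise"))
```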

The overall objective, Walker says, is to create artificial interlocutors that are easier to work with, more pleasant, more natural, and more effective. Her algorithms will have powerful applications in areas such as computer games, interactive story systems, and intelligent tutors. Not all of the characters will be good guys. In a game, for example, it’s as important to make characters annoying and manipulative as it is to make them nice or attractive. If you are designing a digital interviewer to take medical histories, on the other hand, you want it to instill trust and confidence and to encourage candor. Its affect and body language, if you know how to use them, can help a lot toward those ends.

Walker is testing the waters with a cell-phone-based mobile social game that will help teenagers exercise more, eat better, and lose weight. “Teenagers aren’t driven by logic,” she says. “They are driven by affect.” The program she is developing is called MATES (Mobile Agents for Training and Education) and the psychological premise is simple: “A motivational program for teenagers has to be fun and natural and not a chore. And to do that you must build a connection between the user and the virtual coach or peer.” The computerized character should be able to pull off what few parents can: be cool.

The digital characters have one advantage over their human counterparts: they are unbound by psychological hang-ups and are free to try subtly, or radically, different strategies until they find one that works. The most obvious strategy for a program that wants to be liked and listened to, says Walker, is called “similarity adoption”: essentially, the computer reflects the user’s personality. But that doesn’t always work. Not all personalities are looking for role models who reflect themselves. For example, says Walker, if the user is an extroverted narcissist, the most effective program isn’t going to be one that expresses those same qualities. It may be more effective to address an extrovert by being submissive and contrite. MATES can try a strategy and monitor the results: if the teenager exercises more (as measured by the accelerometer and GPS in her phone), that reinforces the approach. If she ignores the advice, it may be time to try something new, and the program will be well equipped to put on a new face and attitude, perhaps a little tough love with a hint of sarcasm.
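
That trial-and-feedback loop resembles a simple explore-and-exploit scheme. The sketch below is hypothetical rather than a description of MATES itself: the strategy names, the reward signal, and the simulated sensor reading are all invented, but they show the shape of a loop that keeps a strategy when measured activity goes up and tries another when it does not.

```python
import random

# Hypothetical sketch of the kind of trial-and-feedback loop described above:
# try a coaching strategy, watch whether measured activity rises, and switch
# strategies when it does not. All names and numbers are for illustration.

STRATEGIES = ["similarity_adoption", "submissive_contrite", "tough_love_sarcasm"]

def weekly_activity_minutes(strategy: str) -> float:
    """Stand-in for activity estimated from the phone's accelerometer and GPS."""
    return random.gauss(150, 30)  # placeholder data, not a real sensor reading

def coach(weeks: int = 8, baseline: float = 150.0) -> None:
    scores = {s: 0.0 for s in STRATEGIES}
    counts = {s: 0 for s in STRATEGIES}
    for week in range(weeks):
        # Prefer the strategy with the best average result so far,
        # but occasionally explore an alternative.
        if random.random() < 0.2 or not any(counts.values()):
            strategy = random.choice(STRATEGIES)
        else:
            strategy = max(STRATEGIES, key=lambda s: scores[s] / counts[s] if counts[s] else 0.0)
        activity = weekly_activity_minutes(strategy)
        reward = activity - baseline  # positive if the teenager moved more
        scores[strategy] += reward
        counts[strategy] += 1
        print(f"week {week + 1}: {strategy}, {activity:.0f} min, reward {reward:+.0f}")

coach()
```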