A Blog by Jonathan Low

 

Jan 24, 2022

What Happens When AI Thinks It Knows How People Feel - And Acts On That

The science behind AI-driven emotion detection is a lot less certain than its proponents would have people believe, especially across nationalities, cultures, races and geographic boundaries. 

Probably best not to outsource personal communications to an algorithm just yet. JL 

Will Coldwell reports in Wired:

Since the start of the pandemic, more relationships depend on computer-mediated channels. Amid online spats, toxic messages, and infinite Zoom, could algorithms help us be nicer? Can an app read our feelings better than we can? Or does outsourcing communications to AI chip away at what makes a human relationship human? Emotion-detecting AI is built on the idea that humans have universal expressions of emotions. Algorithms were 86% accurate at detecting conflict, able to generate a correlation with self-reported emotions. (But) "will it work in all cultures? We really don’t know.”

IN MAY 2021, Twitter, a platform notorious for abuse and hot-headedness, rolled out a “prompts” feature that suggests users think twice before sending a tweet. The following month, Facebook announced AI “conflict alerts” for groups, so that admins can take action where there may be “contentious or unhealthy conversations taking place.” Email and messaging smart-replies finish billions of sentences for us every day. Amazon’s Halo, launched in 2020, is a fitness band that monitors the tone of your voice. Wellness is no longer just the tracking of a heartbeat or the counting of steps, but the way we come across to those around us. Algorithmic therapeutic tools are being developed to predict and prevent negative behavior. 
Jeff Hancock, a professor of communication at Stanford University, defines AI-mediated communication as when “an intelligent agent operates on behalf of a communicator by modifying, augmenting, or generating messages to accomplish communication goals.” This technology, he says, is already deployed at scale. 
Beneath it all is a burgeoning belief that our relationships are just a nudge away from perfection. Since the start of the pandemic, more of our relationships depend on computer-mediated channels. Amid a churning ocean of online spats, toxic Slack messages, and infinite Zoom, could algorithms help us be nicer to each other? Can an app read our feelings better than we can? Or does outsourcing our communications to AI chip away at what makes a human relationship human? 
YOU COULD SAY that Jai Kissoon grew up in the family court system. Or, at least, around it. His mother, Kathleen Kissoon, was a family law attorney, and when he was a teenager he’d hang out at her office in Minneapolis, Minnesota, and help collate documents. This was a time before “fancy copy machines,” and while Kissoon shuffled through the endless stacks of paper that flutter through the corridors of a law firm, he’d overhear stories about the many ways families could fall apart. 
In that sense, not much has changed for Kissoon, who is cofounder of OurFamilyWizard, a scheduling and communication tool for divorced and co-parenting couples that launched in 2001. It was Kathleen’s concept, while Jai developed the business plan, initially launching OurFamilyWizard as a website. It soon caught the attention of those working in the legal system, including Judge James Swenson, who ran a pilot program with the platform at the family court in Hennepin County, Minneapolis, in 2003. The project took 40 of what Kissoon says were the “most hardcore families,” set them up on the platform—and “they disappeared from the court system.” When someone eventually did end up in court—two years later—it was after a parent had stopped using it. 
Two decades on, OurFamilyWizard has been used by around a million people and gained court approval across the US. In 2015 it launched in the UK and a year later in Australia. It’s now in 75 countries; similar products include coParenter, Cozi, Amicable, and TalkingParents. Brian Karpf, secretary of the American Bar Association, Family Law Section, says that many lawyers now recommend co-parenting apps as standard practice, especially when they want to have a “chilling effect” on how a couple communicates. These apps can be a deterrent for harassment and their use in communications can be court-ordered.

In a bid to encourage civility, AI has become an increasingly prominent feature. OurFamilyWizard has a “ToneMeter” function that uses sentiment analysis to monitor messages sent on the app—“something to give a yield sign,” says Kissoon. Sentiment analysis is a subset of natural language processing, the computational analysis of human language. Trained on vast language databases, these algorithms break down text and score it for sentiment and emotion based on the words and phrases it contains. In the case of the ToneMeter, if an emotionally charged phrase is detected in a message, a set of signal-strength bars will go red and the problem words are flagged. “It’s your fault that we were late,” for example, could be flagged as “aggressive.” Other phrases could be flagged as “humiliating.”
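ToneMeter’s internal model isn’t public, but the paragraph above describes the general shape of phrase-based sentiment flagging. Below is a minimal sketch of that idea in Python; the phrase list, the labels, and the flag_message helper are invented for illustration and are not drawn from OurFamilyWizard.

```python
# Minimal sketch of lexicon-based sentiment flagging, in the spirit of a
# ToneMeter-style check. The phrases, labels, and thresholds are invented
# for illustration; the real product's model is not public.
FLAGGED_PHRASES = {
    "your fault": "aggressive",
    "you always": "aggressive",
    "you never": "aggressive",
    "grow up": "humiliating",
}

def flag_message(message: str) -> list[tuple[str, str]]:
    """Return (phrase, label) pairs found in the message."""
    text = message.lower()
    return [(phrase, label) for phrase, label in FLAGGED_PHRASES.items()
            if phrase in text]

if __name__ == "__main__":
    hits = flag_message("It's your fault that we were late.")
    for phrase, label in hits:
        print(f"Flagged '{phrase}' as {label}")
    # A simple UI could turn its signal bars red whenever hits is non-empty.
```

In practice, commercial tools use statistical models trained on large corpora rather than a hand-written phrase list, but the flag-and-warn flow is the same.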

ToneMeter was originally used in the messaging service, but is now being coded for all points of exchange between parents in the app. Shane Helget, chief product officer, says that soon it will not only discourage negative communication, but also encourage positive language. He is gathering insights from a vast array of interactions with a view that the app could be used to proactively nudge parents to behave positively toward each other beyond regular conversations. There could be reminders to communicate schedules in advance, or offers to swap dates for birthdays or holidays—gestures that may not be required but could be well received.

CoParenter, which launched in 2019, also uses sentiment analysis. Parents negotiate via text and a warning pops up if a message is too hostile—much like a human mediator might shush their client. If the system does not lead to an agreement, there is the option to bring a human into the chat.

Deferring to an app for such emotionally fraught negotiations is not without issues. Kissoon was conscious not to allow the ToneMeter to score parents on how positive or negative they seem, and Karpf says he has seen a definite effect on users’ behavior. “The communications become more robotic,” he says. “You’re now writing for an audience, right?”

Co-parenting apps might be able to help steer a problem relationship, but they can’t solve it. Sometimes, they can make it worse. Karpf says some parents weaponize the app and send “bait” messages to wind up their spouse and goad them into sending a problem message: “A jerk parent is always going to be a jerk parent.” Kissoon recalls a conversation he had with a judge when he launched the pilot program. “The thing to remember about tools is that I can give you a screwdriver and you can fix a bunch of stuff with it,” the judge said. “Or you can go poke yourself in the eye.”

IN 2017, ADELA Timmons was a doctoral student in psychology undertaking a clinical internship at UC San Francisco and San Francisco General Hospital, where she worked with families that had young children from low-income backgrounds who had been exposed to trauma. While there, she noticed a pattern emerging: Patients would make progress in therapy only for it to be lost in the chaos of everyday life between sessions. She believed technology could “bridge the gap between the therapist’s room and the real world” and saw the potential for wearable tech that could intervene just at the moment a problem is unfolding.

In the field, this is a “Just in Time Adaptive Intervention.” In theory, it’s like having a therapist ready to whisper in your ear when an emotional alarm bell rings. “But to do this effectively,” says Timmons, now director of the Technological Interventions for Ecological Systems (TIES) Lab at Florida International University, “you have to sense behaviors of interest, or detect them remotely.”

Timmons’ research, which involves building computational models of human behavior, is focused on creating algorithms that can effectively predict behavior in couples and families. Initially she focused on couples. For one study, researchers wired up 34 young couples with wrist and chest monitors and tracked body temperature, heartbeat and perspiration. They also gave them smartphones that listened in on their conversations. By cross-referencing this data with hourly surveys in which the couples described their emotional state and any arguments they had, Timmons and her team developed models to determine when a couple had a high chance of fighting. Trigger factors would be a high heart rate, frequent use of words like “you,” and contextual elements, such as the time of day or the amount of light in a room. “There isn’t one single variable that counts as a strong indicator of an inevitable row,” Timmons explains (though driving in LA traffic was one major factor), “but when you have a lot of different pieces of information that are used in a model, in combination, you can get closer to having accuracy levels for an algorithm that would really work in the real world.”
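Timmons’ actual models aren’t reproduced here; the sketch below only illustrates the general idea she describes, of combining many weak signals (heart rate, “you”-words, context) into one classifier rather than relying on any single variable. The feature set, the synthetic data, and the choice of scikit-learn’s logistic regression are all assumptions made for illustration.

```python
# Toy sketch of a multi-signal conflict model: no single feature is
# decisive, but combined they can predict self-reported arguments.
# The features and data below are synthetic and for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Hypothetical features per hourly window: heart rate (bpm), count of
# "you"-words in recorded speech, hour of day, and whether the couple
# was stuck in traffic.
X = np.column_stack([
    rng.normal(75, 12, n),    # heart rate
    rng.poisson(3, n),        # "you"-word count
    rng.integers(0, 24, n),   # hour of day
    rng.integers(0, 2, n),    # in traffic (0/1)
])

# Synthetic labels: 1 = an argument was self-reported in that window.
y = (0.03 * X[:, 0] + 0.4 * X[:, 1] + 1.0 * X[:, 3]
     + rng.normal(0, 1, n) > 4.5).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
print("Training accuracy:", model.score(X, y))
print("Feature weights:", model.coef_)
```

The point of the sketch is the shape of the problem, not the algorithm: several noisy streams, aligned with hourly self-reports, feeding a single probability that a fight is brewing.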

Timmons is expanding on these models to look at family dynamics, with a focus on improving bonds between parents and children. TIES is developing mobile apps that aim to passively sense positive interactions using smartphones, Fitbits, and Apple Watches (the idea is that it should be workable with existing consumer technology). First, the data is collected—predominantly heart rate, tone of voice, and language. The hardware also senses physical activity and whether the parent and child are together or apart.

In the couples’ study, the algorithm was 86 percent accurate at detecting conflict and was able to generate a correlation with self-reported emotional states. In a family context, the hope is that by detecting these states the app will be able to actively intervene. “It might be a prompt, like ‘go give your child a hug’ or ‘tell your child something he or she did well today,’” says Timmons. “We’re also working on algorithms that can detect negative states and then send interventions to help the parent regulate their emotion. We know that when a parent’s emotion is regulated, things tend to go better.”

Contextual information helps improve prediction rates: Has the person slept well the night before? Have they exercised that day? Prompts could take the form of a suggestion to meditate, try a breathing exercise, or engage with some cognitive behavioral therapy techniques. Mindfulness apps already exist, but these rely on the user remembering to use them at a moment when they are likely to be angry, upset, or emotionally overwhelmed. “It’s actually in those moments where you’re least able to pull on your cognitive resources,” says Timmons. “The hope is that we can meet the person halfway by alerting them to the moment that they need to use those skills.” From her experience working with families, the traditional structure of therapy—50-minute sessions once a week—is not necessarily the most effective way to make an impact. “I think the field is starting to take more of an explicit interest in whether we can expand the science of psychological intervention.”

The work is supported by a grant from the National Institutes of Health and National Science Foundation as part of a fund to create technology systems that are commercially viable, and Timmons hopes the research will lead to psychological health care that is accessible, scalable, and sustainable. Once her lab has the data to prove it is effective and safe for families—and does not cause unexpected harm—then decisions will need to be made about how such technology could be deployed.

As data-driven health care expands, privacy is a concern. Apple is the latest major tech company to expand into this space; it is partway through a three-year study with researchers at UCLA, launched in 2020, to establish if iPhones and Apple Watches could detect—and, ultimately, predict and intervene in—cases of depression and mood disorders. Data will be collected from the iPhone’s camera and audio sensors, as well as the user’s movements and even the way they type on their device. Apple intends to protect user data by having the algorithm on the phone itself, with nothing sent to its servers.

At the TIES lab, Timmons says that no data is sold or shared, except in instances relating to harm or abuse. She believes it is important that the scientists developing these technologies think about possible misuses: “It’s the joint responsibility of the scientific community with lawmakers and the public to establish the acceptable limits and bounds within this space.”

The next step is to test the models in real time to see if they are effective and whether prompts from a mobile phone actually lead to meaningful behavioral change. “We have a lot of good reasons and theories to think that would be a really powerful mechanism of intervention,” Timmons says. “We just don’t yet know how well they work in the real world.”

An X-Ray for Relationships

THE IDEA THAT sensors and algorithms can make sense of the complexities of human interaction is not new. For relationship psychologist John Gottman, love has always been a numbers game. Since the 1970s, he has been trying to quantify and analyze the alchemy of relationships.

Gottman conducted studies on couples, most famously at the “Love Lab,” a research center at the University of Washington that he established in the 1980s. A version of the Love Lab still operates today at the Gottman Institute in Seattle, founded with his wife, Julie Gottman, a fellow psychologist, in 1996. In rom-com terms, the Love Lab is like the opening sequence of When Harry Met Sally spliced with the scene in Meet the Parents when Robert De Niro hooks his future son-in-law up to a lie detector test. People were wired up two by two and asked to talk between themselves—first about their relationship history, then about a conflict—while various pieces of machinery tracked their pulse, perspiration, tone of voice, and how much they fidgeted in their chair. In a back room filled with monitors, every facial expression was coded by trained operators. The Love Lab aimed to collect data on how couples interact and convey their feelings.

This research led to the “Gottman method,” a relationship-counseling methodology. Among its tenets: couples should maintain a 5:1 ratio of positive to negative interactions; failing to respond to a partner’s bid for attention 33 percent of the time equates to a “disaster”; and eye-rolls are strongly correlated with marital doom. “Relationships aren’t that complicated,” John Gottman says, speaking from his home on Orcas Island, Washington.
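As a worked example of that 5:1 guideline, here is a minimal sketch that tallies coded interactions and compares the ratio against the threshold. The interaction codes and the positivity_ratio helper are invented; in practice the coding is done by trained observers or, in the virtual Love Lab described below, by automated cue detection.

```python
# Minimal sketch: checking a coded interaction log against the 5:1
# positive-to-negative guideline. The example codes are invented.
def positivity_ratio(codes: list[str]) -> float:
    positives = sum(1 for c in codes if c == "positive")
    negatives = sum(1 for c in codes if c == "negative")
    return positives / negatives if negatives else float("inf")

session = ["positive", "positive", "negative", "positive",
           "positive", "positive", "positive", "negative"]
ratio = positivity_ratio(session)
status = "at or above" if ratio >= 5 else "below"
print(f"Ratio {ratio:.1f}:1 -> {status} the 5:1 guideline")
```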

The Gottmans, too, are stepping into the AI realm. In 2018, they founded a startup, Affective Software, to create an online platform for relationship assessment and guidance. It started from an IRL interaction: a friendship sparked many years ago when Julie Gottman met Rafael Lisitsa, a Microsoft veteran, as they collected their daughters at the school gates. Lisitsa, the cofounder and CEO of Affective Software, is developing a virtual version of the Love Lab, in which couples can have the same “x-ray” diagnosis of their relationship delivered via the camera on their computer, iPhone, or tablet. Again, facial expressions and tone of voice are monitored, as well as heart rate. It’s an indicator of how far emotion detection, or “affective computing,” has come; though the original Love Lab was backed up by screens and devices, ultimately it took a specially trained individual to watch the monitor and correctly code each cue. Gottman never believed the human element could be removed. “There were very few people who could actually really sensitively code emotion,” he says. “They had to be musical. They had to have some experience with theatre … I never dreamed a machine would be able to do that.”

Not everyone is convinced that machines can do this. Emotion-detecting AI is choppy territory. It is largely built on the idea that humans have universal expressions of emotions—a theory developed in the 1960s and ’70s with observations by Paul Ekman, who created a facial expression coding system that informs the Gottmans’ work and forms the basis of much affective computing software. Some researchers, such as Northeastern University psychologist Lisa Feldman Barrett, have questioned whether it is possible to reliably detect emotion from a facial expression. And though already widely used, some facial recognition software has shown evidence of racial bias; one study that compared two mainstream programs found they assigned more negative emotions to Black faces than white ones. Gottman says the virtual Love Lab is trained on facial datasets that include all skin types and his system for coding interactions has been tested across different groups in the US, including African American and Asian American groups. “We know culture really does moderate the way people express or mask emotions,” he says. “We’ve looked in Australia, the UK, South Korea, and Turkey. And it seems like the specific affect system I’ve evolved really does work. Now, will it work in all cultures? We really don’t know.”

Gottman adds that the Love Lab really operates by means of a social coding system; by taking in the subject matter of the conversation, tone of voice, body language, and expressions, it is less focused on detecting a singular emotion in the moment and instead analyzes the overall qualities of an interaction. Put these together, says Gottman, and you can more reliably come up with a category like anger, sadness, disgust, or contempt. When a couple takes part, they are invited to answer a detailed questionnaire, then record two 10-minute conversations. One is a discussion about the past week; the other is about a conflict. After uploading the videos, the couple rates their emotional state during different stages of the conversation, from 1 (very negative) to 10 (very positive). The app then analyzes this, along with the detected cues, and provides results including a positive-to-negative ratio, a trust metric, and the prevalence of the dreaded “Four Horsemen of the Apocalypse”: criticism, defensiveness, contempt, and stonewalling. It is intended to be used in conjunction with a therapist.
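To make that output concrete, here is a small sketch of what such a session report might look like as a data structure. The SessionReport class and its field names are hypothetical and are not taken from Affective Software’s actual platform.

```python
# Hypothetical shape of a virtual Love Lab session report: a ratio,
# a trust score, Four Horsemen counts, and the couple's self-ratings.
from dataclasses import dataclass, field

HORSEMEN = ("criticism", "defensiveness", "contempt", "stonewalling")

@dataclass
class SessionReport:
    positive_to_negative: float                 # ratio of positive vs. negative cues
    trust_metric: float                         # 0-1, higher means more trust
    horsemen_counts: dict[str, int] = field(
        default_factory=lambda: {h: 0 for h in HORSEMEN})
    self_ratings: list[int] = field(default_factory=list)  # couple's 1-10 ratings

report = SessionReport(positive_to_negative=4.2, trust_metric=0.7)
report.horsemen_counts["criticism"] = 3
report.self_ratings = [6, 7, 4, 8]
print(report)
```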

Therapy and mental health services are increasingly provided through video calls—since the pandemic, this shift has been supercharged. Venture capital investment in virtual care and digital health has tripled since Covid-19, according to analysts at McKinsey, and AI therapy chatbots, such as Woebot, are going mainstream. Relationship counseling apps such as Lasting are already based on the Gottman method and send notifications to remind users to, for example, tell their partner that they love them. One could imagine this making us lazy, but the Gottmans see it as an educational process—arming us with tools that will eventually become second nature. The team is already thinking about a simplified version that could be used independently of a therapist.

For the Gottmans, who were inspired by the fact that so many couples are stuck on their smartphones anyway, technology opens up a way to democratize counseling. “People are becoming much more comfortable with technology as a language,” says Gottman. “And as a tool to improve their lives in all kinds of ways.”

Email for You, but Not by You

THIS TECHNOLOGY IS already everywhere. It could be impacting your relationships without you noticing. Take Gmail’s Smart Reply—those suggestions of how you may respond to an email—and Smart Compose, which offers to finish your sentences. Smart Reply was added as a mobile feature in 2015, and Smart Compose rolled out in 2018; both are powered by neural networks.

Jess Hohenstein, a PhD researcher at Cornell University, first encountered Smart Reply when Google Allo, the now-defunct messaging app, was launched in 2016. It featured a virtual assistant that generated reply suggestions. She found it creepy: “I didn’t want some algorithm influencing my speaking patterns, but I thought this had to be having an effect.”

In 2019, she ran studies that found that AI is indeed changing the way we interact and relate to each other. In one study using Google Allo, 113 college students were asked to complete a task with a partner where one, both, or neither of them were able to use Smart Reply. Afterwards, the participants were asked how much they attributed the success or failure of the task to the other person (or AI) in the conversation. A second study focused on linguistic effects: how people responded to positive or negative “smart” replies.

Hohenstein found that the language people used with Smart Reply skewed toward the positive. People were more likely to roll with a positive suggestion than a negative one—participants also often found themselves in a situation where they wanted to disagree, but were only offered expressions of agreement. The effect is to make a conversation go faster and more smoothly. Hohenstein noticed that it made people in the conversation feel better about one another, too.

Hohenstein thinks that this could become counterproductive in professional relationships: This technology (combined with our own suggestibility) could discourage us from challenging someone, or disagreeing at all. In making our communication more efficient, AI could also drum our true feelings out of it, reducing exchanges to bouncing “love it!” and “sounds good!” back at each other. For people in the workplace who have traditionally found it harder to speak up, this could add to the disincentive to do so.

In the task-completion study, Hohenstein found that the humans took credit for positive outcomes. When something went wrong, the AI was blamed. In doing so, the algorithm protected the human relationship and provided a buffer for our own failings. It raises a deeper question of transparency: Should it be revealed that an AI has helped craft a response? When a partner was using Smart Reply, it initially made the receiver feel more positive about the other person. But when told that an AI was involved, they felt uncomfortable.

This underpins a paradox that runs through the use of such technology—perception and reality are not aligned. “People are creeped out by it, but it’s improving interpersonal perceptions of the people you’re communicating with,” says Hohenstein. “It’s counterintuitive.”

In his paper, Hancock highlights how these tools “may have widespread social impacts” and outlines a research agenda to address a technological revolution that has happened under our noses. AI-mediated communication could transform the way we speak, mitigate bias, or exacerbate it. It could leave us wondering who we’re really speaking to. It could even change our self-perception. “If AI modifies a sender’s messages to be more positive, more funny, or extroverted, will the sender’s self-perception shift towards being more positive, funny, or extroverted?” he writes. If AI takes over too much of our relationships, then what are we really left with?


