The implication is that organizations' efforts to anthropomorphize algorithms in order to increase consumers' comfort with them may be unnecessary. The effort and resources invested in 'humanizing' algorithmic interaction might more productively be focused on improving effectiveness and speed. JL
Jennifer Logg and colleagues report in Harvard Business Review:
People are often comfortable accepting guidance from algorithms, and even trust them more than other people. That is not to say customers don’t appreciate “the human touch,” but it may not be necessary to invest in the human element. People do not dislike algorithms as much as prior scholarship might have us believe; people relied more on the same advice when it came from an algorithm than from other people. (But) people appreciate algorithms more when choosing between an algorithm and someone else than when choosing between an algorithm’s judgment and their own.
Many companies have jumped on the “big data” bandwagon. They’re hiring data scientists, mining employee and customer data for insights, and creating algorithms to optimize their recommendations. Yet, these same companies often assume that customers are wary of their algorithms — and they go to great lengths to hide or humanize them.
For example, Stitch Fix, the online shopping subscription service that combines human and algorithmic judgment, highlights the human touch of its service in its marketing. The website explains that for each customer, a “stylist will curate 5 pieces [of clothing].” It refers to its service as “your partner in personal style” and “your new personal stylist” and describes its recommendations as “personalized” and “handpicked.” To top it off, a note from your stylist accompanies each shipment of clothes. Nowhere on the website can you find the term “data-driven,” even though Stitch Fix is known for its data science approach and is often called the “Netflix of fashion.”
It seems that the more companies expect users to engage with their product or service, the more they anthropomorphize their algorithms. Consider how companies give their virtual assistants human names like Siri and Alexa. And how the creators of Jibo, “the world’s first social robot,” designed an unabashedly adorable piece of plastic that laughs, sings, has one cute blinking eye, and moves in a way that mimics dancing.
But is it good practice for companies to mask their algorithms in this way? Are marketing dollars well-spent creating names for Alexa and facial features for Jibo? Why are we so sure that people are put off by algorithms and their advice? Our recent research questioned this assumption.
The power of algorithms
First, a bit of background. Since the 1950s, researchers have documented the many types of predictions in which algorithms outperform humans. Algorithms beat doctors and pathologists in predicting the survival of cancer patients, occurrence of heart attacks, and severity of diseases. Algorithms predict recidivism of parolees better than parole boards. And they predict whether a business will go bankrupt better than loan officers.
According to anecdotes in a classic book on the accuracy of algorithms, many of these earliest findings were met with skepticism. Experts in the 1950s were reluctant to believe that a simple mathematical calculation could outperform their own professional judgment. This skepticism persisted, and morphed into the received wisdom that people will not trust and use advice from an algorithm. That’s one reason why so many articles today still advise business leaders on how to overcome aversion to algorithms.
Do we still see distrust of algorithms today?
In our recent research, we found that people do not dislike algorithms as much as prior scholarship might have us believe. In fact, people show “algorithm appreciation” and rely more on the same advice when they think it comes from an algorithm than from a person. Across six studies, we asked representative samples of 1,260 online participants in the U.S. to make a variety of predictions. For example, we asked some people to forecast the occurrence of business and geopolitical events (e.g., the probability of North America or the EU imposing sanctions on a country in response to cyber attacks); we asked others to predict the rank of songs on the Billboard Hot 100; and we had one group of participants play online matchmaker (they read a person’s dating profile, saw a photograph of her potential date, and predicted how much she would enjoy a date with him).
In all of our studies, participants were asked to make a numerical prediction, based on their best guess. After their initial guess, they received advice and had the chance to revise their prediction. For example, participants answered: “What is the probability that Tesla Motors will deliver more than 80,000 battery-powered electric vehicles (BEVs) to customers in the calendar year 2016?” by typing a percentage from 0 to 100%.
When participants received advice, it came in the form of another prediction, which was labeled as either another person’s or an algorithm’s. We produced the numeric advice using simple math that combined multiple human judgments. Doing so allowed us to truthfully present the same advice as either “human” or “algorithmic.” We incentivized participants to revise their predictions — the closer their prediction was to the actual answer, the greater their chances of receiving a monetary bonus.
Then, we measured how much people changed their estimate after receiving the advice. For each participant, we captured a percentage from 0% to 100% to reflect how much they changed their estimate from their initial guess. Specifically, 0% means they completely disregarded the advice and stuck to their original estimate, 50% means they changed their estimate halfway toward the advice, and 100% means they matched the advice completely.
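To make that measure concrete, here is a minimal sketch of the calculation in Python. The function and variable names are illustrative only; they are not taken from the study's materials.

```python
# Minimal sketch of the "weight of advice" measure described above,
# assuming simple numeric estimates. Names are illustrative, not the
# authors' own code.

def weight_of_advice(initial_estimate: float, advice: float, final_estimate: float) -> float:
    """Fraction of the distance from the initial estimate to the advice
    that the revised estimate covers: 0.0 = ignored the advice,
    0.5 = moved halfway toward it, 1.0 = matched it exactly."""
    if advice == initial_estimate:
        return 0.0  # no gap to close, so there is no movement to measure
    shift = (final_estimate - initial_estimate) / (advice - initial_estimate)
    return max(0.0, min(1.0, shift))  # clamp to the 0-100% range used in the article

# Example: initial guess 40%, advice 60%, revised answer 50% -> 0.5 (moved halfway)
print(weight_of_advice(40, 60, 50))
```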
To our surprise, we found that people relied more on the same advice when they thought it came from an algorithm than from other people. These results were consistent across our studies, regardless of the kind of numerical prediction. We found this algorithm appreciation especially interesting as we did not provide much information about the algorithm. We presented the algorithmic advice this way because algorithms regularly appear in daily life without a description (called ‘black box’ algorithms); most people aren’t privy to the inner workings of algorithms that predict things affecting them (like the weather or the economy).
We wondered whether our results were due to people’s increased familiarity with algorithms today. If so, age might account for people’s openness to algorithmic advice. Instead, we found that our participants’ age did not influence their willingness to rely on the algorithm. In our studies, older people used the algorithmic advice just as much as younger people. What did matter was how comfortable participants were with numbers, which we measured by asking them to take an 11-question numeracy test. The more numerate our participants (i.e., the more math questions they answered correctly on the 11-item test), the more they listened to the algorithmic advice.
Next, we wanted to test whether the idea that people won’t trust algorithms is still relevant today – and whether contemporary researchers would still predict that people would dislike algorithms. In an additional study, we invited 119 researchers who study human judgment to predict how much participants would listen to the advice when it came from a person vs. algorithm. We gave the researchers the same survey materials that our participants had seen for the matchmaker study. These researchers, consistent with what many companies have assumed, predicted that people would show aversion to algorithms and would trust human advice more, the opposite of our actual findings.
We were also curious about whether the expertise of the decision-maker might influence algorithmic appreciation. We recruited a separate sample of 70 national security professionals who work for the U.S. government. These professionals are experts at forecasting, because they make predictions on a regular basis. We asked them to predict different geopolitical and business events and had an additional sample of non-experts (301 online participants) do the same. As in our other studies, both groups made a prediction, received advice labeled as either human or algorithmic, and then were given the chance to revise their prediction to make a final estimate. They were informed that the more accurate their answers, the better their chances of winning a prize.
The non-experts acted like our earlier participants – they relied more on the same advice when they thought it came from an algorithm than a person for each of the forecasts. The experts, however, discounted both the advice from the algorithm and the advice from people. They seemed to trust their own expertise the most, and made minimal revisions to their original predictions.
We needed to wait about a year to score the accuracy of the predictions, based on whether the event had actually occurred or not. We found that the experts and non-experts made similarly accurate predictions when they received advice from people, because they equally discounted that advice. But when they received advice from an algorithm, the experts made less accurate predictions than the non-experts, because the experts were unwilling to listen to the algorithmic advice. In other words, while our non-expert participants trusted algorithmic advice, the national security experts didn’t, and it cost them in terms of accuracy. It seemed that their expertise made them especially confident in their forecasting, leading them to more or less ignore the algorithm’s judgment.
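The article does not say which scoring rule was used to grade the forecasts, but a common way to score probability forecasts against yes/no outcomes is a Brier-style squared error. The sketch below assumes that approach purely for illustration.

```python
# Illustrative only: the article does not specify its scoring rule, so this
# sketch assumes a simple Brier-style score for probability forecasts.

def brier_score(forecast_probability: float, event_occurred: bool) -> float:
    """Squared error between a 0-1 probability forecast and the realized
    outcome (1 if the event happened, 0 if it did not). Lower is better."""
    outcome = 1.0 if event_occurred else 0.0
    return (forecast_probability - outcome) ** 2

# Example: forecasting 0.8 for an event that happened scores 0.04;
# forecasting 0.8 for an event that did not happen scores 0.64.
print(brier_score(0.8, True), brier_score(0.8, False))
```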
Another study we ran corroborates this potential explanation. We tested whether faith in one’s own knowledge might prevent people from appreciating algorithms. When participants had to choose between relying on an algorithm or relying on advice from another person, we again found that people preferred the algorithm. However, when they had to choose whether to rely on their own judgment or the advice of an algorithm, the algorithm’s popularity declined. Although people are comfortable acknowledging the strengths of algorithmic over human judgment, their trust in algorithms seems to decrease when they compare it directly to their own judgment. In other words, people seem to appreciate algorithms more when they’re choosing between an algorithm’s judgment and someone else’s than when they’re choosing between an algorithm’s judgment and their own.
Other researchers have found that the context of the decision-making matters for how people respond to algorithms. For instance, one paper found that when people see an algorithm make a mistake, they are less likely to trust it, which hurts their accuracy. Other researchers found that people prefer to get joke recommendations from a close friend over an algorithm, even though the algorithm does a better job. Another paper found that people are less likely to trust advice from an algorithm when it comes to moral decisions about self-driving cars and medicine.
Our studies suggest that people are often comfortable accepting guidance from algorithms, and sometimes even trust them more than other people. That is not to say that customers don’t sometimes appreciate “the human touch” behind products and services, but it does suggest that it may not be necessary to invest in emphasizing the human element of a process wholly or partially driven by algorithms. In fact, the more elaborate the artifice, the more customers may feel deceived when learning they were actually guided by an algorithm. Google Duplex, which calls businesses to schedule appointments and make reservations, generated instant backlash because it sounded “too” human and people felt deceived.
Transparency may pay off. Maybe companies that present themselves as primarily driven by algorithms, like Netflix and Pandora, have the right idea.