A Blog by Jonathan Low

 

Feb 7, 2015

Data Science Is Linking Your Instagrams to Your Credit Card Purchases. And Lots More

We want more and more granularity in our data, which is kind of a long way of saying we want more data in our data. But never mind.

We crave detail. Because that means more accuracy, which for the merchant means less uncertainty and a higher return on investment. For consumers it ostensibly means more personalized and more attractive offers. Sometimes with discounts to match.

But then there's this other thing. Not privacy so much (gone, baby, gone...). More like access for all to any opportunity they want with absolutely, positively no discrimination.

Except that there's this inherent contradiction: more data means more information. About who you are, what you do, where you go, who you like, what you buy. And when you do so. Which means that certain characteristics are strongly correlated, sometimes even causally linked, with some of those features. Like being rich. Or being a woman. Sorry, but that's how this data rolls. All of which could mean that you, consumer statistic collection in human form, are being had. The question is whether you care enough to do something about it. And then, what would that be? JL

Andrew Flowers reports in 538:

When I tweeted from a Knicks game at Madison Square Garden on Dec. 2, I had no idea that data scientists could use that information to find out I’d used my MasterCard to buy an overpriced $12 beer — as well as identify all my other credit card purchases.
But with as few as four publicly available geo-tagged data points, scientists can accurately connect 90 percent of people to their credit card transactions, according to research published in the journal Science on Friday. That data is supposed to be anonymous, but it’s not really, and women and high-income people have less anonymity than others.
The study used metadata from three months of credit card transactions made by 1.1 million people who shopped at 10,000 stores in an unnamed (for now?) wealthy country. This metadata had no names, no account numbers, nor any other information that would make it easy to identify someone. The only transaction data available was the day it took place, the rough location and — in a separate model — the amount spent.
The researchers were able to then take geo-tagged information — such as Instagram photos, tweets and Facebook posts — and use it to mine the “anonymous” credit card metadata. So, in my case, they could combine my tweet from M.S.G. with three other data points — maybe when I posted on Facebook from Whole Foods, the public library and the gym — to match my name to my user ID in the transactions.
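To make the matching step concrete, here is a minimal sketch of the idea in Python. It is not the researchers' code: the records, user IDs, dates and the `candidates` helper are all hypothetical, and it assumes each public post can be reduced to a (day, shop) pair that either appears in someone's transaction history or does not.

```python
from collections import defaultdict

# Hypothetical pseudonymized metadata: (user_id, day, shop).
# No names or account numbers, as in the study's description.
transactions = [
    ("u1", "2014-12-02", "Madison Square Garden"),
    ("u1", "2014-12-03", "Whole Foods"),
    ("u1", "2014-12-05", "Public Library"),
    ("u1", "2014-12-06", "Gym"),
    ("u2", "2014-12-02", "Madison Square Garden"),
    ("u2", "2014-12-03", "Whole Foods"),
    ("u2", "2014-12-05", "Bookstore"),
    ("u3", "2014-12-02", "Madison Square Garden"),
    ("u3", "2014-12-06", "Gym"),
]

# Hypothetical geo-tagged public posts by one named person.
public_posts = [
    ("2014-12-02", "Madison Square Garden"),
    ("2014-12-03", "Whole Foods"),
    ("2014-12-05", "Public Library"),
    ("2014-12-06", "Gym"),
]


def candidates(transactions, known_points):
    """Return sorted user IDs whose records contain every known (day, shop) point."""
    trace = defaultdict(set)
    for user_id, day, shop in transactions:
        trace[user_id].add((day, shop))
    needed = set(known_points)
    return sorted(u for u, points in trace.items() if needed <= points)


# Each extra public data point shrinks the set of possible users.
for k in range(1, len(public_posts) + 1):
    print(k, candidates(transactions, public_posts[:k]))
```

In this toy data, one point leaves three possible users, two points leave two, and by the third point only `u1` remains, which is when the name on the public posts can be attached to the "anonymous" ID.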
The authors’ model found that women were 21 percent more likely than men to be “re-identified” from the transaction data. High-income people were about 75 percent more likely to be identified than those with lower incomes, and medium-income people were 17 percent more likely to be pinpointed. The authors scored individuals’ behavior based on how unique it was relative to others (see the chart below, Figure 4 from the paper).
Chart from Science article on credit card reidentification
What is unique about their behavior? Where they shop. The stores that women and high-income people frequent are more distinct, making them easier to distinguish.
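A rough way to see why distinctive stores matter is to count how many people are the only match for a handful of their own purchases. The sketch below is a toy version of that idea, not the authors' model: the `unicity` function, the shop names and the records are assumptions made up for illustration.

```python
import random
from collections import defaultdict


def unicity(transactions, p, seed=0):
    """Share of users (with at least p points) who are the only match
    for p of their own (day, shop) points drawn at random."""
    rng = random.Random(seed)
    trace = defaultdict(set)
    for user_id, day, shop in transactions:
        trace[user_id].add((day, shop))
    eligible = [u for u, pts in trace.items() if len(pts) >= p]
    if not eligible:
        return 0.0
    unique = 0
    for u in eligible:
        sample = set(rng.sample(sorted(trace[u]), p))
        matches = [v for v, pts in trace.items() if sample <= pts]
        if matches == [u]:
            unique += 1
    return unique / len(eligible)


# Hypothetical records: "a" and "b" only shop at a big-box store,
# while "c" also visits a shop nobody else uses.
toy = [
    ("a", 1, "BigBox"), ("a", 2, "BigBox"),
    ("b", 1, "BigBox"), ("b", 2, "BigBox"),
    ("c", 1, "Boutique"), ("c", 2, "BigBox"),
]

print(unicity(toy, p=2))
```

Here only `c`, the boutique shopper, can be singled out from two points, so the score is one in three; `a` and `b` hide behind each other at the big-box store.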
For greater privacy, it may be better to shop at big-box stores such as, say, Target or Home Depot. Oh, wait, never mind.
