John Timmer reports in Ars Technica:
Adblocking has set off a software arms race, with publishers finding software solutions that keep ads appearing or entreat people using adblocking software to white-list them. Adblockers readily respond with modified software that targets these specific responses, triggering the publishers to try again. Outside of the economics, there's a computer science problem. Code is attempting to identify software present on a user's browser. How do you recognize when that's happening, and how can you intervene?
Advertising pays much of the budget for most online publishers, making the growth of adblockers an existential threat. As such, adblocking has set off a software-based arms race, with publishers finding software solutions that keep ads appearing or entreat people using adblocking software to white-list them. Adblockers readily respond with modified software that targets these specific responses, triggering the publishers to try again.
Some academics have recently stepped into the middle of this arms race, performing an analysis that allows them to identify the specific methods used by publishers to avoid having ads blocked. And the team has gone on to try a couple of different approaches, both of which modify a webpage's contents to keep the anti-adblocking software from having an effect.
Outside of the economics of it all, there's an interesting computer science problem here. The code on the webpage is attempting to identify software present on a user's browser. How do you recognize when that's happening, and how can you possibly intervene?
The adblocking wars
The approach the researchers took involved following code execution as a browser loaded and displayed the page. This was done with a modified version of Google's V8 JavaScript engine, one that allowed them to extract information about the downloaded code that was being processed and executed as the webpage loaded. By doing this with and without an ad blocker installed, they were able to identify differences in the code that was executed when ads were displayed or blocked.
As they note, typical anti-adblocking code might wait for the page to load and then check on the size of an element that's meant to contain an ad. If the ad isn't loaded, this area will never get defined, and its size will end up either being undefined or zero. This allows the code to perform some other action, like putting up an alternative ad or displaying a dialog to ask for the adblocking software to be disabled.
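To make that check concrete, here is a minimal sketch of the kind of test described above. It is not code from any particular site: the `AdSlot` shape and the names `adLooksBlocked` and `respond` are assumptions made for illustration, standing in for a real DOM element and its `offsetHeight`/`offsetWidth` properties.

```typescript
// Hypothetical stand-in for the DOM element meant to hold the ad.
// In a browser this would be an HTMLElement queried after page load.
interface AdSlot {
  offsetHeight?: number;
  offsetWidth?: number;
}

// The slot counts as blocked when its size is undefined or zero.
function adLooksBlocked(slot: AdSlot): boolean {
  const height = slot.offsetHeight ?? 0;
  const width = slot.offsetWidth ?? 0;
  return height === 0 || width === 0;
}

// The page's reaction: do nothing, or fall back to some other action,
// such as asking the reader to disable the adblocker.
function respond(slot: AdSlot): string {
  return adLooksBlocked(slot) ? "show-whitelist-dialog" : "no-action";
}
```

A loaded 300x250 ad would yield "no-action", while an empty object (an ad area that never got defined) would trigger the dialog.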
By following code traces, the authors could look for conditional tests—things like "is the size of this element 0?"—followed by execution of different code depending on whether an adblocker is present. By examining the code at that location, they could determine which condition was being tested for.
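The comparison of traces can be sketched roughly as follows. This is a simplified illustration, not the researchers' tooling: the `BranchEvent` record and the `divergentBranches` function are hypothetical, and it assumes each conditional test can be identified by a stable location string.

```typescript
// One recorded conditional test from an execution trace (hypothetical format).
interface BranchEvent {
  location: string;  // assumed identifier such as "script.js:42"
  condition: string; // source text of the tested condition
  taken: boolean;    // which way the test went in this run
}

// Return the tests whose outcome differs between a run with ads displayed
// and a run with an adblocker installed.
function divergentBranches(
  withAds: BranchEvent[],
  withBlocker: BranchEvent[]
): BranchEvent[] {
  // Prototype-free object used as a simple location -> outcome lookup.
  const baseline: Record<string, boolean | undefined> = Object.create(null);
  for (const e of withAds) {
    baseline[e.location] = e.taken;
  }
  return withBlocker.filter((e) => {
    const before = baseline[e.location];
    return before !== undefined && before !== e.taken;
  });
}
```

Any event this returns is a candidate for an adblocker check, and its `condition` text shows what was being tested.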
On its own, this provided an indication of just how prevalent anti-adblocking software is. The authors claim to have found an anti-adblocking response on more than 30 percent of the Alexa Top-10,000 websites, but it's somewhat more complicated than that. In many cases, adblocking software was detected, but there was no visible response; the software simply logged the presence of the adblocker, often through Google Analytics.
Setting the software loose on webpages that normally don't show ads indicated it didn't produce any false-positive identifications. And a test of more than 400 sites known to use anti-adblocking software showed that it was more than 85 percent accurate at identifying them.
The false negatives came about for a variety of reasons. One of these is simply that JavaScript has a variety of mechanisms by which programmers can test for specific conditions, and the team didn't trigger their analysis on all of them. The second is just random variability; each page was loaded six times, three with adblocking and three without. Random differences among these, like slower or faster loading of some page components, could obscure the tests for the presence of anti-adblockers. There was at least one approach that the software missed entirely: it loaded a warning message about adblocking, then tried to load an ad on top of it; if the ad was blocked, the warning showed.
Intervention
With that success in hand, the authors decided to enter the arms race on the side of the adblockers. Since they knew what condition was being tested to determine whether an adblocker was being used, they could intervene in the page's JavaScript in a way that forced it to execute the adblocker-free branch of the code. This is relatively simple to do on the code side: the JavaScript is rewritten so all the relevant branches do the same thing. Rewriting, however, required the installation of specially modified proxy software on the same computer and redirecting all the browser's requests so they went through this software.
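The idea of making all the relevant branches do the same thing can be illustrated on a toy statement tree. This is only a sketch under stated assumptions: the `Stmt` type and the `neutralize` function are invented here, and a real rewriter would have to work on full JavaScript source inside the proxy rather than on this simplified structure.

```typescript
// Toy statement tree (hypothetical); real code would use a JavaScript AST.
type Stmt =
  | { kind: "call"; name: string }
  | { kind: "if"; test: string; then: Stmt[]; otherwise: Stmt[] };

// For each `if` whose test is a known detection condition, make both
// branches identical to the one that runs when no adblocker is present.
// `blockedWhenTrue` says which outcome of the test signals an adblocker.
function neutralize(
  stmts: Stmt[],
  detectionTest: string,
  blockedWhenTrue: boolean
): Stmt[] {
  return stmts.map((s): Stmt => {
    if (s.kind !== "if") return s;
    // Rewrite nested conditionals first.
    const then = neutralize(s.then, detectionTest, blockedWhenTrue);
    const otherwise = neutralize(s.otherwise, detectionTest, blockedWhenTrue);
    if (s.test !== detectionTest) {
      return { kind: "if", test: s.test, then, otherwise };
    }
    const clean = blockedWhenTrue ? otherwise : then;
    return { kind: "if", test: s.test, then: clean, otherwise: clean };
  });
}
```

After rewriting, the test still runs, but its result no longer matters: whichever way it goes, the page executes the adblocker-free code.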
This approach had a success rate of more than 80 percent on the websites it was tested with. And, despite the potentially significant mangling of the underlying code, only one site showed a visual defect.
An alternative approach they tried was somewhat more precise. Since they could identify the condition that was being tested for, they could modify the variables used by the site so that the condition would always evaluate as if an adblocker was not present. This only requires a browser extension. And, in the 15 websites it was tested on, it worked every time.
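One way such a variable override could look in practice is sketched below. It is an assumption about the general technique, not the researchers' extension: `pinProperty`, `pageState`, `adsBlocked`, and `detectionScript` are all hypothetical names, and the sketch uses the standard `Object.defineProperty` call to pin a value the page's own script later tries to change.

```typescript
// Pin a property so reads always return `value` and later writes are ignored.
function pinProperty(target: object, key: string, value: unknown): void {
  Object.defineProperty(target, key, {
    get: () => value,
    set: () => {
      // Silently drop the page's attempts to overwrite the value.
    },
    configurable: false,
    enumerable: true,
  });
}

// Hypothetical page state that a detection script writes to after its check.
const pageState: { adsBlocked?: boolean } = {};

// Stand-in for the site's detection code concluding an adblocker is present.
function detectionScript(state: { adsBlocked?: boolean }): void {
  state.adsBlocked = true;
}

// The extension pins the flag before the site's script runs.
pinProperty(pageState, "adsBlocked", false);
detectionScript(pageState);
```

Because the flag is pinned first, any later code on the page that reads `pageState.adsBlocked` sees the value for "no adblocker", regardless of what the detection script tried to store.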
Motivation?
The authors are very upfront about their motivation for this work: "We want to develop a comprehensive understanding of anti-adblockers, with the ultimate aim of enabling adblockers to be resistant against anti-adblockers." They cite user privacy and security as the reason for choosing a side in the arms race, but it's not clear that their approach makes much sense in this regard. Running everything through a modified proxy or manipulating page-wide variables would seem to create a whole host of privacy and security risks on their own. In addition, it's not clear how blocking the mere logging of the existence of an adblocker, which their software would do, helps anyone.
And they admit that, as soon as publishers are aware of the methods they use to test for anti-adblocking software, workarounds will be possible. This could be as simple as finding a means of searching for an adblocker that won't be picked up by the researchers' approach. Or it could involve intermingling the code for the adblocking test with code that's essential for the page to work. Or it could involve re-using the variable that's manipulated by the researchers' software. Any of these, and presumably other approaches, would work.
Finally, the researchers seem to be actively avoiding considering the consequences. Part of their introduction states flatly that "Adblocking results in billions of dollars' worth of lost advertising revenue for online publishers." And their own analysis confirms that the majority of the sites running anti-adblocking software are producing news. If they're aware that the success of their goals will involve crippling a lot of news sources, it's not apparent from this paper.