A Blog by Jonathan Low


Oct 28, 2023

How Data "Poison Pills" Disable AI That Scrapes Content Without Consent

This new tool can help content creators protect copyrighted and other material from unpaid use by tech companies in training AI models.

The goal is not to stop the development of AI but to force tech companies to pay for the content they use rather than retain historically unprecedented profits for themselves. JL

Benj Edwards reports in Ars Technica:

The open source "poison pill" tool alters images in ways invisible to the human eye that can corrupt an AI model's training process. Many image synthesis models, with the notable exceptions of those from Adobe and Getty Images, largely use data sets of images scraped from the web without artist permission, including copyrighted material. The goal is to help artists and publishers protect their work from being used to train generative AI image synthesis models. "The point of this tool is to balance the playing field between model trainers and content creators," says University of Chicago professor Ben Y. Zhao.

On Friday, a team of researchers at the University of Chicago released a research paper outlining "Nightshade," a data poisoning technique aimed at disrupting the training process for AI models, as reported by MIT Technology Review and VentureBeat. The goal is to help visual artists and publishers protect their work from being used to train generative AI image synthesis models such as Midjourney, DALL-E 3, and Stable Diffusion.


The open source "poison pill" tool (as the University of Chicago's press department calls it) alters images in ways invisible to the human eye that can corrupt an AI model's training process. Many image synthesis models, with the notable exceptions of those from Adobe and Getty Images, largely use data sets of images scraped from the web without artist permission, including copyrighted material. (OpenAI licenses some of its DALL-E training images from Shutterstock.) AI researchers' reliance on commandeered data scraped from the web, which many see as ethically fraught, has also been key to the recent explosion in generative AI capability.


It took an entire Internet of images with annotations (captions, alt text, and metadata) created by millions of people to build a data set with enough variety to train a model like Stable Diffusion, for example. Hiring people to annotate hundreds of millions of images would be impractical in terms of both cost and time. Those with access to existing large image databases (such as Getty and Shutterstock) are at an advantage when using licensed training data.


Along those lines, some research institutions, like the University of California, Berkeley Library, have argued for preserving data scraping as fair use in AI training for research and education purposes. US courts have not yet definitively ruled on the practice, and regulators are currently seeking comment on potential legislation that might affect it one way or the other. But as the Nightshade team sees it, research use and commercial use are two entirely different things, and they hope their technology can force AI training companies to license image data sets, respect crawler restrictions, and conform to opt-out requests.

"The point of this tool is to balance the playing field between model trainers and content creators," co-author and University of Chicago professor Ben Y. Zhao said in a statement. "Right now model trainers have 100 percent of the power. The only tools that can slow down crawlers are opt-out lists and do-not-crawl directives, all of which are optional and rely on the conscience of AI companies, and of course none of it is verifiable or enforceable and companies can say one thing and do another with impunity. This tool would be the first to allow content owners to push back in a meaningful way against unauthorized model training."

Shawn Shan, Wenxin Ding, Josephine Passananti, Haitao Zheng, and Zhao developed Nightshade as part of the Department of Computer Science at the University of Chicago. The new tool builds upon the team's prior work with Glaze, another tool designed to alter digital artwork in a manner that confuses AI. While Glaze is oriented toward obfuscating the style of the artwork, Nightshade goes a step further by corrupting the training data. Essentially, it tricks AI models into misidentifying objects within the images.


For example, in tests, researchers used the tool to alter images of dogs in a way that led an AI model to generate a cat when prompted to produce a dog. To do this, Nightshade takes an image of the intended concept (e.g., an actual image of a "dog") and subtly modifies the image so that it retains its original appearance but is influenced in latent (encoded) space by an entirely different concept (e.g., "cat"). This way, to a human or simple automated check, the image and the text seem aligned. But in the model's latent space, the image has characteristics of both the original and the poison concept, which leads the model astray when trained on the data.
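To make that mechanism concrete, here is a minimal, hypothetical sketch of the idea rather than the researchers' actual implementation: an optimizer nudges the pixels of a "dog" photo within a tiny, invisible budget until a frozen image encoder maps it close to the latent representation of a "cat" image. The encoder, the poison_image function, and every parameter below are illustrative assumptions, not details from the paper.

import torch

def poison_image(image, target_image, encoder, eps=8 / 255, steps=200, lr=0.01):
    # image, target_image: float tensors in [0, 1] with shape (1, 3, H, W)
    # encoder: an assumed frozen, differentiable model mapping images to latent vectors
    # eps: maximum per-pixel change (L-infinity bound) so the edit stays invisible
    with torch.no_grad():
        target_latent = encoder(target_image)  # latent of the poison concept ("cat")

    delta = torch.zeros_like(image, requires_grad=True)  # perturbation to optimize
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        poisoned = (image + delta).clamp(0, 1)
        # pull the poisoned image's latent toward the target concept
        loss = torch.nn.functional.mse_loss(encoder(poisoned), target_latent)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the change inside the invisible budget

    return (image + delta).clamp(0, 1).detach()

To a human the output still looks like the original dog photo, but in the model's feature space it sits near "cat," which is the property that leads training astray.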


Researchers tested the tool using Stable Diffusion, an open source text-to-image generation model, and found that after the model ingested 50 poisoned images, it began generating dogs with distorted features, poisoning the entire concept of "dogs" within the model. After 100 samples, it produced cats instead of dogs, and by 300 samples, the cat images were near perfect. As VentureBeat notes, because of the way generative AI models cluster similar concepts into "embeddings," Nightshade was also able to trick the model into generating a cat when prompted with related words like "husky," "puppy," and "wolf."
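That bleed-through to related prompts follows from how these models organize language: related words sit near one another in embedding space. As a rough illustration (an assumed check, not an experiment from the paper), cosine similarities between off-the-shelf CLIP text embeddings show "dog," "husky," "puppy," and "wolf" clustering together far more tightly than with an unrelated word such as "bicycle":

import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a photo of a dog", "a photo of a husky", "a photo of a puppy",
           "a photo of a wolf", "a photo of a bicycle"]
inputs = tokenizer(prompts, padding=True, return_tensors="pt")

with torch.no_grad():
    emb = model.get_text_features(**inputs)    # one text embedding per prompt
emb = emb / emb.norm(dim=-1, keepdim=True)     # normalize for cosine similarity

print(emb @ emb.T)  # dog-like prompts score close to one another, far from "bicycle"

Because poisoned "dog" images shift that whole neighborhood of concepts during training, nearby prompts inherit the corruption.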

Defending against Nightshade's data poisoning technique may pose a challenge for AI developers. The altered pixels are not easily detectable by the human eye and may be difficult even for automated data-scraping tools to identify. Any poisoned images that have already been used for training would need to be detected and removed, and the compromised AI models would likely have to be retrained.


While the University of Chicago researchers acknowledge that their tool could be used maliciously, they argue their main goal is to tip the power balance back toward artists. The Glaze project team elaborated on Nightshade’s goals in a thread on the social platform X, emphasizing the "power asymmetry between AI companies and content owners" and labeling it "ridiculous." Additionally, Zhao wrote on his account, "Nightshade’s purpose is not to break models. It’s to disincentivize unauthorized data training and encourage legit licensed content for training. For models that obey opt outs and do not scrape, there is minimal or zero impact."

The development of technologies like Nightshade could ignite an arms race between researchers intent on protecting human creative works from AI absorption and those who seek to feed their data-hungry models. Larger companies with more resources may eventually be able to work around Nightshade with countermeasures, but smaller firms and open source projects with lower budgets might be disproportionately affected.


Still, as far as some artists are concerned, no one should ever use their work to train AI models without their permission. As news of Nightshade broke on Monday, a few outspoken artists began publicly gloating on social media about the new tool's potential. An illustrator named Katria Raden posted on X, "AI bros have heard the news and they're upset... what was it they said to us? Adapt or die? You can't stop technological progress?" which started a thread of similar opinions, with some pro-AI accounts pushing back.

It's worth noting that companies and researchers scraped unpoisoned artwork long ago, so it's too late to affect what they have already downloaded to their machines (or existing AI image generators). Going forward, however, the Nightshade technique could affect how models absorb new artistic styles and photography of current events, if it goes into widespread use and isn't quickly defeated by a new countermeasure.
