Kyle McDonald reports in Medium:
It can be difficult for a GAN to manage long-distance dependencies in images. While paired accessories like earrings usually match in the dataset, they don’t in the generated images. Eyes tend to point in the same direction and they are usually the same color, but the generated images are cross-eyed and heterochromatic. Asymmetry is visible in ears being at very mismatched heights or sizes. A GAN will stretch or shrink each tooth in unusual ways. Hair styles have a lot of variability, but also a lot of detail, making hair one of the most difficult things for a GAN to capture.
In 2014, machine learning researcher Ian Goodfellow introduced the idea of generative adversarial networks, or GANs. “Generative” because they output things like images rather than predictions about input (like “hotdog or not”); “adversarial networks” because they use two neural networks competing with each other in a “cat-and-mouse game”, like a cashier and a counterfeiter: one trying to fool the other into thinking it can generate real examples, the other trying to distinguish real from fake.

The first GAN images were easy for humans to identify. Consider these faces from 2014.

But the latest examples of GAN-generated faces, published in October 2017, are more difficult to identify.

Here are some things you can look for when trying to recognize an image produced by a GAN. We’ll focus on faces because they are a common testing ground for researchers, and many of the artifacts most visible in faces also appear in other kinds of images.

Straight hair looks like paint
It’s common for long hair to take on this hyper-straight look, where a small patch seems good but a long strand looks like someone smudged a bunch of acrylic with a palette knife or a huge brush.

Text is indecipherable
GANs trained on faces have a hard time capturing rare things in the background with lots of structure. Also, GANs are shown both original and mirrored versions of the training data, which means they have trouble modeling writing because it typically only appears in one orientation.

Background is surreal
One reason the faces from a GAN look believable is that all the training data has been centered. This means that there is less variability for the GAN to model when it comes to, for example, the placement and rendering of eyes and ears. The background, on the other hand, can contain anything. This is too much for the GAN to model, and it ends up replicating general background-like textures rather than “real” background scenes.

Asymmetry
It can be difficult for a GAN to manage long-distance dependencies in images. While paired accessories like earrings usually match in the dataset, they don’t in the generated images. Or: eyes tend to point in the same direction and they are usually the same color, but the generated images are very frequently cross-eyed and heterochromatic. Asymmetry is also commonly visible in ears being at very mismatched heights or sizes.

Weird teeth
GANs can assemble a general scene, but currently have difficulty with semi-regular repeating details like teeth. Sometimes a GAN will generate misaligned teeth, or it will stretch or shrink each tooth in unusual ways. Historically this problem has shown up in other domains like texture synthesis with images like bricks.

Messy hair
This is one of the quickest ways to identify a GAN-generated image. Typically a GAN will bunch hair in clumps, create random wisps around the shoulders, and throw thick stray hairs on foreheads. Hair styles have a lot of variability, but also a lot of detail, making it one of the most difficult things for a GAN to capture. Things that aren’t hair can sometimes turn into hair-like textures, too.

Non-stereotypical gender presentation
This GAN was trained on a subset of CelebA, which contains 200k images of 10k celebrity faces. In this dataset, I haven’t seen an example of someone with facial hair, earrings, and makeup; but the GAN regularly mixes different attributes from stereotypical gender presentations. More generally, I think this is because GANs don’t always learn the same categories or binaries that humans socially reinforce (in this case “male vs female”).

Semi-regular noise
Some areas that are otherwise monochrome may exhibit semi-regular noise with horizontal or vertical banding. In the cases above, this is probably the network trying to imitate the texture of cloth. Older GANs have a much more prominent noise pattern that is usually described as checkerboard artifacts.

Iridescent color bleed
Some areas with lighter solid colors have a multi-hued cast, including collars, necks, and eye whites (not shown).

Examples of real images
Check out that clear background text, those matching earrings, those equally sized teeth, detailed hairstyles.
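
The earlier point about mirrored training data and text can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the pipeline used for the GAN discussed here: the 5x5 bitmaps are made-up toy data, and the helper names `mirror` and `mirror_difference` are hypothetical. It shows that a left-right symmetric pattern looks the same after mirroring, while a letter does not, so a model that sees both versions receives two conflicting targets for the same glyph.

```python
import numpy as np


def mirror(image: np.ndarray) -> np.ndarray:
    """Return the horizontally mirrored copy of an (H, W) image."""
    return image[:, ::-1]


def mirror_difference(image: np.ndarray) -> float:
    """Fraction of pixels that change when the image is mirrored."""
    return float(np.mean(image != mirror(image)))


# Toy 5x5 bitmaps (hypothetical data, not taken from any dataset):
# a left-right symmetric, face-like pattern ...
symmetric_pattern = np.array([
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
])

# ... and the letter "F", which has a definite orientation.
letter_f = np.array([
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
])

# Mirroring leaves the symmetric pattern untouched but changes the letter.
print("symmetric pattern:", mirror_difference(symmetric_pattern))
print("letter F:", mirror_difference(letter_f))
```

In this toy case the symmetric pattern scores 0.0 and the letter scores well above zero. Centered faces are roughly symmetric, so mirroring costs the model little there, while writing in the background is seen in two incompatible orientations.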