A deepfake is defined as an image, recording, or video that has been altered or manipulated to misrepresent someone as doing or saying something that did not actually occur. These types of media go beyond simple "Photoshops" and are a type of synthetic media generated by artificial intelligence and deep learning systems that manipulate media to create convincing hoaxes. Deepfake often describes both the technology and the resulting content. The word itself is a portmanteau of "deep learning" and "fake."
Deepfakes are often used to transform existing source content, swapping one person for another, but they can also be used to create entirely original content in which someone is depicted doing or saying something they never did.
Deepfakes are made using deep learning techniques. These techniques encode an image's features into a compact representation and then reconstruct the image from those encoded features. The most commonly used deep learning architecture for creating deepfakes is the autoencoder, while another common method uses a generative adversarial network, or GAN.
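As a minimal sketch of this encode-then-reconstruct idea (written in PyTorch; the 64x64 input size, layer widths, and latent dimension are illustrative assumptions, not values from any particular tool):

```python
# Minimal autoencoder sketch: compress a face image into a small
# latent vector, then reconstruct the image from that vector.
# All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, latent_dim=256):
        super().__init__()
        # Encoder: 3x64x64 image -> latent feature vector
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 64 * 64, 1024),
            nn.ReLU(),
            nn.Linear(1024, latent_dim),
        )
        # Decoder: latent vector -> reconstructed 3x64x64 image
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, 3 * 64 * 64),
            nn.Sigmoid(),  # pixel values in [0, 1]
            nn.Unflatten(1, (3, 64, 64)),
        )

    def forward(self, x):
        latent = self.encoder(x)     # encode features
        return self.decoder(latent)  # reconstruct the image
```

Training minimizes the difference between the input image and its reconstruction, which forces the latent vector to capture the face's most salient features.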
Regardless of method, creating a deepfake has become incredibly easy, with smartphone applications capable of creating near-real-time deepfakes with decent accuracy. More advanced deepfakes, however, require a capable local computer, and the best reproductions tend to be produced using GPUs. More of these tools are also being offered through cloud computing services, which can take longer than developing a deepfake on a local computer but can be less costly to the user, depending on the use case.
Using an autoencoder, producing a deepfake is relatively straightforward. The encoder compresses a given face into a smaller, feature-based representation; this information-rich representation is referred to as a latent face, and it captures features such as nose shape, skin tone, and eye color. A decoder then transforms a latent face back into an image. To create the deepfake, the decoder trained on person A is applied to the latent face of person B, producing person A's face with person B's expression and pose. During training, images of both people are passed through the same encoder, so it learns the features the two faces share and can generate a more realistic deepfake.
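A common formulation of this scheme uses one shared encoder with a separate decoder per person. The sketch below (again PyTorch, with architectures, optimizer settings, and helper names such as train_step and swap_face chosen for illustration) shows the training objective and the decoder swap at inference:

```python
# Deepfake autoencoder sketch: one shared encoder, one decoder per
# person. Architectures and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim = 256

def make_encoder():
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(3 * 64 * 64, 1024), nn.ReLU(),
        nn.Linear(1024, latent_dim),
    )

def make_decoder():
    return nn.Sequential(
        nn.Linear(latent_dim, 1024), nn.ReLU(),
        nn.Linear(1024, 3 * 64 * 64), nn.Sigmoid(),
        nn.Unflatten(1, (3, 64, 64)),
    )

encoder = make_encoder()    # shared between both identities
decoder_a = make_decoder()  # reconstructs person A's face
decoder_b = make_decoder()  # reconstructs person B's face
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(
    list(encoder.parameters())
    + list(decoder_a.parameters())
    + list(decoder_b.parameters()),
    lr=1e-4,
)

def train_step(faces_a, faces_b):
    # Each decoder learns to rebuild its own person from the shared
    # latent space, so the encoder learns features common to both.
    loss = loss_fn(decoder_a(encoder(faces_a)), faces_a) \
         + loss_fn(decoder_b(encoder(faces_b)), faces_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def swap_face(frame_b):
    # Inference: encode person B's frame, decode with person A's
    # decoder, yielding A's face with B's expression and pose.
    with torch.no_grad():
        return decoder_a(encoder(frame_b))
```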
Autoencoders tend to require fewer computing resources than GANs and are often used in various "face swap" applications. For example, the FaceSwap app uses face alignment, Gauss-Newton optimization, and image blending to swap the face of the person seen in the camera with the face of a person in a provided picture. The FaceSwap approach is based on two autoencoders with a shared encoder, trained to reconstruct training images of the source and target faces. The autoencoder output is then blended with the rest of the image using Poisson image editing to create the deepfaked image.
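The final blending step can be illustrated with OpenCV's seamlessClone, which implements Poisson image editing. The file names and the simple full-image mask below are assumptions for illustration; a real face-swap pipeline would derive a face-shaped mask from landmark detection:

```python
# Poisson blending sketch using OpenCV's seamlessClone.
# File paths and the full-image mask are illustrative assumptions.
import cv2
import numpy as np

generated_face = cv2.imread("decoder_output.png")  # autoencoder output
target_frame = cv2.imread("target_frame.png")      # original video frame

# Mask selecting which pixels of the source image to blend in.
mask = 255 * np.ones(generated_face.shape, generated_face.dtype)

# Where to place the source in the destination frame.
center = (target_frame.shape[1] // 2, target_frame.shape[0] // 2)

# Poisson editing matches gradients across the seam so the pasted
# face takes on the lighting and tone of the surrounding frame.
blended = cv2.seamlessClone(generated_face, target_frame, mask,
                            center, cv2.NORMAL_CLONE)
cv2.imwrite("blended_frame.png", blended)
```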
Generative adversarial networks can be used to create a wide variety of synthetic images, some of which are deepfakes. GANs are often used to create artificial images for testing other AI networks, but they have also been used to create deepfakes. A GAN is given a training set and, from this training set, learns to generate new data with the same characteristics; that generated data is often what is considered the deepfake.
This allows a GAN to take a person in an existing image or video and replace them with another person's likeness. GANs pit a generator against a discriminator: the generator produces synthetic inputs (the deepfakes), while the discriminator tries to distinguish them from real samples, and each network improves through this competition. This adversarial training allows GANs to generate better deepfakes than autoencoders tend to produce, but it requires far more computing power.
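A minimal GAN training loop might look like the following sketch (PyTorch; the fully connected networks, 64x64 image size, and hyperparameters are illustrative assumptions, and production deepfake GANs use much larger convolutional architectures):

```python
# Minimal GAN sketch: a generator learns to fool a discriminator.
# Network sizes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

img_dim, noise_dim = 3 * 64 * 64, 100

# Generator: random noise -> fake image (the candidate deepfake).
generator = nn.Sequential(
    nn.Linear(noise_dim, 1024), nn.ReLU(),
    nn.Linear(1024, img_dim), nn.Tanh(),  # pixels in [-1, 1]
)
# Discriminator: image -> estimated probability that it is real.
discriminator = nn.Sequential(
    nn.Linear(img_dim, 1024), nn.LeakyReLU(0.2),
    nn.Linear(1024, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_images):
    # real_images: (batch, img_dim), normalized to [-1, 1]
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)
    fakes = generator(torch.randn(batch, noise_dim))

    # 1) Train the discriminator to separate real from generated.
    d_loss = bce(discriminator(real_images), real_labels) \
           + bce(discriminator(fakes.detach()), fake_labels)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Train the generator to make the discriminator call its
    #    output real.
    g_loss = bce(discriminator(fakes), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```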
With the wide availability of deepfake generation tools, it is important for individuals to understand how to spot a deepfake. Various companies have encouraged communities to analyze and understand what can give a deepfake away, especially companies that rely on social media, where the majority of deepfake videos surface. This research, along with awareness campaigns, has produced a variety of signs that can help individuals uncover a deepfake:
- Unnatural face or environment: Deepfake images or sections of videos can have unnatural facial expressions or facial feature placement. The environment can be similarly unnatural or unrealistic, such as having lighting that is inconsistent.
- Unnatural behavior: Deepfake videos require continuity between frames, which can be difficult to achieve and can lead to unnatural behaviors, such as uneven blinking or choppy motion (see the blink-rate sketch after this list).
- Image artifacts: Deepfake images can contain strange artifacts, such as blurriness around the area where a face has been manipulated or placed on another body, along with other inconsistencies.
- Audio: When audio is combined with a deepfake video, the lip movements may not match what the audio suggests or leads the viewer to expect.
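Some of these cues can even be checked automatically. The sketch below estimates blink rate using the eye aspect ratio (EAR) computed from dlib's 68-point facial landmarks; the 0.2 EAR threshold, the video path, and the pretrained model file are illustrative assumptions:

```python
# Blink-counting sketch via the eye aspect ratio (EAR): EAR drops
# toward zero when an eye closes, so dips below a threshold mark
# blinks. Unnaturally rare or uneven blinking can be a deepfake tell.
import cv2
import dlib
from scipy.spatial.distance import euclidean

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def eye_aspect_ratio(pts):
    # pts: six landmarks outlining one eye
    vertical = euclidean(pts[1], pts[5]) + euclidean(pts[2], pts[4])
    horizontal = euclidean(pts[0], pts[3])
    return vertical / (2.0 * horizontal)

blinks, closed = 0, False
cap = cv2.VideoCapture("suspect_video.mp4")  # hypothetical input
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        # Landmarks 36-41 and 42-47 outline the two eyes.
        left = [(shape.part(i).x, shape.part(i).y) for i in range(36, 42)]
        right = [(shape.part(i).x, shape.part(i).y) for i in range(42, 48)]
        ear = (eye_aspect_ratio(left) + eye_aspect_ratio(right)) / 2.0
        if ear < 0.2 and not closed:  # eye just closed
            blinks, closed = blinks + 1, True
        elif ear >= 0.2:
            closed = False
cap.release()
print(f"Detected {blinks} blinks")  # humans blink roughly 15-20 times/min
```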
Deepfakes have been used for comic purposes and in major movies, whether to keep a deceased actor in a given role or to de-age an older actor. However, in 2019, researchers found that a staggering 96 percent of deepfake videos shared online were pornographic, with almost all of those videos (an estimated 99 percent) mapping the faces of celebrities onto porn stars.
The use of deepfakes for porn, especially revenge porn, has repeatedly made the news and attracted widespread attention, but deepfakes have also appeared in other areas. A notable example is a 2018 video in which Donald Trump appeared to give a speech calling on Belgium to withdraw from the Paris Climate Agreement. The speech was never given, and the deepfake had political and international ramifications.
Other concerns around deepfakes include their use to generate fake evidence for criminal trials that could be used against people in court; to manipulate the stock market through faked footage of influential people making market-moving statements; to spread false statements about a business in order to destabilize and degrade it; and to enable online bullying, with manipulated media allowing bullies to generate content that humiliates individuals (revenge porn already falls into this category).
There are also potential positive use cases for the technology. For example, journalists and human rights groups have used it to hide the identities of at-risk individuals, as in the 2020 HBO documentary Welcome to Chechnya, which used deepfake technology to conceal the identities of Russian refugees whose lives were at risk while telling their stories. Another example is WITNESS, an organization focused on using media to defend human rights, which has explored using the technology to protect people such as activists, to support advocacy work, and for political satire.
Deepfake technology has also been used for various rather innocuous applications. These include "face swap" apps, which let people conversing through an app swap faces with each other, and face-aging apps, which let users see what they might look like when they are older. These humorous applications tend to be relatively harmless.