Patent attributes
In one aspect, an example method includes (i) estimating, using a skeletal detection model, a pose of an original actor for each of multiple frames of a video; (ii) obtaining, for each of a plurality of the estimated poses, a respective image of a replacement actor; (iii) obtaining replacement speech in the replacement actor's voice that corresponds to speech of the original actor in the video; (iv) generating, using the estimated poses, the images of the replacement actor, and the replacement speech, synthetic frames corresponding to the multiple frames of the video that depict the replacement actor in place of the original actor, with the synthetic frames including facial expressions for the replacement actor that temporally align with the replacement speech; and (v) combining the synthetic frames and the replacement speech so as to obtain a synthetic video that replaces the original actor with the replacement actor.
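
The data flow between the five steps of the example method can be illustrated with a minimal Python sketch. It is not the claimed implementation: every name in it (`replace_actor`, `estimate_pose`, `image_for_pose`, `synthesize_frame`, `SyntheticVideo`) is a hypothetical placeholder, the models are passed in as plain callables, and frames, poses, images, and audio are stood in for by strings. The sketch also assumes, for simplicity, one replacement-actor image per estimated pose, whereas the method only requires images for a plurality of the poses.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical stand-in types; a real system would use image arrays,
# keypoint sets, and audio buffers instead of strings.
Frame = str
Pose = str
Image = str
Audio = str


@dataclass
class SyntheticVideo:
    """Synthetic frames paired with the replacement speech track."""
    frames: List[Frame]
    audio: Audio


def replace_actor(
    frames: Sequence[Frame],
    estimate_pose: Callable[[Frame], Pose],
    image_for_pose: Callable[[Pose], Image],
    replacement_speech: Audio,
    synthesize_frame: Callable[[Pose, Image, Audio, int], Frame],
) -> SyntheticVideo:
    # Step 1: estimate the original actor's pose in each frame
    # (the skeletal detection model is abstracted as estimate_pose).
    poses = [estimate_pose(frame) for frame in frames]

    # Step 2: obtain a replacement-actor image for each estimated pose.
    images = [image_for_pose(pose) for pose in poses]

    # Step 3: the replacement speech is assumed to be obtained upstream
    # and is passed in as replacement_speech.

    # Step 4: generate one synthetic frame per original frame. The frame
    # index is passed so the generator can pick the facial expression
    # that lines up in time with the replacement speech.
    synthetic = [
        synthesize_frame(pose, image, replacement_speech, index)
        for index, (pose, image) in enumerate(zip(poses, images))
    ]

    # Step 5: combine the synthetic frames with the replacement speech.
    return SyntheticVideo(frames=synthetic, audio=replacement_speech)
```

Passing the models as callables keeps the sketch independent of any particular pose estimator or frame generator; only the ordering and inputs of the steps are taken from the description above.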