Patent attributes
A computer-implemented method of processing target footage of a target human face includes training an encoder-decoder network comprising an encoder network, a first decoder network, and a second decoder network. The training includes training a first path through the encoder-decoder network including the encoder network and the first decoder network to reconstruct the target footage of the target human face, and training a second path through the encoder-decoder network including the encoder network and the second decoder network to process renders of a synthetic face model exhibiting a range of poses and expressions to determine parameter values for the synthetic face model corresponding to the range of poses and expressions. The method includes processing, using a trained network path comprising or trained using the encoder network and comprising the first decoder network, source data representing the synthetic face model exhibiting a source sequence of expressions, to generate output video data.
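The shared-encoder arrangement described above can be illustrated with a minimal sketch. This is not the patented implementation: single linear layers stand in for the encoder and the two decoders, `render` is a hypothetical toy renderer, and every name, dimension, and the learning rate is an assumption chosen for illustration only.

```python
import random


class Linear:
    """Tiny dense layer y = W x, standing in for a real network (assumption)."""

    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)]

    def forward(self, x):
        return [sum(wij * xj for wij, xj in zip(row, x)) for row in self.w]

    def backward(self, x, grad_out, lr):
        """Apply one SGD step and return the gradient with respect to x."""
        # Compute the input gradient before the weights are modified.
        grad_in = [sum(self.w[i][j] * grad_out[i] for i in range(len(self.w)))
                   for j in range(len(x))]
        for i, row in enumerate(self.w):
            for j in range(len(row)):
                row[j] -= lr * grad_out[i] * x[j]
        return grad_in


def train_step(encoder, decoder, x, target, lr=0.02):
    """One squared-error SGD step along the path encoder -> decoder."""
    z = encoder.forward(x)
    y = decoder.forward(z)
    loss = sum((yi - ti) ** 2 for yi, ti in zip(y, target))
    grad_y = [2.0 * (yi - ti) for yi, ti in zip(y, target)]
    grad_z = decoder.backward(z, grad_y, lr)
    encoder.backward(x, grad_z, lr)
    return loss


def render(params):
    """Hypothetical renderer: face-model parameters -> flattened 'image'."""
    p, q = params
    return [p, q, p + q, p - q]


rng = random.Random(0)
FRAME, LATENT, PARAMS = 4, 3, 2  # illustrative sizes only

encoder = Linear(FRAME, LATENT, rng)          # shared by both paths
decoder_target = Linear(LATENT, FRAME, rng)   # first decoder: target footage
decoder_params = Linear(LATENT, PARAMS, rng)  # second decoder: model parameters

# Stand-ins for target footage frames and sampled poses/expressions.
target_frames = [[rng.uniform(-1, 1) for _ in range(FRAME)] for _ in range(8)]
param_samples = [[rng.uniform(-1, 1) for _ in range(PARAMS)] for _ in range(8)]

history = []
for epoch in range(300):
    # First path: reconstruct the target footage.
    loss1 = sum(train_step(encoder, decoder_target, f, f)
                for f in target_frames)
    # Second path: recover parameter values from renders of the synthetic model.
    loss2 = sum(train_step(encoder, decoder_params, render(p), p)
                for p in param_samples)
    history.append((loss1, loss2))

# Inference: renders of a source expression sequence go through the encoder
# and the first decoder to produce output frames.
source_sequence = [[0.1 * t, -0.1 * t] for t in range(5)]
output_frames = [decoder_target.forward(encoder.forward(render(p)))
                 for p in source_sequence]
```

Because both paths update the same `encoder`, renders of the synthetic model and frames of the target footage are pushed into one latent space, which is what allows the first decoder to be applied to encoded renders at inference time. A practical system would use convolutional networks and image data rather than the toy vectors assumed here.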