A predictive model for a target language is provided by: determining, by one or more computer processors, an instance weight for a labeled source language textual unit according to a set of unlabeled target language textual units; scaling, by the one or more computer processors, an error between a predicted label for the source language textual unit and a ground-truth label for the source language textual unit according to the instance weight; updating, by the one or more computer processors, network parameters of a predictive neural network model for the target language according to the scaled error; and providing, by the one or more computer processors, the predictive neural network model for the target language to a user.
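The weighting, scaling, and updating steps can be illustrated with a minimal sketch. The abstract does not say how the instance weight is computed or what the network looks like, so everything concrete here is an assumption made for illustration: textual units are represented as fixed-length feature vectors, the weight is the cosine similarity between a source vector and the centroid of the unlabeled target vectors (clipped at zero), the model is a single logistic unit standing in for the neural network, and the error is binary cross-entropy. The names `instance_weight`, `predict`, and `weighted_update` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def instance_weight(source_vec, target_vecs):
    """Assumed weighting rule: similarity of a labeled source unit to the
    centroid of the unlabeled target units, clipped to be non-negative."""
    dim = len(source_vec)
    centroid = [sum(t[i] for t in target_vecs) / len(target_vecs) for i in range(dim)]
    return max(0.0, cosine(source_vec, centroid))


def predict(params, x):
    """Single logistic unit, used here as a stand-in for the predictive network."""
    z = sum(w * xi for w, xi in zip(params["w"], x)) + params["b"]
    return 1.0 / (1.0 + math.exp(-z))


def weighted_update(params, x, y, weight, lr=0.1):
    """One gradient step on the instance-weighted cross-entropy error.

    Returns the updated parameters and the scaled error for this example.
    """
    eps = 1e-12
    p = min(max(predict(params, x), eps), 1.0 - eps)  # keep log() finite
    # Error between predicted and ground-truth label, scaled by the instance weight.
    error = -weight * (y * math.log(p) + (1 - y) * math.log(1 - p))
    # Gradient of the scaled error with respect to the pre-activation z.
    grad = weight * (p - y)
    new_w = [w - lr * grad * xi for w, xi in zip(params["w"], x)]
    new_b = params["b"] - lr * grad
    return {"w": new_w, "b": new_b}, error


# Usage: source units that resemble the target data move the parameters more.
targets = [[1.0, 0.0], [1.0, 0.0]]        # unlabeled target-language vectors
params = {"w": [0.0, 0.0], "b": 0.0}

w_close = instance_weight([1.0, 0.0], targets)
w_far = instance_weight([0.0, 1.0], targets)

updated, scaled_error = weighted_update(params, [1.0, 0.0], 1, w_close)
unchanged, zero_error = weighted_update(params, [0.0, 1.0], 1, w_far)
```

In this sketch a source unit orthogonal to the target centroid receives zero weight, so it contributes no error and leaves the parameters unchanged, while a unit aligned with the centroid contributes its full error. Other weighting schemes, such as a learned domain discriminator, would fit the same loop.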