One embodiment provides a method that includes receiving adjusted labeled data based on emotional tone factors. Words are analyzed using a tone latent Dirichlet allocation (T-LDA) model that models tone intensity using the emotional tone factors and integrates the adjusted labeled data. Representative words are provided for each emotional tone factor using the T-LDA model. The representative words are obtained by determining posterior probabilities and adjusting the posterior probabilities based on an auxiliary topic.
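The selection of representative words can be illustrated with a minimal sketch. The abstract does not state how the posterior probabilities are adjusted, so the subtraction of the auxiliary topic's posterior used here is an assumption made for illustration only, as are the function name `representative_words`, its parameters, and the toy vocabulary and probabilities.

```python
import numpy as np


def representative_words(phi, topic_prior, vocab, aux_index, top_n=1):
    """Rank words per tone by an auxiliary-adjusted posterior (illustrative).

    phi         : array of shape (topics, words) holding P(word | topic)
    topic_prior : array of shape (topics,) holding P(topic)
    aux_index   : row of phi that holds the auxiliary topic
    """
    phi = np.asarray(phi, dtype=float)
    topic_prior = np.asarray(topic_prior, dtype=float)

    # Bayes' rule: P(topic | word) is proportional to P(word | topic) * P(topic).
    joint = phi * topic_prior[:, None]
    posterior = joint / joint.sum(axis=0, keepdims=True)

    # ASSUMPTION: penalize words that the auxiliary topic also explains by
    # subtracting the auxiliary posterior. The source does not specify this rule.
    adjusted = posterior - posterior[aux_index]

    result = {}
    for k in range(phi.shape[0]):
        if k == aux_index:
            continue
        order = np.argsort(-adjusted[k])[:top_n]
        result[k] = [vocab[i] for i in order]
    return result


# Hypothetical toy data: two tone factors (rows 0 and 1) and one auxiliary topic (row 2).
vocab = ["angry", "sad", "the", "happy"]
phi = [
    [0.60, 0.10, 0.20, 0.10],  # tone factor 0
    [0.05, 0.05, 0.20, 0.70],  # tone factor 1
    [0.10, 0.10, 0.70, 0.10],  # auxiliary topic
]
topic_prior = [1 / 3, 1 / 3, 1 / 3]

top_words = representative_words(phi, topic_prior, vocab, aux_index=2)
```

In this toy setup a common word such as "the" has a high probability under every topic, but its adjusted score is negative because the auxiliary topic accounts for most of its posterior mass, so it is not selected as representative of either tone factor.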