In some cases, lower quality, large scale training data can be automatically generated by automatic labeling. The generated training data can be used to pre-train a machine learning model. For instance, the model can be a model for detection of verbal harassment. Parameters of the pre-trained model can be refined or updated using another one or more higher-quality sets of training data, with which the model can be subsequently trained.