A method and system assist users in generating training sets of labeled data items for machine learning processes. The method and system receive data sets of unlabeled data items from users and present data items from each data set to the users for labeling. The method and system analyze the data items that the users have labeled and, based on that analysis, select which data items to present to the users next.
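
The receive, present, analyze, and select cycle described above can be illustrated with a minimal sketch. The abstract does not say how labeled items are analyzed or how the next item is chosen, so the sketch assumes one common choice: a nearest-centroid model with margin-based uncertainty sampling, where the next item presented is the one whose two closest class centroids are nearly equidistant. All names here (`LabelingSession`, `next_item`, `record_label`, `run_labeling`, `ask_user`) are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Item = Tuple[float, ...]  # assumption: each data item is a numeric feature vector


def _distance(a: Item, b: List[float]) -> float:
    """Euclidean distance between an item and a centroid."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class LabelingSession:
    """Holds the user's unlabeled data set and the labels collected so far."""
    unlabeled: List[Item]
    labeled: List[Tuple[Item, str]] = field(default_factory=list)

    def _centroids(self) -> Dict[str, List[float]]:
        # Analysis step (assumed): average the labeled items of each class.
        groups: Dict[str, List[Item]] = {}
        for item, label in self.labeled:
            groups.setdefault(label, []).append(item)
        return {
            label: [sum(col) / len(items) for col in zip(*items)]
            for label, items in groups.items()
        }

    def next_item(self) -> Optional[Item]:
        """Select the next item to present, based on the labeled items."""
        if not self.unlabeled:
            return None
        centroids = self._centroids()
        if len(centroids) < 2:
            # Too few classes seen to measure ambiguity; take the first item.
            return self.unlabeled[0]

        def margin(item: Item) -> float:
            # Gap between the two nearest centroids; small means ambiguous.
            dists = sorted(_distance(item, c) for c in centroids.values())
            return dists[1] - dists[0]

        return min(self.unlabeled, key=margin)

    def record_label(self, item: Item, label: str) -> None:
        """Move an item from the unlabeled pool into the training set."""
        self.unlabeled.remove(item)
        self.labeled.append((item, label))


def run_labeling(
    session: LabelingSession,
    ask_user: Callable[[Item], str],
    budget: int,
) -> List[Tuple[Item, str]]:
    """Present up to `budget` items to the user and return the training set."""
    for _ in range(budget):
        item = session.next_item()
        if item is None:
            break
        session.record_label(item, ask_user(item))
    return session.labeled
```

In this sketch `ask_user` stands in for whatever interface presents an item and collects the label; in a test it can be a plain function that returns a label for each item. Other selection strategies could replace `margin` without changing the surrounding loop.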