Patent attributes
An online system trains a content selection model on both a selected subset of presented content items and a sampled set of content items. The content selection model is configured to receive a set of features characterizing a user-content item pair and to output a likelihood that the user will interact with the content item. The sampled set may include content items that were not selected for display based on their likelihoods, in addition to those that were selected, and may therefore represent a wider distribution of user-content item pairs than the selected subset. By incorporating both the sampled set and the selected subset in the training process, the online system can reduce bias in the content selection process, so that content items similar to those in the unselected subset are also adequately represented.
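
The training arrangement described above can be illustrated with a minimal sketch. The source does not specify the model family, how the sampled set is drawn, or how the two sets are weighted, so everything here is an assumption made for illustration: a logistic-regression model fit by gradient descent, a uniform random sample over all candidates, synthetic features and labels, and the hypothetical helper names `train_logistic` and `build_training_indices`.

```python
import numpy as np


def sigmoid(z):
    """Map raw scores to interaction likelihoods in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic-regression weights by batch gradient descent (assumed model)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w


def build_training_indices(scores, k, n_sampled, rng):
    """Return (selected, sampled) index arrays.

    selected: the k candidates with the highest predicted likelihood.
    sampled:  a uniform random draw over ALL candidates (an assumption),
              so it can include items the model would not have selected.
    """
    selected = np.argsort(scores)[-k:]
    sampled = rng.choice(len(scores), size=n_sampled, replace=False)
    return selected, sampled


# Synthetic user-content item pairs: each row is one pair's feature vector.
rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (rng.random(n) < sigmoid(X @ true_w)).astype(float)  # simulated interactions

# An imperfect earlier model decides which items get selected for display.
prior_w = true_w + rng.normal(scale=1.0, size=d)
scores = sigmoid(X @ prior_w)

selected, sampled = build_training_indices(scores, k=200, n_sampled=200, rng=rng)

# Train on the union of the selected subset and the sampled set.
train_idx = np.union1d(selected, sampled)
w = train_logistic(X[train_idx], y[train_idx])
```

In this sketch the selected subset contains only high-scoring pairs, while the sampled set spans the full score range, which is one way the training data could cover a wider distribution of user-content item pairs. A production system might instead weight the sampled examples or draw them non-uniformly; the source does not say.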