Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for test cycle optimization using contextual association mapping. In one aspect, a method includes obtaining an artifact that includes a collection of reference items, where each reference item includes a sequence of words; generating candidate tags from each of the reference items based on the sequences of words in the reference items; selecting a subset of the candidate tags as context tags based on how often the candidate tags appear in the reference items; obtaining a sample item that includes a sequence of words; identifying a subset of the context tags in the sequence of words in the sample item; and classifying a subset of the reference items as contextually similar to the sample item based on the context tags that were identified.
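
The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the stopword list, the use of one- and two-word phrases as candidate tags, the `min_items` frequency threshold, and the `min_shared` similarity rule are all assumptions chosen for illustration, as is the treatment of reference items as short test-case descriptions.

```python
import re
from collections import Counter

# Hypothetical stopword list; the source does not specify one.
STOPWORDS = frozenset({"a", "an", "the", "to", "of", "in", "on", "and", "is", "for", "with"})


def tokenize(text):
    """Lowercase the text and return its words, dropping stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def candidate_tags(text, max_n=2):
    """Generate candidate tags as word n-grams (assumed: n = 1..max_n).

    N-grams are built over the stopword-filtered word sequence, so two words
    separated only by a stopword count as adjacent.
    """
    words = tokenize(text)
    tags = set()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            tags.add(" ".join(words[i:i + n]))
    return tags


def select_context_tags(reference_items, min_items=2):
    """Keep candidate tags that appear in at least `min_items` reference items.

    The threshold is an assumed stand-in for selecting tags based on how often
    they appear in the reference items.
    """
    counts = Counter()
    for item in reference_items:
        counts.update(candidate_tags(item))  # each tag counted once per item
    return {tag for tag, count in counts.items() if count >= min_items}


def similar_items(reference_items, sample_item, min_shared=1):
    """Return reference items sharing at least `min_shared` context tags with the sample."""
    context = select_context_tags(reference_items)
    sample_tags = candidate_tags(sample_item) & context  # context tags found in the sample
    return [
        item for item in reference_items
        if len(candidate_tags(item) & sample_tags) >= min_shared
    ]


# Hypothetical reference items (for example, test-case titles).
REFERENCE_ITEMS = [
    "Verify login with valid password",
    "Verify login with expired password",
    "Export report to PDF",
    "Export report to CSV",
]

if __name__ == "__main__":
    for match in similar_items(REFERENCE_ITEMS, "Login fails after password reset"):
        print(match)
```

With these hypothetical items, the sample shares the context tags `login` and `password` with the two login test cases only, so a test cycle for the sample could be narrowed to those two reference items. Raising `min_items` or `min_shared` makes the selection stricter.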