Techniques are generally described for machine learning example-based annotation of image data. In some examples, a first machine learning model may receive a query image comprising a first depiction of an object-of-interest. The first machine learning model may also receive a target image representing a scene in which a second depiction of the object-of-interest is visually represented. In various examples, the first machine learning model may generate annotated output image data that identifies a location of the second depiction of the object-of-interest within the target image. In some examples, an object detection model may be trained based at least in part on the annotated output image data.
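The data flow described above (query image and target image in, annotated location out, annotation reused as a training label) can be illustrated with a minimal sketch. This is not the first machine learning model itself: as an assumption made purely for illustration, an exhaustive sum-of-squared-differences template search stands in for the learned model. The names `annotate_by_example` and `make_training_record`, the bounding-box format `(x, y, width, height)`, and the record layout are all hypothetical and do not come from the source.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel intensities
Box = Tuple[int, int, int, int]  # hypothetical format: (x, y, width, height)


def annotate_by_example(query: Image, target: Image) -> Box:
    """Locate the region of `target` most similar to `query`.

    Stand-in for the first machine learning model (assumption): a brute-force
    sum-of-squared-differences search, not a learned model.
    """
    qh, qw = len(query), len(query[0])
    th, tw = len(target), len(target[0])
    if qh > th or qw > tw:
        raise ValueError("query must fit inside target")

    best_score = None
    best_xy = (0, 0)
    # Slide the query over every position where it fits inside the target.
    for y in range(th - qh + 1):
        for x in range(tw - qw + 1):
            score = sum(
                (target[y + dy][x + dx] - query[dy][dx]) ** 2
                for dy in range(qh)
                for dx in range(qw)
            )
            # Keep the first position with the lowest difference score.
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    return (best_xy[0], best_xy[1], qw, qh)


def make_training_record(target: Image, box: Box, label: str) -> dict:
    """Package the annotated output as a label for a downstream detector.

    The record layout is hypothetical; a real training pipeline would define
    its own annotation schema.
    """
    return {"image": target, "boxes": [box], "labels": [label]}


# Query image: a first depiction of the object-of-interest.
query = [[9, 8], [7, 9]]
# Target image: a scene containing a second depiction of the same object.
target = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]

box = annotate_by_example(query, target)  # (2, 1, 2, 2)
record = make_training_record(target, box, "object-of-interest")
```

In this sketch, records such as `record` would form the annotated output image data on which an object detection model could subsequently be trained; the training step itself is omitted because the source does not specify it.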