Patent attributes
A set of training images is obtained by analyzing text associated with various images to identify images likely to demonstrate a particular visual attribute. Localization can be used to extract image patches corresponding to that attribute, and features or feature vectors determined from those patches can be used to train, for example, a convolutional neural network. A query image can then be received and analyzed using the trained network to determine a set of items whose images are visually similar to the query image, at least with respect to the attribute of interest. The similarity can be output by the network or determined using distances in attribute space. Content for at least a determined number of the highest-ranked, or most similar, items can then be provided in response to the query image.
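The ranking step, in which similarity is determined using distances in attribute space, might be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the feature vectors are taken to be already produced by a trained network, Euclidean distance is an assumed choice of metric, and the names `euclidean`, `rank_similar_items`, `catalog`, and `query_vector`, along with all numeric values, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_similar_items(
    query: Sequence[float],
    catalog: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k catalog items closest to the query, nearest first.

    Assumption: a smaller distance in attribute space means the item's
    image is more similar to the query with respect to the attribute.
    """
    scored = [(item_id, euclidean(query, vec)) for item_id, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Hypothetical attribute-space vectors; in practice these would come from
# the trained network applied to the localized patch of each item image.
catalog = {
    "item_a": [0.9, 0.1, 0.0],
    "item_b": [0.2, 0.8, 0.1],
    "item_c": [0.8, 0.2, 0.1],
    "item_d": [0.0, 0.1, 0.9],
}
query_vector = [1.0, 0.0, 0.0]  # hypothetical features of the query image

results = rank_similar_items(query_vector, catalog, top_k=2)
for item_id, distance in results:
    print(f"{item_id}: distance {distance:.3f}")
```

With these illustrative values, the two nearest items are `item_a` and `item_c`, so content for those items would be the content returned in response to the query image. A cosine or learned distance could be substituted for `euclidean` without changing the ranking logic.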