Patent attributes
A facility for visual search on a mobile device takes advantage of multi-modal and multi-touch input on the mobile device. By extracting lexical entities from a spoken search query and matching the lexical entities to image tags, the facility provides candidate images for each entity. Selected ones of the candidate images are used to construct a composite visual query image on a query canvas. The relative size and position of the selected candidate images in the composite visual query image, which need not be an existing image, contribute to a definition of a context of the composite visual query image being submitted for context-aware visual search.