The present invention relates to methods for searching for two-dimensional or three-dimensional objects. More particularly, the present invention relates to searching for two-dimensional or three-dimensional objects in a collection by using a multi-modal query of image and/or tag data. Aspects and/or embodiments seek to provide a method of searching for digital objects using any combination of images, three-dimensional shapes and text by embedding the vector representations for these multiple modes in the same space. Aspects and/or embodiments can be easily extensible to any other type of modality, making it more general.