A system and method for generating a facial model. The method includes: analyzing, via machine vision, a plurality of multimedia content elements to identify a plurality of facial images shown in those multimedia content elements; clustering the identified facial images into at least one cluster, wherein the clustering is based on metadata associated with each of the facial images; and selecting, from among the at least one cluster, a representative cluster representing a face, wherein the selected representative cluster serves as the facial model.
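
The clustering and selection steps described above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `FacialImage` record, the `person_tag` metadata key, and the rule of picking the largest cluster as the representative are hypothetical and are not specified by the abstract. The machine-vision step is also assumed to have already run, so the sketch starts from identified facial images.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FacialImage:
    """Hypothetical record for one facial image found in a content element."""
    image_id: str
    source_element: str  # the multimedia content element it was found in
    metadata: dict = field(default_factory=dict)


def cluster_by_metadata(faces, key):
    """Group facial images that share the same value for one metadata key."""
    clusters = defaultdict(list)
    for face in faces:
        clusters[face.metadata.get(key)].append(face)
    return dict(clusters)


def select_representative(clusters):
    """Pick one cluster as the facial model.

    Assumption: the largest cluster is taken as representative. The abstract
    does not state the selection criterion.
    """
    if not clusters:
        raise ValueError("no clusters to select from")
    return max(clusters.values(), key=len)


# Illustrative input: facial images assumed to come from a prior analysis step.
faces = [
    FacialImage("img-1", "video-a", {"person_tag": "alice"}),
    FacialImage("img-2", "photo-b", {"person_tag": "alice"}),
    FacialImage("img-3", "photo-c", {"person_tag": "bob"}),
]

clusters = cluster_by_metadata(faces, key="person_tag")
facial_model = select_representative(clusters)
model_ids = [face.image_id for face in facial_model]
```

In this sketch the facial model is simply the list of facial images in the chosen cluster, which mirrors the statement that the facial model is the selected representative cluster. A real system would likely combine several metadata fields and a different selection criterion.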