Patent attributes
A system provides product annotation in a video to one or more users. The system receives a video from a user, where the video includes multiple video frames. The system extracts multiple key frames from the video and generates a visual representation of the key frame. The system compares the visual representation of the key frame with a plurality of product visual signatures, where each visual signature identifies a product. Based on the comparison of the visual representation of the key frame and a product visual signature, the system determines whether the key frame contains the product identified by the visual signature of the product. To generate the plurality of product visual signatures, the system collects multiple training images comprising multiple of expert product images obtained from an expert product repository, each of which is associated with multiple product images obtained from multiple web resources.