Patent attributes
Identifying key frames of a video for use in training a machine learning model is provided. Object detection is performed to identify frames of a video including target classes of objects of interest. Feature extraction is performed on the identified frames to generate raw feature vectors. The feature vectors are compressed into lower dimension vectors. The compressed feature vectors are clustered into a plurality of clusters. The clustered compressed feature vectors are filtered to identify the key frames from each of the plurality of clusters. The key frames may be provided as a representative data set of the video.
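The stages described above may be illustrated by the following minimal sketch, which is not the claimed implementation. It assumes that object detection and feature extraction have already been run, so that each frame has a list of detected class labels and a raw feature vector. All function names (`select_frames`, `random_projection`, `kmeans`, `key_frames`) are hypothetical. The abstract does not name the compression, clustering, or filtering techniques, so the sketch substitutes a Gaussian random projection for the compression step, plain k-means for the clustering step, and a "frame nearest to each cluster centroid" rule for the filtering step.

```python
import math
import random


def select_frames(detections, target_classes):
    """Return indices of frames whose detected labels include a target class."""
    return [i for i, labels in enumerate(detections) if target_classes & set(labels)]


def random_projection(vectors, out_dim, seed=0):
    """Compress vectors to out_dim dimensions (assumed stand-in for the compression step)."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [[rng.gauss(0, 1) / math.sqrt(out_dim) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return [[sum(w * x for w, x in zip(row, v)) for row in matrix] for v in vectors]


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; assumes k <= len(points). Returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p) for p in points], centroids


def key_frames(frame_ids, points, k):
    """Pick, per cluster, the frame closest to the centroid (assumed filtering rule)."""
    assign, centroids = kmeans(points, k)
    keys = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            best = min(members, key=lambda i: math.dist(points[i], centroids[c]))
            keys.append(frame_ids[best])
    return sorted(keys)
```

Under these assumptions, a caller would pass the per-frame labels to `select_frames`, compress the raw feature vectors of the selected frames with `random_projection`, and pass the result to `key_frames` together with the selected frame indices. The returned indices would then serve as the representative data set. In practice the random projection could be replaced with a learned or variance-preserving reduction without changing the overall flow.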