Patent attributes
A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.