Patent attributes
Embodiments of the present invention are directed to a computer-implemented method for action localization. A non-limiting example of the computer-implemented method includes receiving, by a processor, a video and segmenting, by the processor, the video into a set of video segments. The computer-implemented method classifies, by the processor, each video segment into a class and calculates, by the processor, importance scores for each video segment of a class within the set of video segments. The computer-implemented method determines, by the processor, a winning video segment of the class within the set of video segments based on the importance scores for each video segment within the class, stores, by the processor, the winning video segment from the set of video segments, and removes the winning video segment from the set of video segments.
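By way of illustration only, the steps recited above can be sketched as a greedy selection loop. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Segment`, `segment_video`, and `localize`, the fixed-length segmentation, the caller-supplied `classify` and `score` functions, and the repetition of the select-store-remove step up to `max_winners` times are all hypothetical choices made for the example and are not specified in the text above.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, List, Sequence


@dataclass(frozen=True)
class Segment:
    """Hypothetical segment record: a half-open frame range [start, end)."""
    start: int
    end: int


def segment_video(num_frames: int, segment_length: int) -> List[Segment]:
    """Split a video into fixed-length segments (an assumed segmentation scheme)."""
    return [
        Segment(s, min(s + segment_length, num_frames))
        for s in range(0, num_frames, segment_length)
    ]


def localize(
    frames: Sequence[float],
    segment_length: int,
    classify: Callable[[Sequence[float]], Hashable],
    score: Callable[[Sequence[float]], float],
    target_class: Hashable,
    max_winners: int,
) -> List[Segment]:
    """Greedily pick the highest-scoring segments of one class.

    `classify` and `score` stand in for the classifier and the
    importance-score calculation, which the source does not define.
    """
    # Segment the received video and classify each segment.
    segments = segment_video(len(frames), segment_length)
    labels = {seg: classify(frames[seg.start:seg.end]) for seg in segments}

    remaining = list(segments)
    winners: List[Segment] = []
    # Assumption: the determine/store/remove step is repeated.
    while len(winners) < max_winners:
        candidates = [seg for seg in remaining if labels[seg] == target_class]
        if not candidates:
            break
        # Calculate importance scores for the segments of the class.
        scores = {seg: score(frames[seg.start:seg.end]) for seg in candidates}
        # Determine the winning segment, store it, and remove it from the set.
        winner = max(candidates, key=scores.__getitem__)
        winners.append(winner)
        remaining.remove(winner)
    return winners
```

In this sketch a "video" is simply a sequence of per-frame numbers so that the example stays self-contained. A caller might pass a `classify` function that labels a segment by thresholding its mean value and a `score` function that sums the values in the segment; a real system would substitute a trained classifier and an importance measure of its own.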