Patent attributes
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying advertisements in television broadcasts. In an aspect, a method includes identifying, as repeated caption data, caption data that has been received at least a threshold number of times and over at least a threshold number of channels; selecting video frames indexed at times and channels that correspond to the times and channels of the repeated caption data; providing the selected video frames to a video processing system that identifies advertising objects in the video frame data, and receiving the advertising objects from the video processing system; and, for each advertising object, associating the advertising object with the repeated caption data whose time and channel indices match the time and channel indices of the video frame from which the advertising object was identified.
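The steps of the method can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names `find_repeated_captions` and `associate_ad_objects`, the `(text, time, channel)` caption tuples, the frame dictionary keyed by `(time, channel)`, and the `detect_objects` callable standing in for the video processing system are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

# Hypothetical index type: (time index, channel index).
Key = Tuple[int, str]


def find_repeated_captions(
    captions: Iterable[Tuple[str, int, str]],
    min_count: int,
    min_channels: int,
) -> Dict[str, List[Key]]:
    """Return caption text seen at least min_count times on at least min_channels channels."""
    occurrences: Dict[str, List[Key]] = defaultdict(list)
    for text, time, channel in captions:
        occurrences[text].append((time, channel))
    return {
        text: keys
        for text, keys in occurrences.items()
        if len(keys) >= min_count
        and len({channel for _, channel in keys}) >= min_channels
    }


def associate_ad_objects(
    captions: Iterable[Tuple[str, int, str]],
    frames: Dict[Key, Hashable],
    detect_objects: Callable[[Hashable], Iterable[str]],
    min_count: int,
    min_channels: int,
) -> Dict[str, List[str]]:
    """Map each repeated caption to the advertising objects found in matching frames."""
    repeated = find_repeated_captions(captions, min_count, min_channels)
    associations: Dict[str, set] = defaultdict(set)
    for text, keys in repeated.items():
        for key in keys:
            # Select the frame indexed at the same time and channel as the caption.
            frame = frames.get(key)
            if frame is None:
                continue  # no frame was stored for this time and channel
            # detect_objects is a stand-in for the external video processing system.
            for obj in detect_objects(frame):
                associations[text].add(obj)
    return {text: sorted(objs) for text, objs in associations.items()}
```

In this sketch the association is made through the shared `(time, channel)` key, so an object is attached only to caption data whose indices match those of the frame in which it was detected. Exact-text matching of captions is an assumption; a real system might normalize or fuzzy-match caption text.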