Patent attributes
Techniques and apparatus for generating dense natural language descriptions for video content are described. In one embodiment, for example, an apparatus may include at least one memory and logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to receive a source video comprising a plurality of frames, determine a plurality of regions for each of the plurality of frames, generate at least one region-sequence connecting the determined plurality of regions, apply a language model to the at least one region-sequence to generate description information comprising a description of at least a portion of content of the source video. Other embodiments are described and claimed.