Patent attributes
Current interfaces for displaying information about items appearing in videos are obtrusive and counterintuitive. They also rely on annotations, or metadata tags, added by hand to the video frames, which limits the items about which they can display information. In contrast, examples of the systems disclosed here use neural networks to identify items appearing on- and off-screen in response to intuitive user voice queries, touchscreen taps, and/or cursor movements. These systems display information about the on- and off-screen items dynamically and unobtrusively to avoid disrupting the viewing experience.