Patent attributes
Described herein are techniques for generating insights, including summary descriptions, for an online meeting. During an online meeting, pre-trained computer vision algorithms identify regions of interest within one or more video streams depicting a meeting participant, as well as any video stream over which content is being shared. These detected regions of interest are then processed with additional pre-trained computer vision algorithms to detect non-verbal communications, including gestures made by meeting participants, facial expressions or emotions, and text or graphics shared via a content sharing feature. Textual descriptions of the detected non-verbal communications are derived and provided as input to a software-based meeting analyzer service, which is configured to generate various insights in response to end-user queries and automated requests.
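The sketch below illustrates, at a high level, how such a pipeline might be structured: region detection on a video frame, a second stage that converts each region into a textual description of the non-verbal cue, and a final analyzer step that answers a query from the accumulated descriptions. All names here (RegionOfInterest, NonVerbalCue, detect_regions, describe_cue, analyze_meeting) are hypothetical placeholders for illustration only; the description does not specify particular models or APIs, so the model calls are stubbed out.

from dataclasses import dataclass
from typing import List


@dataclass
class RegionOfInterest:
    stream_id: str        # which video stream the region was found in
    kind: str             # e.g. "participant" or "shared_content"
    bbox: tuple           # (x, y, width, height) within the frame


@dataclass
class NonVerbalCue:
    roi: RegionOfInterest
    description: str      # textual description derived from the cue


def detect_regions(frame, stream_id: str) -> List[RegionOfInterest]:
    """Stand-in for the first-stage pre-trained detector that locates
    participant regions and shared-content regions within a frame."""
    # Placeholder: a real implementation would run a vision model here.
    return [RegionOfInterest(stream_id, "participant", (0, 0, 320, 240))]


def describe_cue(roi: RegionOfInterest, frame) -> NonVerbalCue:
    """Stand-in for the second-stage models (gesture, expression, OCR)
    that turn a region of interest into a textual description."""
    if roi.kind == "shared_content":
        text = "slide text detected in shared content"
    else:
        text = "participant raised a hand and smiled"
    return NonVerbalCue(roi, text)


def analyze_meeting(cues: List[NonVerbalCue], query: str) -> str:
    """Stand-in for the meeting analyzer service that generates an
    insight from the textual descriptions in response to a query."""
    joined = "; ".join(c.description for c in cues)
    return f"Insight for '{query}': observed {joined}"


if __name__ == "__main__":
    frame = object()  # placeholder for a decoded video frame
    cues = [describe_cue(roi, frame)
            for roi in detect_regions(frame, stream_id="participant-1")]
    print(analyze_meeting(cues, query="Summarize non-verbal reactions"))

In an actual system the stubbed functions would wrap the pre-trained vision models referenced in the description, and the analyzer would typically be a separate service receiving the textual descriptions rather than an in-process function.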