Patent attributes
A system automatically configures the behavior of the display devices of a video conference endpoint. A controller of the endpoint may detect, at a microphone array having a predetermined physical relationship with respect to a camera, audio emitted from one or more loudspeakers, each loudspeaker having a predetermined physical relationship with respect to at least one of one or more display devices in a conference room. The controller may then generate data representing a spatial relationship between the one or more display devices and the camera based on the detected audio. Finally, the controller may assign video sources received by the endpoint to each of the one or more display devices based on the data representing the spatial relationship and the content of each received video source, and may also assign outputs from multiple video cameras to an outgoing video stream based on the data representing the spatial relationship.
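The flow described above (detect loudspeaker audio, derive a spatial relationship, assign sources) can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the spatial relationship is computed or what assignment policy is used. The sketch assumes a two-microphone far-field time-difference-of-arrival estimate for each loudspeaker's bearing, and a hypothetical policy that places "people" sources on the displays nearest the camera axis and "content" sources on the remaining displays. All names (Display, azimuth_from_tdoa, order_displays, assign_sources) are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Display:
    name: str
    # Assumed: bearing of the display's loudspeaker relative to the camera
    # axis, in degrees (negative = left of the camera, positive = right).
    azimuth_deg: float


def azimuth_from_tdoa(delay_s, mic_spacing_m, speed_of_sound=343.0):
    """Far-field bearing estimate from the arrival-time difference between
    two microphones (an assumed technique, not taken from the abstract)."""
    ratio = speed_of_sound * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


def order_displays(displays):
    """Return the displays sorted left to right as seen from the camera."""
    return sorted(displays, key=lambda d: d.azimuth_deg)


def assign_sources(displays, sources):
    """Map display names to source labels.

    sources is a list of (label, kind) tuples, kind being "people" or
    "content". Hypothetical policy: people sources fill the displays
    closest to the camera axis first, content sources take the rest.
    """
    by_closeness = sorted(displays, key=lambda d: abs(d.azimuth_deg))
    people = [s for s in sources if s[1] == "people"]
    content = [s for s in sources if s[1] == "content"]
    ordered = people + content
    return {d.name: label for d, (label, _) in zip(by_closeness, ordered)}


if __name__ == "__main__":
    displays = [Display("A", 25.0), Display("B", -3.0)]
    sources = [("slides", "content"), ("remote_room", "people")]
    print([d.name for d in order_displays(displays)])
    print(assign_sources(displays, sources))
```

In this sketch, the display whose loudspeaker is heard nearest the camera axis receives the remote participants, a choice made here only to show how spatial data and source content could jointly drive the assignment.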