Patent attributes
Various arrangements are detailed herein related to managing video recording based on spoken commands. A system receives a video stream from a video camera and analyzes a field of view in the received video stream to determine a location for one or more identified or potential users. The system can beamform audio from microphones of a home assistant device based on the location of the one or more identified or potential users. The system adjusts an audio output based on the location of the one or more identified or potential users, receives a spoken command from the one or more identified or potential users, and outputs a response to the spoken command.
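The localization and beamforming steps described above can be illustrated with a minimal sketch. The abstract does not specify a beamforming method, so this sketch assumes a simple delay-and-sum beamformer, a linear microphone array co-located and aligned with the camera, a pinhole camera model, and a far-field source. All function names and parameters (pixel_to_azimuth, steering_delays, delay_and_sum, the field-of-view value, and the microphone spacing) are hypothetical and chosen only for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def pixel_to_azimuth(x_center_px, frame_width_px, horizontal_fov_deg):
    """Map a user's horizontal pixel position to an azimuth angle in radians.

    Assumes a pinhole camera; 0 is the optical axis, positive is to the right.
    """
    half_width = frame_width_px / 2.0
    focal_px = half_width / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    return math.atan((x_center_px - half_width) / focal_px)


def steering_delays(azimuth_rad, mic_positions_m):
    """Per-microphone compensation delays (seconds) for a linear array.

    Assumes a far-field source and microphones placed along the camera's
    horizontal axis. Microphones that hear the wavefront earlier receive
    larger delays, so all channels line up for the chosen direction.
    """
    arrivals = [-p * math.sin(azimuth_rad) / SPEED_OF_SOUND_M_S
                for p in mic_positions_m]
    latest = max(arrivals)
    return [latest - t for t in arrivals]


def delay_and_sum(channels, delays_s, sample_rate_hz):
    """Delay each channel by a whole number of samples, then average."""
    n = len(channels[0])
    out = [0.0] * n
    for samples, delay in zip(channels, delays_s):
        shift = round(delay * sample_rate_hz)
        for i in range(shift, n):
            out[i] += samples[i - shift]
    return [v / len(channels) for v in out]


# Hypothetical usage: a user detected at the right edge of a 640-pixel-wide
# frame from a camera with an assumed 90-degree horizontal field of view.
azimuth = pixel_to_azimuth(640, 640, 90.0)
delays = steering_delays(azimuth, [0.0, 0.1])  # two mics, 10 cm apart
```

Rounding each delay to a whole sample keeps the sketch short; a practical system would more likely use fractional delays or frequency-domain weights, and would need a calibration step if the camera and microphone array are not aligned.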