Patent attributes
A method performed by a computing system for directing a voice command to a function associated with a visual target includes receiving a set of time-variable sensor-based data streams, including an audio data stream and a targeting data stream. The targeting data stream is stored in a buffer as buffered targeting data. The presence of a spoken utterance is identified within the audio data stream and is associated with a temporal identifier corresponding in time to the set of sensor-based data streams. A voice command corresponding to the spoken utterance is identified. A visual targeting vector within the buffered targeting data, and a visual target of that visual targeting vector, are identified at a time corresponding to the temporal identifier. The voice command is directed to a function associated with the visual target to generate an output.
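
The buffering and lookup steps of the method may be illustrated with a minimal sketch. It is not the claimed implementation: the names TargetingSample, TargetingBuffer and direct_voice_command, the fixed buffer capacity, the use of timestamps in seconds as the temporal identifier, and the "latest sample at or before the utterance time" lookup rule are all assumptions made for illustration. Speech detection and recognition are taken as already done, so the voice command and its temporal identifier are passed in directly, and each buffered sample is assumed to already carry the visual target that its targeting vector resolves to.

```python
from bisect import bisect_right
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Optional, Tuple

Vector = Tuple[float, float, float]


@dataclass(frozen=True)
class TargetingSample:
    timestamp: float  # seconds on the clock shared by the sensor streams (assumed)
    vector: Vector    # visual targeting vector, e.g. a gaze direction
    target: str       # identifier of the visual target the vector resolves to


class TargetingBuffer:
    """Fixed-capacity buffer of targeting samples, pushed in time order."""

    def __init__(self, capacity: int = 256) -> None:
        # Oldest samples are discarded once capacity is reached.
        self._samples: Deque[TargetingSample] = deque(maxlen=capacity)

    def push(self, sample: TargetingSample) -> None:
        self._samples.append(sample)

    def at(self, timestamp: float) -> Optional[TargetingSample]:
        """Return the latest sample at or before `timestamp`, or None."""
        times = [s.timestamp for s in self._samples]
        index = bisect_right(times, timestamp) - 1
        return self._samples[index] if index >= 0 else None


def direct_voice_command(
    buffer: TargetingBuffer,
    command: str,
    utterance_time: float,
    functions: Dict[str, Callable[[str], str]],
) -> Optional[str]:
    """Direct `command` to the function of the target seen at `utterance_time`."""
    sample = buffer.at(utterance_time)
    if sample is None:
        return None  # no targeting data buffered for that time
    handler = functions.get(sample.target)
    return handler(command) if handler is not None else None


# Hypothetical targets and functions, for illustration only.
buffer = TargetingBuffer()
buffer.push(TargetingSample(0.0, (0.0, 0.0, 1.0), "lamp"))
buffer.push(TargetingSample(1.0, (1.0, 0.0, 0.0), "thermostat"))
buffer.push(TargetingSample(2.0, (0.0, 1.0, 0.0), "speaker"))

functions = {
    "lamp": lambda cmd: f"lamp: {cmd}",
    "thermostat": lambda cmd: f"thermostat: {cmd}",
    "speaker": lambda cmd: f"speaker: {cmd}",
}

# The utterance was spoken at t = 1.4 s, while the thermostat was targeted.
result = direct_voice_command(buffer, "turn on", 1.4, functions)
print(result)
```

In this sketch the command is resolved against the target buffered at the utterance's temporal identifier rather than against the most recent sample, so a later change of target (here, the speaker at t = 2.0 s) does not affect where the command is directed.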