Patent attributes
A system and method for associating motion data with utterance audio data for use with a speech processing system. A device, such as a wearable device, may be capable of capturing utterance audio data and sending it to a remote server for speech processing, for example for execution of a command represented in the utterance. The device may also capture motion data using motion sensors of the device. The motion data may correspond to gestures, such as head gestures, that may be interpreted by the speech processing system to determine and execute commands. The device may associate the motion data with the audio data so the remote server knows what motion data corresponds to what portion of audio data for purposes of interpreting and executing commands. Metadata sent with the audio data and/or motion data may include association data such as timestamps, session identifiers, message identifiers, etc.