Patent attributes
Systems, methods, and devices for training a Natural Language Understanding (NLU) component of a system using spoken utterances of individuals are described. A server sends a device, such as a speech-controlled device, a signal that causes the device to output audio soliciting how a user would speak a particular command for execution by a particular application. The device captures the user's spoken response as audio and sends it to the server. The server performs speech processing on the received audio data to parse it into multiple portions. The server then associates a first portion of the audio data with a command indicator and a second portion of the audio data with a content indicator. The associated data is then used to update the NLU component's understanding of how utterances triggering the command are spoken.
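
The parsing and association steps described above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that speech recognition has already converted the audio data into a text transcript, and that the server knows which content it solicited (for example, a song title), so the rest of the utterance can be treated as the command portion. The names `TaggedPortion`, `tag_utterance`, `CommandPhraseStore`, and the `"PlayMusic"` command label are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedPortion:
    """One portion of an utterance plus its indicator (hypothetical type)."""
    text: str
    indicator: str  # "command" or "content"


def tag_utterance(transcript: str, known_content: str) -> list[TaggedPortion]:
    """Split a transcript around the solicited content and tag each portion.

    Assumption: the solicited content appears verbatim (ignoring case)
    in the transcript; a real system would need a more robust alignment.
    """
    start = transcript.lower().find(known_content.lower())
    if start < 0:
        raise ValueError("solicited content not found in transcript")
    end = start + len(known_content)

    portions = []
    before = transcript[:start].strip()
    after = transcript[end:].strip()
    if before:
        portions.append(TaggedPortion(before, "command"))
    portions.append(TaggedPortion(transcript[start:end], "content"))
    if after:
        portions.append(TaggedPortion(after, "command"))
    return portions


class CommandPhraseStore:
    """Toy stand-in for the NLU update: records phrasings seen per command."""

    def __init__(self) -> None:
        self._phrases: dict[str, set[str]] = defaultdict(set)

    def update(self, command: str, portions: list[TaggedPortion]) -> None:
        # Only command-tagged portions describe how the command is spoken.
        for portion in portions:
            if portion.indicator == "command":
                self._phrases[command].add(portion.text.lower())

    def phrases_for(self, command: str) -> list[str]:
        return sorted(self._phrases.get(command, set()))


# Example: the device asked how the user would request a song,
# and the user answered "put on Hello by Adele".
portions = tag_utterance("put on Hello by Adele", "hello by adele")
store = CommandPhraseStore()
store.update("PlayMusic", portions)
print(portions)
print(store.phrases_for("PlayMusic"))
```

In this sketch the "update" is only a lookup table of observed phrasings. An actual NLU component would more likely retrain or adjust a statistical model using the tagged portions as labeled examples.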