Patent attributes
A model-pair, which includes an acoustic model and a language model, is selected to recognize spoken words in a speech signal generated from a speech. A degree of disjointedness between the acoustic model and the language model is computed relative to the speech by comparing a first recognition output produced from the acoustic model with a second recognition output produced from the language model. When the acoustic model incorrectly recognizes a portion of the speech signal as a first word and the language model correctly recognizes the same portion as a second word, a textual representation of the second word is determined and associated with a set of sound descriptors to generate a training speech pattern. Using the training speech pattern, the acoustic model is trained to recognize the portion of the speech signal as the second word.
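
The abstract does not define how the degree of disjointedness is measured or how sound descriptors are represented, so the following is only a minimal sketch under stated assumptions: the two recognition outputs are already aligned word for word, disjointedness is taken to be the fraction of positions where they disagree, a reference transcript is available to decide which model was correct, and sound descriptors are phoneme symbols looked up in a hypothetical lexicon. The names `disjointedness`, `training_patterns`, `TrainingPattern`, and `lexicon` are illustrative and do not come from the patent. The retraining of the acoustic model itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingPattern:
    """Text of the correct word paired with its sound descriptors (assumed form)."""
    position: int
    text: str
    sound_descriptors: tuple


def disjointedness(acoustic_words, language_words):
    """Fraction of aligned positions where the two outputs disagree.

    Assumption: the patent does not specify a metric; a simple mismatch
    rate over word-aligned outputs is used here for illustration.
    """
    if len(acoustic_words) != len(language_words):
        raise ValueError("outputs must be aligned to the same length")
    if not acoustic_words:
        return 0.0
    mismatches = sum(a != l for a, l in zip(acoustic_words, language_words))
    return mismatches / len(acoustic_words)


def training_patterns(acoustic_words, language_words, reference_words, lexicon):
    """Build one pattern per position where the acoustic model was wrong
    and the language model was right, judged against a reference transcript.

    Assumption: `lexicon` maps a word to a list of phoneme symbols; words
    missing from the lexicon are skipped.
    """
    patterns = []
    aligned = zip(acoustic_words, language_words, reference_words)
    for i, (a, l, ref) in enumerate(aligned):
        if a != ref and l == ref and l in lexicon:
            patterns.append(TrainingPattern(i, l, tuple(lexicon[l])))
    return patterns


# Hypothetical example data, not taken from the patent.
acoustic = ["their", "is", "a", "problem"]
language = ["there", "is", "a", "problem"]
reference = ["there", "is", "a", "problem"]
lexicon = {"there": ["DH", "EH", "R"]}

score = disjointedness(acoustic, language)
patterns = training_patterns(acoustic, language, reference, lexicon)
```

In this example the outputs differ at one of four positions, and the single resulting pattern pairs the text "there" with its assumed phoneme sequence. A real system would pass such patterns to whatever training procedure the acoustic model supports.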