Patent attributes
An input sequence of unstructured speech recognition text is transformed into structured document text. A probabilistic word substitution model is provided that establishes association probabilities, each indicating how likely a target structured document text is to correspond to a source unstructured speech recognition text. The input sequence is looked up in the word substitution model to determine the likelihoods of candidate structured document texts corresponding to the text in the input sequence. A most likely sequence of structured document text is then generated as the output.
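The lookup-and-select procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy table `SUBSTITUTION_MODEL`, its probabilities, the back-off value `UNKNOWN_PROB` for words absent from the model, the phrase-length limit `MAX_PHRASE_LEN`, and the function names. The sketch treats substitutions as independent and uses dynamic programming over phrase segmentations to pick the highest-probability target sequence; the actual model and search may differ.

```python
import math

# Hypothetical toy model: source phrase -> {target text: P(target | source)}.
# The entries and probabilities are invented for illustration only.
SUBSTITUTION_MODEL = {
    ("period",): {".": 0.9, "period": 0.1},
    ("new", "paragraph"): {"\n\n": 0.95, "new paragraph": 0.05},
    ("two",): {"2": 0.6, "two": 0.4},
    ("milligrams",): {"mg": 0.8, "milligrams": 0.2},
}
MAX_PHRASE_LEN = 2   # longest source phrase in the toy model
UNKNOWN_PROB = 0.5   # assumed back-off probability for unmodeled words


def candidates(phrase):
    """Look up a source phrase; unknown single words pass through unchanged."""
    if phrase in SUBSTITUTION_MODEL:
        return SUBSTITUTION_MODEL[phrase]
    if len(phrase) == 1:
        return {phrase[0]: UNKNOWN_PROB}
    return {}  # unknown multi-word phrases have no candidates


def most_likely_targets(words):
    """Return the highest-probability list of target texts for the input words."""
    # best[i] holds (log probability, targets) for the first i source words.
    best = [(-math.inf, [])] * (len(words) + 1)
    best[0] = (0.0, [])
    for end in range(1, len(words) + 1):
        for start in range(max(0, end - MAX_PHRASE_LEN), end):
            prev_score, prev_targets = best[start]
            if prev_score == -math.inf:
                continue  # no valid segmentation reaches this start position
            phrase = tuple(words[start:end])
            for target, prob in candidates(phrase).items():
                score = prev_score + math.log(prob)
                if score > best[end][0]:
                    best[end] = (score, prev_targets + [target])
    return best[len(words)][1]


if __name__ == "__main__":
    recognized = "take two milligrams period new paragraph".split()
    print(most_likely_targets(recognized))
```

Working in log probabilities avoids numerical underflow on long inputs, and the segmentation search lets a multi-word source phrase such as "new paragraph" compete against word-by-word substitution.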