Patent attributes
A computer system is provided that includes a processor configured to store a set of audio training data that includes a plurality of audio segments and metadata indicating a word or phrase associated with each audio segment. For a target training statement of a set of structured text data, the processor is configured to generate a concatenated audio signal that matches a word content of a target training statement by comparing the words or phrases of a plurality of text segments of the target training statement to respective words or phrases of audio segments of the stored set of audio training data, selecting a plurality of audio segments from the set of audio training data based on a match in the words or phrases between the plurality of text segments of the target training statement and the selected plurality of audio segments, and concatenating the selected plurality of audio segments.