Patent 8700396 was granted by the United States Patent and Trademark Office and assigned to Google in April 2014.
This document generally describes computer technologies relating to generating speech data collection prompts, such as textual scripts and/or textual scenarios. Speech data collection prompts for a particular language can be generated based on a variety of factors, including the frequency with which linguistic elements (e.g., phonemes, syllables, words, phrases) in the particular language occur in one or more corpora of textual information associated with that language. Textual prompts can additionally or alternatively be generated based on statistics for previously recorded speech data.
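As a rough illustration of the idea, the following minimal sketch scores candidate prompts by how often their words occur in a text corpus, and discounts words that already appear often in previously recorded speech. It is not the method claimed in the patent: the function names (`tokenize`, `element_frequencies`, `select_prompts`), the use of whitespace-separated words as the linguistic element, and the greedy scoring formula are all assumptions made for this example.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercased, whitespace-separated words stand in for any
    # linguistic element (phonemes, syllables, words, phrases).
    return text.lower().split()


def element_frequencies(corpus):
    """Count how often each word occurs across the corpus sentences."""
    counts = Counter()
    for sentence in corpus:
        counts.update(tokenize(sentence))
    return counts


def select_prompts(candidates, frequencies, recorded_counts, max_prompts):
    """Greedily pick prompts whose words are frequent in the corpus but
    under-represented in previously recorded speech data (hypothetical rule)."""
    covered = Counter(recorded_counts)  # copy, so the caller's data is untouched
    remaining = list(candidates)
    chosen = []

    def score(prompt):
        # Each distinct word contributes its corpus frequency, discounted by
        # how many times it has already been recorded or selected.
        return sum(
            frequencies.get(word, 0) / (1 + covered[word])
            for word in set(tokenize(prompt))
        )

    while remaining and len(chosen) < max_prompts:
        best = max(remaining, key=score)
        if score(best) == 0:
            break  # no remaining candidate contains any word seen in the corpus
        chosen.append(best)
        remaining.remove(best)
        covered.update(set(tokenize(best)))
    return chosen


if __name__ == "__main__":
    corpus = ["turn on the lights", "turn off the lights", "play some music"]
    candidates = ["turn on the lights", "play some music", "turn off the music"]
    recorded = {"turn": 5, "the": 5, "lights": 5}  # made-up recording statistics
    print(select_prompts(candidates, element_frequencies(corpus), recorded, 2))
```

With the made-up inputs in the usage section, prompts built from already well-recorded words ("turn", "the", "lights") rank lower than prompts that introduce words not yet recorded, which is the kind of balancing the description above suggests.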