Patent 9613640 was granted and assigned to Audyssey Laboratories on April, 2017 by the United States Patent and Trademark Office.
A speech/music discrimination method evaluates the standard deviation between envelope peaks, loudness ratio, and smoothed energy difference. The envelope is searched for peaks above a threshold. The standard deviations of the separations between peaks are calculated. Decreased standard deviation is indicative of speech, higher standard deviation is indicative of non-speech. The ratio between minimum and maximum loudness in recent input signal data frames is calculated. If this ratio corresponds to the dynamic range characteristic of speech, it is another indication that the input signal is speech content. Smoothed energies of the frames from the left and right input channels are computed and compared. Similar (e.g., highly correlated) left and right channel smoothed energies is indicative of speech. Dissimilar (e.g., un-correlated content) left and right channel smoothed energies is indicative of non-speech material. The results of the three tests are compared to make a speech/music decision.