Patent attributes
An important task in several wellness applications is the detection of emotional valence from speech. Two types of features of speech signals are used to detect valence: acoustic features and text features. Acoustic features are derived from short frames of speech, while text features are derived from the text transcription. The present disclosure provides systems and methods that determine the effect of text on acoustic features. Acoustic features of speech segments that carry emotion words are treated differently from those of segments that do not carry such words. To assess emotional valence, only specific speech segments of the input speech signal are considered, selected on the basis of a dictionary specific to a language. The system determines emotional valence in an input speech signal using a trained model (or trained classifier) for the specific language, the model having been trained either by including the acoustic features of the emotion-related words or by omitting them.
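The dictionary-based selection of speech segments described above may be illustrated by the following minimal sketch. It is not the disclosed implementation. It assumes that word-level time alignments for the transcription are already available, and the names `EMOTION_WORDS`, `WordSegment`, and `select_segments`, as well as the sample words and timings, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical, very small emotion-word dictionary for English.
# A real system would use a lexicon specific to the target language.
EMOTION_WORDS = {"happy", "sad", "angry", "terrible", "wonderful"}


@dataclass(frozen=True)
class WordSegment:
    """One transcribed word with its (assumed) time alignment in seconds."""
    word: str
    start: float
    end: float


def select_segments(segments, emotion_words, include_emotion_words):
    """Return the segments to pass on to acoustic feature extraction.

    If include_emotion_words is True, keep only segments whose word is in
    the dictionary; if False, keep only segments whose word is not.
    """
    selected = []
    for seg in segments:
        # Normalize case and strip common punctuation before the lookup.
        is_emotion = seg.word.lower().strip(".,!?") in emotion_words
        if is_emotion == include_emotion_words:
            selected.append(seg)
    return selected


# Illustrative aligned transcript (made-up timings).
transcript = [
    WordSegment("I", 0.00, 0.12),
    WordSegment("feel", 0.12, 0.40),
    WordSegment("wonderful", 0.40, 1.05),
    WordSegment("today", 1.05, 1.50),
]

with_emotion = select_segments(transcript, EMOTION_WORDS, True)
without_emotion = select_segments(transcript, EMOTION_WORDS, False)
```

In this sketch, the time spans of the selected segments would then determine which frames of the speech signal contribute acoustic features to the classifier, with the `include_emotion_words` flag chosen to match how the language-specific model was trained.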