Patent attributes
Audio information defining audio content may be accessed. The audio content may have a duration. The audio content may be segmented into audio segments. Individual audio segments may correspond to a portion of the duration. Feature vectors of the audio segments may be determined. The feature vectors may be processed through a classifier. The classifier may output scores on whether the audio segments contain voice. One or more of the audio segments may be identified as containing voice based on the scores and a two-step hysteresis thresholding. Storage of the identification of the one or more of the audio segments as containing voice in one or more storage media may be effectuated.
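
The abstract does not spell out how the two-step hysteresis thresholding operates, so the following is only a minimal sketch of one common reading of hysteresis thresholding applied to per-segment classifier scores. It assumes that scores lie in [0, 1], that a higher score means voice is more likely, and that the two steps are (1) seeding with a high threshold and (2) extending each seed across adjacent segments that clear a lower threshold. The function name `hysteresis_voice_mask` and the threshold values `high=0.7` and `low=0.4` are hypothetical and are not taken from the source.

```python
from typing import List, Sequence


def hysteresis_voice_mask(
    scores: Sequence[float],
    high: float = 0.7,  # hypothetical seed threshold, not from the source
    low: float = 0.4,   # hypothetical extension threshold, not from the source
) -> List[bool]:
    """Return one boolean per audio segment: True if identified as voice.

    Assumed interpretation of two-step hysteresis thresholding:
      Step 1: segments scoring at or above `high` are marked as voice.
      Step 2: segments scoring at or above `low` are also marked as voice
              when they are connected to a step-1 segment through an
              unbroken run of segments at or above `low`.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")

    n = len(scores)

    # Step 1: seed segments that confidently contain voice.
    mask = [s >= high for s in scores]

    # Step 2a: extend each seed forward in time through weaker segments.
    for i in range(1, n):
        if mask[i - 1] and not mask[i] and scores[i] >= low:
            mask[i] = True

    # Step 2b: extend each seed backward in time through weaker segments.
    for i in range(n - 2, -1, -1):
        if mask[i + 1] and not mask[i] and scores[i] >= low:
            mask[i] = True

    return mask


if __name__ == "__main__":
    # Made-up example scores for eight consecutive audio segments.
    example_scores = [0.1, 0.5, 0.8, 0.5, 0.3, 0.5, 0.2, 0.9]
    print(hysteresis_voice_mask(example_scores))
```

Under these assumptions, a segment with a middling score is kept only when it borders a confidently voiced region, which tends to suppress isolated weak detections while preserving the softer onsets and tails of speech. In the made-up example, the segment at index 5 scores 0.5 but is not marked as voice because no neighboring run connects it to a segment scoring at or above the high threshold.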