Patent attributes
A method for audio processing includes receiving, in a computer, a recording of a teleconference held among multiple participants over a network. The recording includes an audio stream containing speech uttered by the participants, together with conference metadata for controlling a display on video screens viewed by the participants during the teleconference. The computer processes the audio stream to identify speech segments, in which one or more of the participants were speaking, interspersed with intervals of silence. The conference metadata are parsed to extract speaker identifications, which indicate the participants who spoke during successive periods of the teleconference. The teleconference is then diarized by labeling the identified speech segments from the audio stream with the speaker identifications extracted from the corresponding periods of the teleconference.
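The diarization step described above can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how speech segments are detected or how metadata periods are matched to them. The sketch assumes that per-frame energies are already available, that speech is any frame at or above a fixed energy threshold, and that the metadata have already been parsed into `(start, end, speaker)` periods. Each segment takes the speaker whose period overlaps it most. The names `Segment`, `find_speech_segments`, and `label_segments` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Segment:
    start: float                   # seconds from the start of the recording
    end: float
    speaker: Optional[str] = None  # filled in by label_segments


def find_speech_segments(frame_energies: Sequence[float],
                         frame_sec: float = 0.5,
                         threshold: float = 0.1) -> List[Segment]:
    """Group consecutive frames at or above `threshold` into speech segments.

    Assumption: a fixed energy threshold stands in for whatever speech
    detector a real system would use.
    """
    segments: List[Segment] = []
    seg_start: Optional[float] = None
    for i, energy in enumerate(frame_energies):
        if energy >= threshold and seg_start is None:
            seg_start = i * frame_sec                 # speech begins
        elif energy < threshold and seg_start is not None:
            segments.append(Segment(seg_start, i * frame_sec))
            seg_start = None                          # silence begins
    if seg_start is not None:                         # speech ran to the end
        segments.append(Segment(seg_start, len(frame_energies) * frame_sec))
    return segments


def label_segments(segments: List[Segment],
                   speaker_periods: Sequence[Tuple[float, float, str]]
                   ) -> List[Segment]:
    """Label each segment with the speaker whose metadata period overlaps it most.

    Assumption: `speaker_periods` holds (start, end, name) tuples already
    extracted from the conference metadata. Segments with no overlapping
    period keep speaker=None.
    """
    for seg in segments:
        best_overlap = 0.0
        for p_start, p_end, name in speaker_periods:
            overlap = min(seg.end, p_end) - max(seg.start, p_start)
            if overlap > best_overlap:
                best_overlap = overlap
                seg.speaker = name
    return segments


if __name__ == "__main__":
    energies = [0.0, 0.5, 0.6, 0.0, 0.0, 0.7, 0.8, 0.9, 0.0]  # one value per 1 s frame
    periods = [(0.0, 4.0, "Alice"), (4.0, 9.0, "Bob")]        # made-up metadata
    for seg in label_segments(find_speech_segments(energies, frame_sec=1.0), periods):
        print(seg)
```

With the made-up input in the usage block, the sketch yields two segments, 1.0 to 3.0 s labeled "Alice" and 5.0 to 8.0 s labeled "Bob". Choosing the period with the greatest overlap is only one possible matching rule. A real system would also need to handle segments that span a change of speaker.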