Patent attributes
Systems and methods provide for capturing a plurality of segments of an audio stream and, for each segment of the plurality of segments of the audio stream: performing feature extraction on an audio signal of the segment using a feature extraction machine learning model that analyzes the audio signal to generate a feature vector for the segment; and generating, using the extracted feature vector and a music detector machine learning model, a prediction value indicating whether there is music in the segment. The systems and methods further provide for generating a probability value that there is music in the audio stream based on the prediction value for each of the plurality of segments, and causing the audio stream to be identified based on determining that the probability value meets a predetermined threshold.
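
The flow described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the functions `toy_feature_extractor` and `toy_music_detector` are hand-written placeholders standing in for the two machine learning models, the weights are made up, the per-segment prediction values are combined by a simple mean (the abstract does not state how they are combined), and the threshold of 0.5 is arbitrary.

```python
import math
from typing import Callable, List, Sequence

Segment = Sequence[float]
FeatureVector = List[float]


def split_into_segments(samples: Sequence[float], segment_len: int) -> List[Segment]:
    """Split a captured stream into fixed-length segments (a trailing partial segment is dropped)."""
    return [samples[i:i + segment_len]
            for i in range(0, len(samples) - segment_len + 1, segment_len)]


def toy_feature_extractor(segment: Segment) -> FeatureVector:
    """Hypothetical stand-in for the feature extraction model: RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    if len(segment) < 2:
        return [rms, 0.0]
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(segment) - 1)]


def toy_music_detector(features: FeatureVector) -> float:
    """Hypothetical stand-in for the music detector model: logistic score with made-up weights."""
    rms, zcr = features
    z = 4.0 * rms - 6.0 * zcr - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def music_probability(samples: Sequence[float],
                      segment_len: int,
                      extract: Callable[[Segment], FeatureVector],
                      detect: Callable[[FeatureVector], float]) -> float:
    """Run both models on every segment and combine the prediction values.

    The mean is an assumed aggregation rule chosen for simplicity.
    """
    segments = split_into_segments(samples, segment_len)
    if not segments:
        return 0.0
    predictions = [detect(extract(seg)) for seg in segments]
    return sum(predictions) / len(predictions)


def should_identify(probability: float, threshold: float = 0.5) -> bool:
    """Trigger identification when the stream-level probability meets the threshold."""
    return probability >= threshold


if __name__ == "__main__":
    # One second of a 220 Hz tone sampled at 8 kHz, split into eight segments.
    tone = [math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
    p = music_probability(tone, 1000, toy_feature_extractor, toy_music_detector)
    print(should_identify(p))
```

In a real system the two placeholder callables would be replaced by trained models; the surrounding structure (segment, extract, predict, aggregate, compare against a threshold) is the only part this sketch is meant to convey.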