Patent attributes
Systems and methods for improving machine learning systems used to model topics on a plurality of calls are described herein. In an embodiment, a server computer receives a plurality of digitally stored call transcripts that have been prepared from digitally recorded voice calls. The server computer uses a topic model of an artificial intelligence machine learning system, the topic model modeling words of a call as a function of one or more word distributions for each topic of a plurality of topics, to generate an output of the topic model which identifies the plurality of topics represented in the plurality of call transcripts. The server computer computes, for a particular topic of the plurality of topics, a first value representing a vocabulary of the particular topic and a second value representing a consistency of the particular topic in two or more call transcripts of the plurality of call transcripts which include the particular topic. Based, at least in part, on one or more of the first value or the second value, the server computer determines that the particular topic meets a particular criterion and, in response, updates the output of the topic model to remove the particular topic or distinguish the particular topic from other topics of the plurality of topics which do not meet the particular criterion.
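The two per-topic values and the criterion check described above can be illustrated with a minimal sketch. The abstract does not define how either value is computed, so everything here is an assumption made for illustration: the first value is taken to be the Shannon entropy of the topic's word distribution, the second value is taken to be the mean pairwise Jaccard similarity of the words attributed to the topic in each call, and the function names, data layout, and thresholds (`vocabulary_value`, `consistency_value`, `flag_topics`, `max_entropy`, `min_consistency`) are hypothetical.

```python
from itertools import combinations
from math import log


def vocabulary_value(word_probs):
    """First value (assumed): Shannon entropy, in nats, of a topic's word distribution."""
    return -sum(p * log(p) for p in word_probs.values() if p > 0)


def _jaccard(a, b):
    """Jaccard similarity of two word sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def consistency_value(call_word_sets):
    """Second value (assumed): mean pairwise Jaccard similarity of the words
    attributed to the topic across the calls that include it.
    Returns None when fewer than two calls include the topic."""
    pairs = list(combinations(call_word_sets, 2))
    if not pairs:
        return None
    return sum(_jaccard(a, b) for a, b in pairs) / len(pairs)


def flag_topics(topics, usage, max_entropy, min_consistency):
    """Return the topics that meet the (assumed) criterion: a vocabulary that is
    too diffuse, or usage that is too inconsistent across calls.

    topics: {topic: {word: probability}}
    usage:  {topic: [set of words attributed to the topic in one call, ...]}
    """
    flagged = set()
    for topic, word_probs in topics.items():
        first = vocabulary_value(word_probs)
        second = consistency_value(usage.get(topic, []))
        too_diffuse = first > max_entropy
        inconsistent = second is not None and second < min_consistency
        if too_diffuse or inconsistent:
            flagged.add(topic)
    return flagged
```

A caller could then either drop the flagged topics from the topic model output or keep them and mark them as distinct from the remaining topics, which corresponds to the two update options named in the abstract. The thresholds would need to be chosen for the data at hand; none are given in the source.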