Patent 11282524 was granted by the United States Patent and Trademark Office and assigned to Capital One in March 2022.
A device may receive a set of audio data files corresponding to a set of calls, wherein the set of audio data files includes digital representations of one or more segments of respective calls of the set of calls, and wherein the set of calls includes audio data relating to a particular industry. The device may receive a set of transcripts corresponding to the set of audio data files. The device may determine a plurality of text-audio pairs within the set of calls, wherein a text-audio pair, of the plurality of text-audio pairs, comprises: a digital representation of a segment of a call of the set of calls, and a corresponding excerpt of text from the set of transcripts. The device may train, using a machine learning process, an industry-specific text-to-speech model, tailored for the particular industry, based on the plurality of text-audio pairs.
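The pairing step described above can be illustrated with a minimal sketch. The abstract does not say how an audio segment is matched to its transcript excerpt, so the sketch assumes that both carry a call identifier and start/end timestamps and pairs them by time overlap. All names here (`AudioSegment`, `TranscriptExcerpt`, `TextAudioPair`, `build_text_audio_pairs`, `min_overlap`) are hypothetical and do not come from the patent. The training step is omitted; the resulting pairs would simply be the input to whatever machine learning process is used.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AudioSegment:
    """Digital representation of one segment of a call (hypothetical layout)."""
    call_id: str
    start_s: float
    end_s: float
    samples: bytes


@dataclass(frozen=True)
class TranscriptExcerpt:
    """Excerpt of transcript text with assumed timestamps."""
    call_id: str
    start_s: float
    end_s: float
    text: str


@dataclass(frozen=True)
class TextAudioPair:
    text: str
    audio: AudioSegment


def build_text_audio_pairs(
    segments: List[AudioSegment],
    excerpts: List[TranscriptExcerpt],
    min_overlap: float = 0.5,
) -> List[TextAudioPair]:
    """Pair each segment with the excerpt of the same call that overlaps it most.

    Assumption: a pair is kept only if the overlap covers at least
    `min_overlap` of the segment's duration.
    """
    # Group excerpts by call so each segment is only compared within its call.
    by_call: Dict[str, List[TranscriptExcerpt]] = {}
    for ex in excerpts:
        by_call.setdefault(ex.call_id, []).append(ex)

    pairs: List[TextAudioPair] = []
    for seg in segments:
        duration = seg.end_s - seg.start_s
        if duration <= 0:
            continue  # skip malformed segments
        best = None
        best_overlap = 0.0
        for ex in by_call.get(seg.call_id, []):
            overlap = min(seg.end_s, ex.end_s) - max(seg.start_s, ex.start_s)
            if overlap > best_overlap:
                best, best_overlap = ex, overlap
        if best is not None and best_overlap / duration >= min_overlap:
            pairs.append(TextAudioPair(text=best.text, audio=seg))
    return pairs
```

Segments with no sufficiently overlapping excerpt are dropped rather than paired with a poor match, on the assumption that noisy pairs would hurt training more than a slightly smaller data set.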