Patent attributes
Mitigating mistranscriptions resolves errors in a transcription of the audio portion of a video based on a semantic matching with contextualized data electronically garnered from one or more sources other than the audio portion of the video. A mistranscription is identified using a pretrained word embedding model that maps words to an embedding space derived from the contextualizing data. A similarity value for each vocabulary word of a multi-word vocabulary of the pretrained word embedding model is determined in relation to the mistranscription. Candidate words are selected based on the similarity values, each indicating a closeness of a corresponding vocabulary word to the mistranscription. The textual rendering is modified by replacing the mistranscription with a candidate word that, based on average semantic similarity values, is more similar to the mistranscription than is each other candidate word.