Patent attributes
Embodiments of the present invention provide a computer-implemented method for generating closed captions via optimal positioning and character-specific output styles. The method includes receiving a video input. The method generates closed caption data from the video input based at least in part on extracting text data from an audio portion of the video input. For each given frame of the video input that has closed caption data associated with the given frame, one or more characters who are speaking in the given frame are identified via facial recognition and audio tone matching. For each of the one or more identified characters, a respective text style that uniquely distinguishes the given character from the other identified characters is obtained. Captioning is generated in the respective text style of each of the one or more identified characters. The generated captioning is then inserted into the given frame.
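The Python sketch below illustrates one possible organization of the pipeline described above, under stated assumptions: every name in it (VideoInput-style arguments, transcribe_audio, identify_speaker, render_caption, CaptionStyle) is a hypothetical placeholder introduced for illustration, since the embodiment does not name any particular library or API, and the speech-to-text, facial-recognition, and tone-matching steps are stubbed out.

```python
"""Minimal sketch of the captioning pipeline described in the embodiment.
All helper names are hypothetical placeholders, not APIs from the source."""

from dataclasses import dataclass


@dataclass
class CaptionStyle:
    """A per-character style that uniquely identifies the speaker."""
    color: str                 # distinct color per character
    font: str
    position: tuple[int, int]  # caption anchor within the frame


@dataclass
class CaptionSegment:
    """Caption text associated with a particular frame."""
    frame_index: int
    text: str


def transcribe_audio(audio_track) -> list[CaptionSegment]:
    """Assumed speech-to-text step: extract text data from the audio
    portion of the video input, keyed by frame index."""
    return [CaptionSegment(frame_index=0, text="Hello there.")]


def identify_speaker(frame, audio_track, segment: CaptionSegment) -> str:
    """Assumed fusion of facial recognition (who appears in the frame)
    and audio tone matching (whose voice profile fits the segment).
    Simplified here to a single speaker per segment."""
    return "character_a"


def style_for(character_id: str, registry: dict[str, CaptionStyle]) -> CaptionStyle:
    """Obtain (or lazily create) the unique text style for a character."""
    if character_id not in registry:
        palette = ["#ffd700", "#00bfff", "#ff6347"]
        registry[character_id] = CaptionStyle(
            color=palette[len(registry) % len(palette)],
            font="sans-serif",
            # Stagger positions so concurrent speakers do not overlap.
            position=(40, 600 + 40 * len(registry)),
        )
    return registry[character_id]


def render_caption(frame, text: str, style: CaptionStyle):
    """Assumed rasterization step: insert styled caption text into the frame."""
    print(f"frame caption: '{text}' in {style.color} at {style.position}")
    return frame


def caption_video(frames: list, audio_track) -> list:
    """Run the full pipeline: transcribe, identify speakers per captioned
    frame, style per character, and insert captions into frames."""
    styles: dict[str, CaptionStyle] = {}
    by_frame = {s.frame_index: s for s in transcribe_audio(audio_track)}
    out = []
    for i, frame in enumerate(frames):
        segment = by_frame.get(i)
        if segment is not None:
            speaker = identify_speaker(frame, audio_track, segment)
            frame = render_caption(frame, segment.text, style_for(speaker, styles))
        out.append(frame)
    return out
```

A lazily populated style registry, as sketched in style_for, is one way to satisfy the requirement that each character's text style be unique: styles are assigned in order of first appearance, so the mapping stays stable across the whole video.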