Aspects described herein may allow an automated generation of an interactive multimedia content with annotations showing vehicle damage. In one method, a server may receive vehicle-specific identifying information of a vehicle. Image sensors may capture multimedia content showing aspects associated with exterior regions of the vehicle, and may send the multimedia content to the server. For each of the exterior regions of the vehicle, the server may determine, using a trained classification model, instances of damage. Furthermore, the server may generate an interactive multimedia content that shows images with annotations indicating instances of damage. The interactive multimedia content may be displayed via a user interface.