A vision system of a vehicle includes a camera disposed at the vehicle and having a field of view exterior of the vehicle. The camera includes an imager and a processor that is operable to process image data captured by the imager. A video display screen is operable to display video images derived from image data captured by the camera. The processor generates a graphic overlay for display with the video images at the video display screen. The processor calibrates the graphic overlay by utilizing a spatial transform engine to adapt the view orientation and position of the displayed video images to a corrected orientation and position.
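
The calibration step can be illustrated with a minimal sketch. The abstract does not specify what kind of transform the spatial transform engine applies, so this example assumes a rigid 2D correction (a rotation about a chosen center followed by a shift) applied to each frame so that a fixed graphic overlay lines up with the scene. The function names `correction_matrix` and `warp_image`, and their parameters, are hypothetical and chosen only for illustration.

```python
import numpy as np

def correction_matrix(angle_deg, dx, dy, center):
    """Build a 3x3 homogeneous matrix (assumed rigid model):
    rotate about `center` by angle_deg, then shift by (dx, dy) pixels."""
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    cx, cy = center
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=float)
    rotate = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    back_and_shift = np.array([[1, 0, cx + dx], [0, 1, cy + dy], [0, 0, 1]], dtype=float)
    return back_and_shift @ rotate @ to_origin

def warp_image(image, matrix):
    """Warp a frame by inverse mapping with nearest-neighbor sampling.
    Output pixels whose source falls outside the frame are set to 0."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates (x, y, 1) of every output pixel.
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Map each output pixel back to its source location in the input frame.
    src = np.linalg.inv(matrix) @ dst
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    flat_in = image.reshape(h * w, -1)
    flat_out = np.zeros_like(flat_in)
    flat_out[valid] = flat_in[sy[valid] * w + sx[valid]]
    return flat_out.reshape(image.shape)

# Example: shift a frame 1 pixel right and 2 pixels down with no rotation.
frame = np.zeros((4, 4), dtype=np.uint8)
frame[1, 1] = 9
corrected = warp_image(frame, correction_matrix(0.0, 1, 2, center=(0, 0)))
```

In this sketch the overlay stays fixed and the video frame is moved to the corrected position and orientation, which matches the wording of the abstract. A production system would more likely use a perspective or lens-distortion model with interpolated sampling, but the abstract does not say which.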