Embodiments are described for detecting optical discrepancies associated with image capture by analyzing pixels in multiple images that correspond to common points of reference in a physical environment. In an embodiment, photometric error values between a stereo pair of images are averaged over time to compute the mean error at each pixel. Once the estimate of the mean error has been updated a sufficient number of times (i.e., the number of updates exceeds a specified value), the estimate is thresholded to provide a mask of any optical discrepancies occurring in the stereo pair of images. Applications include detecting optical discrepancies in images captured for use by a visual navigation system in guiding an autonomous vehicle (e.g., an unmanned aerial vehicle).
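The averaging and thresholding steps described above can be illustrated with a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the claimed implementation: the class name `DiscrepancyDetector`, the parameters `min_updates` and `error_threshold`, the use of absolute intensity difference as the photometric error, and the assumption that the second image has already been warped into the first image's frame (with a `valid` mask marking pixels that have a correspondence) are all hypothetical choices made for the example.

```python
import numpy as np


class DiscrepancyDetector:
    """Per-pixel running mean of photometric error, thresholded into a mask.

    All names and default values here are illustrative assumptions.
    """

    def __init__(self, shape, min_updates=30, error_threshold=0.2):
        self.mean_error = np.zeros(shape, dtype=np.float64)
        self.update_count = np.zeros(shape, dtype=np.int64)
        self.min_updates = min_updates          # assumed "specified value"
        self.error_threshold = error_threshold  # assumed mean-error threshold

    def update(self, left, warped_right, valid):
        """Fold one stereo pair into the running mean.

        left, warped_right: float images of identical shape; warped_right is
        assumed to be already aligned to left. valid: boolean array marking
        pixels with a usable correspondence.
        """
        # Assumed photometric error: absolute intensity difference.
        error = np.abs(left.astype(np.float64) - warped_right.astype(np.float64))
        self.update_count[valid] += 1
        n = self.update_count[valid]
        # Incremental mean: m_n = m_{n-1} + (x_n - m_{n-1}) / n
        self.mean_error[valid] += (error[valid] - self.mean_error[valid]) / n

    def mask(self):
        """Return True where enough updates exist and the mean error is high."""
        ready = self.update_count > self.min_updates
        return ready & (self.mean_error > self.error_threshold)
```

Pixels that have not yet accumulated more than `min_updates` updates are never flagged, so a single noisy frame cannot produce a discrepancy on its own; a persistent difference between the two views at the same pixel (for example, a smudge on one lens) keeps the running mean high and eventually appears in the mask.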