Patent attributes
A machine learning model is trained and used to perform a computer vision task such as semantic segmentation or normal direction prediction. The model uses a current image of a physical setting and input generated from three-dimensional (3D) anchor points that store information determined from prior assessments of the physical setting. The 3D anchor points store previously determined computer vision task information for the physical setting at particular 3D point locations in a 3D world space, e.g., an x, y, z coordinate system that is independent of image capture device pose. For example, 3D anchor points may store previously determined semantic labels or normal directions for 3D points identified by simultaneous localization and mapping (SLAM) processes. The 3D anchor points are stored and used to generate input for the machine learning model as the model continues to reason about future images of the physical setting.
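
The abstract does not say how the input is generated from the 3D anchor points. One plausible approach, shown here only as a minimal sketch, is to project the stored anchors into the current camera view and rasterize their labels into a sparse per-pixel map that could accompany the current image as an additional model input. The pinhole camera model, the function name `project_anchors_to_label_map`, its parameters, and the `-1` "unknown" value are all assumptions made for illustration and are not taken from the source.

```python
import numpy as np


def project_anchors_to_label_map(points_world, labels, K, R, t,
                                 height, width, unknown=-1):
    """Rasterize stored 3D anchor labels into a sparse per-pixel label map.

    Hypothetical helper, not from the source document.
    points_world: (N, 3) anchor positions in pose-independent world coordinates.
    labels:       (N,) integer semantic labels stored with the anchors.
    K:            (3, 3) pinhole intrinsics (assumed camera model).
    R, t:         world-to-camera rotation (3, 3) and translation (3,).
    """
    label_map = np.full((height, width), unknown, dtype=np.int64)
    depth_map = np.full((height, width), np.inf)

    # World coordinates -> camera coordinates for the current device pose.
    points_cam = points_world @ R.T + t

    # Discard anchors behind the camera.
    in_front = points_cam[:, 2] > 0
    points_cam = points_cam[in_front]
    kept_labels = labels[in_front]

    # Camera coordinates -> pixel coordinates (perspective division).
    proj = points_cam @ K.T
    u = np.round(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.round(proj[:, 1] / proj[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)

    # Keep the nearest anchor when several land on the same pixel.
    for ui, vi, zi, li in zip(u[inside], v[inside],
                              points_cam[inside, 2], kept_labels[inside]):
        if zi < depth_map[vi, ui]:
            depth_map[vi, ui] = zi
            label_map[vi, ui] = li
    return label_map


# Example with made-up values: four anchors, identity camera pose.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
anchors = np.array([[0.0, 0.0, 2.0],    # projects to the image center
                    [0.1, 0.0, 1.0],    # projects right of center
                    [0.0, 0.0, -1.0],   # behind the camera, dropped
                    [0.0, 0.0, 4.0]])   # same pixel as the first, but farther
anchor_labels = np.array([3, 5, 7, 9])
label_map = project_anchors_to_label_map(
    anchors, anchor_labels, K, np.eye(3), np.zeros(3), height=48, width=64)
```

Because the anchors live in world space, the same stored set can be re-projected for each new device pose. In this sketch the resulting map would simply be stacked with the image channels; normal directions could be rasterized the same way with a three-channel map.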