Patent attributes
Systems, methods, and other embodiments described herein relate to generating depth estimates of an environment depicted in a monocular image. In one embodiment, a method includes identifying semantic features in the monocular image according to a semantic model. The method includes injecting the semantic features into a depth model using pixel-adaptive convolutions. The method includes generating a depth map from the monocular image using the depth model that is guided by the semantic features. The pixel-adaptive convolutions are integrated into a decoder of the depth model. The method includes providing the depth map as the depth estimates for the monocular image.