Patent attributes
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for estimating a 3-D pose of an object of interest from image and point cloud data. In one aspect, a method includes obtaining an image of an environment; obtaining a point cloud of a three-dimensional region of the environment; generating a fused representation of the image and the point cloud; and processing the fused representation using a pose estimation neural network and in accordance with current values of a plurality of pose estimation network parameters to generate a pose estimation network output that specifies, for each of multiple keypoints, a respective estimated position in the three-dimensional region of the environment.
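The four claimed steps (obtain an image, obtain a point cloud, fuse them, run a pose estimation network) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the fusion rule (projecting each point through a pinhole camera and appending the colour of the pixel it lands on), the intrinsics matrix `K`, the function names `fuse`, `init_params`, and `pose_network`, and the tiny per-point network with max pooling. The weights are random and untrained, so the output shows only the shape of the result (one 3-D position per keypoint), not a meaningful pose.

```python
import numpy as np


def fuse(image, points, K):
    """Hypothetical fusion: append to each 3-D point the RGB of its pixel.

    image:  (H, W, 3) float array.
    points: (N, 3) array in the camera frame (z pointing forward).
    K:      (3, 3) pinhole intrinsics (assumed known).
    Returns an (M, 6) array [x, y, z, r, g, b] for the M points in view.
    """
    h, w, _ = image.shape
    pts = points[points[:, 2] > 0]           # keep points in front of the camera
    uvw = pts @ K.T                          # homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]            # perspective divide
    u = np.floor(uv[:, 0]).astype(int)       # column index
    v = np.floor(uv[:, 1]).astype(int)       # row index
    visible = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    colours = image[v[visible], u[visible]]
    return np.concatenate([pts[visible], colours], axis=1)


def init_params(rng, num_keypoints, hidden=32):
    """Random stand-in for the 'current values' of the network parameters."""
    return {
        "w1": rng.normal(0.0, 0.1, (6, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, num_keypoints * 3)),
        "b2": np.zeros(num_keypoints * 3),
    }


def pose_network(fused, params):
    """Toy network: per-point features, max pool, then a linear keypoint head."""
    feats = np.maximum(fused @ params["w1"] + params["b1"], 0.0)  # ReLU
    pooled = feats.max(axis=0)               # invariant to point ordering
    out = pooled @ params["w2"] + params["b2"]
    return out.reshape(-1, 3)                # one (x, y, z) row per keypoint


# Synthetic inputs standing in for a camera image and a lidar sweep.
rng = np.random.default_rng(0)
image = rng.random((48, 64, 3))
K = np.array([[50.0, 0.0, 32.0],
              [0.0, 50.0, 24.0],
              [0.0, 0.0, 1.0]])
points = rng.uniform([-1.0, -1.0, 2.0], [1.0, 1.0, 6.0], (500, 3))

fused = fuse(image, points, K)
keypoints = pose_network(fused, init_params(rng, num_keypoints=17))
print(keypoints.shape)  # (17, 3)
```

The choice of 17 keypoints is arbitrary. A real system would learn the parameters from labelled data and could fuse the two modalities in other ways, for example by combining learned image features with point features rather than raw colours.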