Various implementations disclosed herein include devices, systems, and methods that: obtain a three-dimensional (3D) representation of a physical environment, the 3D representation having been generated based on depth data and light intensity image data; generate a 3D bounding box corresponding to an object in the physical environment based on the 3D representation; classify the object based on the 3D bounding box and 3D semantic data; and display a measurement of the object, where the measurement is determined using one of a plurality of class-specific neural networks selected based on the classification of the object.
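
The selection step described above, in which the classification of the object determines which class-specific network produces the measurement, may be sketched as a simple dispatch table. The following is an illustrative sketch only and not the disclosed implementation: the names `BoundingBox3D`, `classify_object`, `MEASURERS`, and `measure_object` are hypothetical, the class labels ("table", "chair", "generic") are assumed for illustration, a y-up axis convention is assumed, and plain functions stand in for the class-specific neural networks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class BoundingBox3D:
    """Axis-aligned 3D bounding box (hypothetical type; y is assumed to be up)."""
    min_corner: Vec3
    max_corner: Vec3

    def extents(self) -> Vec3:
        # Size of the box along x, y, and z.
        dx, dy, dz = (hi - lo for lo, hi in zip(self.min_corner, self.max_corner))
        return (dx, dy, dz)


def classify_object(box: BoundingBox3D, semantic_label: str) -> str:
    """Stand-in classifier using the bounding box and a semantic label.

    A real system would run a learned classifier; here a degenerate box or
    an unknown label simply falls back to the "generic" class.
    """
    if any(e <= 0 for e in box.extents()):
        return "generic"
    return semantic_label if semantic_label in ("table", "chair") else "generic"


# Each function below is a placeholder for one class-specific neural network.
def measure_table(box: BoundingBox3D) -> Dict[str, float]:
    w, h, d = box.extents()
    return {"width": w, "depth": d, "height": h}


def measure_chair(box: BoundingBox3D) -> Dict[str, float]:
    _, h, _ = box.extents()
    return {"height": h}


def measure_generic(box: BoundingBox3D) -> Dict[str, float]:
    w, h, d = box.extents()
    return {"width": w, "height": h, "depth": d}


# Dispatch table: object class -> measurement model selected for that class.
MEASURERS: Dict[str, Callable[[BoundingBox3D], Dict[str, float]]] = {
    "table": measure_table,
    "chair": measure_chair,
    "generic": measure_generic,
}


def measure_object(box: BoundingBox3D, semantic_label: str) -> Tuple[str, Dict[str, float]]:
    """Classify the object, then measure it with the model selected for its class."""
    object_class = classify_object(box, semantic_label)
    return object_class, MEASURERS[object_class](box)


if __name__ == "__main__":
    box = BoundingBox3D((0.0, 0.0, 0.0), (1.2, 0.75, 0.8))
    print(measure_object(box, "table"))
```

In this sketch, adding support for a further object class amounts to registering one more entry in `MEASURERS`, which mirrors the notion of a plurality of class-specific networks from which one is selected.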