Disclosed herein is an environmental scanning tool that generates a digital model of the surroundings of a user of an extended reality head-mounted display device. The environment is imaged as a depth map and, for select objects of interest, also in visible light. The selected objects are represented in the digital model at higher fidelity and resolution than the remaining portions of the model, which keeps the storage size of the digital model manageable. In some cases, a remote user selects the objects of interest or directs their higher-fidelity scans. The digital model further includes time-stamped updates of the environment, such that users can view the state of the environment as of a selected timestamp.
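
By way of illustration only, the time-stamped, mixed-fidelity model described above might be organized as in the following minimal sketch. The disclosure does not specify a data layout; the names `Update`, `EnvironmentModel`, `region_id`, `high_fidelity`, `payload`, and `state_at` are hypothetical and introduced here solely for the example, and the string payload stands in for actual mesh or texture data.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Update:
    timestamp: float      # capture time (hypothetical unit: seconds)
    region_id: str        # hypothetical identifier for a portion of the environment
    high_fidelity: bool   # True for objects of interest (depth plus visible light)
    payload: str          # stand-in for mesh or texture data


@dataclass
class EnvironmentModel:
    # Maps each region to its updates, kept sorted by timestamp.
    history: dict = field(default_factory=dict)

    def add_update(self, update: Update) -> None:
        """Insert an update so the region's history stays in time order."""
        entries = self.history.setdefault(update.region_id, [])
        times = [u.timestamp for u in entries]
        entries.insert(bisect_right(times, update.timestamp), update)

    def state_at(self, timestamp: float) -> dict:
        """Return, per region, the latest update at or before `timestamp`."""
        state = {}
        for region_id, entries in self.history.items():
            times = [u.timestamp for u in entries]
            idx = bisect_right(times, timestamp)
            if idx > 0:  # the region had been scanned by this time
                state[region_id] = entries[idx - 1]
        return state
```

Under these assumptions, a viewer would call `state_at` with a chosen timestamp to reconstruct the environment as it was then scanned; regions flagged `high_fidelity` would carry the denser data, while the remainder of the model stays coarse to limit storage.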