Patent attributes
The technology described herein is directed to a cross-domain training framework that iteratively trains a domain adaptive refinement agent to refine low quality real-world image acquisition data, e.g., depth maps, when accompanied by corresponding conditional data from other modalities, such as the underlying images or video from which the image acquisition data is computed. The cross-domain training framework includes a shared cross-domain encoder and two conditional decoder branch networks, e.g., a synthetic conditional depth prediction branch network and a real conditional depth prediction branch network. The shared cross-domain encoder converts synthetic and real-world image acquisition data into synthetic and real compact feature representations, respectively. The synthetic and real conditional decoder branch networks convert the respective synthetic and real compact feature representations back to synthetic and real image acquisition data (refined versions) conditioned on data from the other modalities. The cross-domain training framework iteratively trains the domain adaptive refinement agent.