Patent attributes
Systems and methods for answering region-specific questions are provided. A method includes obtaining a regional scene question that includes an attribute query and a spatial region of interest for a training scene depicting a surrounding environment of a vehicle. The method includes obtaining a universal embedding for the training scene and an attribute embedding for the attribute query of the regional scene question. The universal embedding can identify sensory data corresponding to the training scene that can be used to answer questions concerning a number of different attributes in the training scene. The attribute embedding can identify aspects of an attribute that can be used to answer questions specific to that attribute. The method includes determining an answer embedding based on the universal embedding and the attribute embedding, and determining a regional scene answer to the regional scene question based on the spatial region of interest and the answer embedding.
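
The data flow described above can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration and is not taken from the disclosure: the universal embedding is modeled as a grid of per-cell feature vectors, the attribute embedding as a single vector of the same length, the fusion into an answer embedding as a per-cell dot product, and the regional scene answer as a thresholded mean over a rectangular region of interest. The names `answer_embedding` and `regional_answer` and the threshold value are likewise invented for this example.

```python
from typing import List, Tuple

Vector = List[float]
Grid = List[List[Vector]]  # grid[row][col] holds the feature vector of one scene cell


def answer_embedding(universal: Grid, attribute: Vector) -> List[List[float]]:
    """Fuse the universal and attribute embeddings into one score per cell.

    Assumption: fusion is a plain dot product per cell. The source does not
    specify the fusion operation.
    """
    return [
        [sum(u * a for u, a in zip(cell, attribute)) for cell in row]
        for row in universal
    ]


def regional_answer(
    scores: List[List[float]],
    region: Tuple[int, int, int, int],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> bool:
    """Answer yes/no for a region given as (row_start, row_end, col_start, col_end).

    End indices are exclusive. Assumption: the answer is the mean cell score
    inside the region compared against the threshold.
    """
    r0, r1, c0, c1 = region
    cells = [scores[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    if not cells:
        raise ValueError("region of interest is empty")
    return sum(cells) / len(cells) >= threshold


# Toy 2x2 scene with 2-dimensional features (made-up values).
universal = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
attribute = [1.0, 0.0]  # attribute query that responds to the first feature only

scores = answer_embedding(universal, attribute)
left_column = regional_answer(scores, (0, 2, 0, 1))
right_column = regional_answer(scores, (0, 2, 1, 2))
```

In this toy setup the same universal embedding could be reused with a different attribute vector to answer a question about another attribute, which mirrors the separation between the scene-level and attribute-level embeddings described above.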