Patent attributes
The present disclosure generally relates to methods and systems for controlling an autonomous vehicle. The vehicle may collect scenario information from one or more sensors mounted on the vehicle. The vehicle may determine a high-level option for a fixed time horizon based on the scenario information. The vehicle may apply a prediction algorithm to the high-level option to mask undesired low-level behaviors, namely those low-level behaviors for completing the high-level option for which a collision is predicted to occur. The vehicle may evaluate the resulting restricted subspace of low-level behaviors using a reinforcement learning system. The vehicle may be controlled to perform the high-level option by executing a low-level behavior selected from the restricted subspace. The vehicle may adjust the reinforcement learning system by evaluating a metric of the executed low-level behavior.
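
The mask, select, and adjust steps described above can be illustrated with a minimal sketch. The disclosure does not specify the learning algorithm, the form of the prediction algorithm, or the metric, so everything here is an assumption made for illustration: tabular Q-learning stands in for the reinforcement learning system, a caller-supplied function `predicts_collision` stands in for the prediction algorithm, a scalar `reward` stands in for the metric, and the names `mask_behaviors` and `MaskedQAgent` are hypothetical.

```python
import random
from typing import Callable, Dict, Hashable, List, Tuple

# Hypothetical encoding of a low-level behavior: (maneuver name, acceleration).
Behavior = Tuple[str, float]


def mask_behaviors(
    behaviors: List[Behavior],
    predicts_collision: Callable[[Behavior], bool],
) -> List[Behavior]:
    """Return the restricted subspace: behaviors with no predicted collision."""
    return [b for b in behaviors if not predicts_collision(b)]


class MaskedQAgent:
    """Tabular Q-learning stand-in (an assumption, not the disclosed system)."""

    def __init__(self, alpha: float = 0.5, gamma: float = 0.9,
                 epsilon: float = 0.0, seed: int = 0) -> None:
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q: Dict[Tuple[Hashable, Behavior], float] = {}
        self.rng = random.Random(seed)

    def select(self, state: Hashable, allowed: List[Behavior]) -> Behavior:
        """Pick a behavior from the restricted subspace only."""
        if not allowed:
            raise ValueError("every low-level behavior was masked")
        if self.rng.random() < self.epsilon:
            return self.rng.choice(allowed)
        # Greedy choice; ties resolve to the first behavior in the list.
        return max(allowed, key=lambda b: self.q.get((state, b), 0.0))

    def update(self, state: Hashable, behavior: Behavior, reward: float,
               next_state: Hashable, next_allowed: List[Behavior]) -> None:
        """Adjust the value estimate using the metric of the executed behavior."""
        best_next = max(
            (self.q.get((next_state, b), 0.0) for b in next_allowed),
            default=0.0,  # no allowed successor behaviors: treat as terminal
        )
        old = self.q.get((state, behavior), 0.0)
        target = reward + self.gamma * best_next
        self.q[(state, behavior)] = old + self.alpha * (target - old)
```

In this sketch the mask is applied before action selection, so a behavior with a predicted collision is never evaluated or executed, and the learning update takes its maximum only over the behaviors allowed in the next state.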