Patent attributes
During each of a plurality of iterations, a policy of a controller is updated and at least part of a process is controlled using the updated policy. Each updated policy is associated with a performance level of the controller in controlling the at least part of the process. For each iteration, the updated policy is determined using the associations, generated during one or more previous iterations, between the policies and the corresponding performance levels. The updated policy is optimized to have a highest likelihood of producing a positive change in the performance level of the controller, rather than a highest likelihood of producing a largest positive magnitude of change in that performance level relative to the previous iteration.
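
The distinction drawn above, between maximizing the likelihood of any improvement and maximizing the size of the improvement, can be illustrated with a minimal sketch. It is not the claimed method. It assumes, purely for illustration, that some model fitted to the earlier policy/performance associations yields a normal predictive distribution (a mean and a standard deviation) for each candidate policy; the abstract does not specify such a model. The names `prob_improvement`, `expected_improvement`, `select_policy`, and the candidate values are hypothetical, and expected improvement is used here only as a stand-in for a magnitude-seeking criterion.

```python
import math


def _normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_improvement(mean, std, current):
    """Probability that predicted performance exceeds the current level.

    Assumption: the prediction is normally distributed.
    """
    if std <= 0.0:
        return 1.0 if mean > current else 0.0
    return _normal_cdf((mean - current) / std)


def expected_improvement(mean, std, current):
    """Expected size of the improvement over the current level.

    Used only as a contrast: a magnitude-seeking criterion.
    """
    if std <= 0.0:
        return max(mean - current, 0.0)
    z = (mean - current) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - current) * _normal_cdf(z) + std * pdf


def select_policy(candidates, current, score):
    """Return the candidate that maximizes the given scoring function."""
    return max(candidates, key=lambda c: score(c["mean"], c["std"], current))


# Hypothetical predictions for two candidate policies (illustrative values).
candidates = [
    {"name": "cautious", "mean": 1.05, "std": 0.02},   # small but near-certain gain
    {"name": "aggressive", "mean": 1.20, "std": 0.50},  # larger but uncertain gain
]
current_performance = 1.0

likely = select_policy(candidates, current_performance, prob_improvement)
largest = select_policy(candidates, current_performance, expected_improvement)
print(likely["name"], largest["name"])
```

Under these assumed numbers, the likelihood-based criterion prefers the small, near-certain gain, while the magnitude-based criterion prefers the larger, riskier one. In an iterative setting, the selected policy would be applied to the process, its observed performance recorded, and the predictions refreshed before the next iteration.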