Described herein are systems and methods for inverse reinforcement learning that leverage the benefits of both model-based optimization methods and model-free learning methods. Embodiments of a framework combining a human behavior model with model predictive control are presented. The framework takes advantage of the feature identification capability of a neural network to determine the reward function of the model predictive control. Furthermore, embodiments of the present approach are implemented to solve a practical autonomous driving longitudinal control problem with simultaneous preferences for safe execution and passenger comfort.
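
By way of illustration only, the combination of a neural-network reward with model predictive control for longitudinal control may be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the weights `W1` and `W2` are hand-set placeholders standing in for weights that would be learned from human driving data, and the feature scaling, the desired-gap rule (5 m plus 1.5 s of headway), the constant-speed lead vehicle, the candidate acceleration grid, and the names `learned_reward` and `mpc_accel` are all hypothetical.

```python
import numpy as np

# Placeholder weights for a one-hidden-layer ReLU network (hypothetical).
# In an inverse reinforcement learning setting these would be fitted to
# human demonstrations; here they are hand-set so the sketch is runnable.
W1 = np.array([[1, 0, 0], [-1, 0, 0],
               [0, 1, 0], [0, -1, 0],
               [0, 0, 1], [0, 0, -1]], dtype=float)
W2 = np.array([1.0, 1.0, 0.5, 0.5, 0.2, 0.2])


def learned_reward(gap, v, v_lead, accel):
    """Reward from a small network over safety and comfort features."""
    desired_gap = 5.0 + 1.5 * v  # assumed headway rule, not from the source
    features = np.array([(gap - desired_gap) / 10.0,  # gap error (safety)
                         (v_lead - v) / 5.0,          # relative speed
                         accel / 3.0])                # acceleration (comfort)
    hidden = np.maximum(0.0, W1 @ features)           # ReLU hidden layer
    return -float(W2 @ hidden)                        # reward = negative cost


def mpc_accel(gap, v, v_lead, horizon=10, dt=0.2):
    """Pick the acceleration with the highest predicted return.

    Simplification: each candidate is held constant over the horizon and
    the lead vehicle is assumed to keep its current speed.
    """
    best_a, best_return = 0.0, -np.inf
    for a in np.linspace(-3.0, 2.0, 11):              # candidate accelerations, m/s^2
        g, s, total = gap, v, 0.0
        for _ in range(horizon):
            g += (v_lead - s) * dt                    # gap dynamics
            s = max(0.0, s + a * dt)                  # ego speed, no reversing
            total += learned_reward(g, s, v_lead, a)
        if total > best_return:
            best_a, best_return = float(a), total
    return best_a


# Ego vehicle too close to a slower lead vehicle.
print(mpc_accel(gap=8.0, v=20.0, v_lead=15.0))
# Ego vehicle far behind a faster lead vehicle.
print(mpc_accel(gap=80.0, v=10.0, v_lead=15.0))
```

In this sketch the controller would be called once per control step with the current measurements, and only the returned acceleration is applied before the optimization is repeated, which is the receding-horizon pattern typical of model predictive control.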