Systems and methods for implementing deterministic simulation for autonomous vehicle testing can include an autonomy bookkeeper system configured to generate data logs that include inputs and outputs for each of a first plurality of tasks associated with an autonomy stack. The data logs can be generated upon detection of events such as failed implementation of an autonomy stack. A simulation conductor system can be configured to access the data logs as part of implementing offline testing of an autonomy testing scenario including a second plurality of tasks. A task controller within the simulation conductor system can schedule the second plurality of tasks into a task order determined at least in part from the first plurality of tasks (e.g., based on bookmarks stored in the data logs obtained during implementation of the first plurality of tasks). The flow of inputs to and outputs from the second plurality of tasks can be based at least in part on the task order.
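The log-then-replay flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the names `TaskRecord`, `AutonomyBookkeeper`, `TaskController`, the integer `bookmark` field, and the toy `perception` and `planning` tasks are all hypothetical, and the sketch assumes that a bookmark is simply a sequence number assigned when a task is logged. Under those assumptions, the bookkeeper records each task's inputs and outputs, and the task controller schedules the replayed tasks in bookmark order and feeds each one its logged inputs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

TaskFn = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass(frozen=True)
class TaskRecord:
    """One logged task execution (hypothetical log schema)."""
    bookmark: int              # assumed: sequence number in the original run
    task_name: str
    inputs: Dict[str, Any]
    outputs: Dict[str, Any]


class AutonomyBookkeeper:
    """Collects inputs and outputs for each task of the first plurality."""

    def __init__(self) -> None:
        self._next_bookmark = 0
        self.records: List[TaskRecord] = []

    def record(self, task_name: str, inputs: Dict[str, Any],
               outputs: Dict[str, Any]) -> None:
        self.records.append(
            TaskRecord(self._next_bookmark, task_name, dict(inputs), dict(outputs)))
        self._next_bookmark += 1


class TaskController:
    """Schedules the second plurality of tasks in the logged bookmark order."""

    def __init__(self, records: List[TaskRecord]) -> None:
        # The task order is derived from the bookmarks, not from arrival order.
        self._ordered = sorted(records, key=lambda r: r.bookmark)

    def order(self) -> List[str]:
        return [r.task_name for r in self._ordered]

    def replay(self, tasks: Dict[str, TaskFn]) -> List[str]:
        """Re-run each task on its logged inputs; return any output mismatches."""
        mismatches = []
        for rec in self._ordered:
            replayed = tasks[rec.task_name](dict(rec.inputs))
            if replayed != rec.outputs:
                mismatches.append(f"{rec.bookmark}:{rec.task_name}")
        return mismatches


# Toy stand-ins for autonomy stack tasks (purely illustrative).
def perception(inputs: Dict[str, Any]) -> Dict[str, Any]:
    return {"obstacle_distance_m": inputs["lidar_range_m"] - 0.5}


def planning(inputs: Dict[str, Any]) -> Dict[str, Any]:
    return {"brake": inputs["obstacle_distance_m"] < 10.0}


TASKS: Dict[str, TaskFn] = {"perception": perception, "planning": planning}


def run_demo():
    # "Online" run: log each task as it executes.
    bookkeeper = AutonomyBookkeeper()
    p_in = {"lidar_range_m": 8.0}
    p_out = perception(p_in)
    bookkeeper.record("perception", p_in, p_out)
    bookkeeper.record("planning", p_out, planning(p_out))

    # "Offline" run: records arrive out of order, bookmarks restore the order.
    controller = TaskController(list(reversed(bookkeeper.records)))
    return controller.order(), controller.replay(TASKS)
```

Calling `run_demo()` returns the restored task order together with a list of tasks whose replayed outputs differ from the logged ones. An empty mismatch list indicates that, for this toy case, the offline run reproduced the logged behavior.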