A universal deep probabilistic decision-making framework for dynamic process modeling and control, referred to as Deep Probabilistic Decision Machines (DPDM), is presented. A predictive model enables generative simulation of likely future observation sequences under future or counterfactual conditions and action sequences, given the current process state. The action policy controller, also referred to as the decision-making controller, is then optimized on these predictive simulations. The optimal controller is designed to maximize the relevant key performance indicators (KPIs) by relying on predicted experiences of sensor and target observations for different candidate actions over the near future.
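The loop described above, simulating likely futures with a probabilistic model and scoring candidate action sequences against a KPI, can be sketched as follows. This is a minimal illustration, not the DPDM method itself: the linear-Gaussian toy model, the target operating point, the negative-squared-error KPI, and the random-shooting planner are all assumptions introduced here for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy process standing in for a learned probabilistic model:
# next_state = A @ state + B @ action + Gaussian noise.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.5]])
NOISE = 0.05
TARGET = np.array([1.0, 0.0])  # assumed desired operating point

def rollout(state, actions):
    """Sample one likely future observation sequence for an action sequence."""
    traj = []
    for a in actions:
        state = A @ state + B @ a + NOISE * rng.standard_normal(2)
        traj.append(state)
    return np.array(traj)

def expected_kpi(state, actions, n_samples=32):
    """Monte-Carlo estimate of the KPI over sampled future trajectories.
    KPI here (an assumption): negative mean squared distance to TARGET."""
    kpis = []
    for _ in range(n_samples):
        traj = rollout(state, actions)
        kpis.append(-np.mean(np.sum((traj - TARGET) ** 2, axis=1)))
    return float(np.mean(kpis))

def plan(state, horizon=5, n_candidates=64):
    """Random-shooting planner: score candidate action sequences on
    simulated experiences and return the best one with its score."""
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, 1))
    scores = [expected_kpi(state, c) for c in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

best_actions, best_score = plan(np.zeros(2))
print(best_actions.shape, round(best_score, 3))
```

In a full system, the hand-coded linear model would be replaced by a learned deep probabilistic model, and the random-shooting search by an optimized policy network trained on the simulated experiences.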