Patent attributes
Various embodiments of the present invention utilize systems, methods, and computer program products that perform predictive data analysis operations by using an agent machine learning model to determine an optimal clinical intervention. The optimal clinical intervention is determined based at least in part on a current clinical state and an inferred reinforcement learning policy, where the inferred reinforcement learning policy is determined based at least in part on a familiarity-adjusted reward function. The familiarity-adjusted reward function is generated by an environment machine learning framework based at least in part on one or more next state predictions that are generated for one or more pruned action-state combinations based at least in part on a historical clinical outcome database. The one or more pruned action-state combinations are determined based at least in part on one or more pruned clinical actions, which are selected from a plurality of candidate clinical actions based at least in part on one or more action pruning criteria.
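In one non-limiting illustration, the flow described above (action pruning, next state prediction, familiarity-adjusted reward, policy inference) may be sketched in tabular form as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the records in `HISTORY`, the frequency-based pruning criterion (`min_count`), the penalty term `penalty / sqrt(n)` used to adjust the reward for familiarity, and all function names are hypothetical, and value iteration over empirical counts stands in for the agent and environment machine learning models.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for a historical clinical outcome database:
# (clinical_state, clinical_action, next_state, outcome_reward)
HISTORY = [
    ("unstable", "drug_a", "stable", 1.0),
    ("unstable", "drug_a", "stable", 1.0),
    ("unstable", "drug_a", "unstable", 0.0),
    ("unstable", "drug_b", "stable", 1.0),
    ("stable", "observe", "stable", 1.0),
    ("stable", "observe", "stable", 1.0),
]

def prune_actions(history, min_count=2):
    """Assumed pruning criterion: keep state-action pairs seen at least min_count times."""
    counts = Counter((s, a) for s, a, _, _ in history)
    return {sa: n for sa, n in counts.items() if n >= min_count}

def build_environment(history, pruned, penalty=0.5):
    """Estimate next-state probabilities and an adjusted reward for each pruned pair."""
    next_counts = defaultdict(Counter)
    reward_sums = defaultdict(float)
    for s, a, s_next, r in history:
        if (s, a) in pruned:
            next_counts[(s, a)][s_next] += 1
            reward_sums[(s, a)] += r
    model = {}
    for sa, n in pruned.items():
        probs = {s2: c / n for s2, c in next_counts[sa].items()}
        # Assumed adjustment: rarely observed pairs receive a larger penalty.
        adjusted = reward_sums[sa] / n - penalty / math.sqrt(n)
        model[sa] = (probs, adjusted)
    return model

def q_value(model, values, sa, gamma):
    """Adjusted reward plus discounted expected value of the predicted next state."""
    probs, reward = model[sa]
    return reward + gamma * sum(p * values.get(s2, 0.0) for s2, p in probs.items())

def infer_policy(model, gamma=0.9, sweeps=100):
    """Run value iteration, then pick the highest-valued pruned action per state."""
    states = {s for s, _ in model}
    values = {}
    for _ in range(sweeps):
        for s in states:
            values[s] = max(q_value(model, values, sa, gamma)
                            for sa in model if sa[0] == s)
    return {
        s: max((sa for sa in model if sa[0] == s),
               key=lambda sa: q_value(model, values, sa, gamma))[1]
        for s in states
    }

pruned = prune_actions(HISTORY)
model = build_environment(HISTORY, pruned)
policy = infer_policy(model)
print(policy["unstable"])  # prints "drug_a"
```

In this sketch, `drug_b` is observed only once in the `unstable` state and is therefore pruned before any next state prediction is made for it, so the inferred policy selects only among actions that are adequately represented in the historical records.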