Patent attributes
Disclosed in some examples are methods, systems, and machine-readable mediums which determine jitter buffer delay by inputting jitter buffer and currently observed network status information to a machine learned model that is trained using a reinforcement learning (RL) method. The model maps these inputs to an action to compress, stretch, or hold the jitter buffer delay, which is used by a recipient computing device to optimize the jitter buffer delay. The model may be trained using a simulator that uses network traces of past real streaming sessions (e.g., communication sessions) of users. By training the model through reinforcement learning, the model learns to make better decisions through reinforcement in the form of reward signals that reflect the performance of each decision.