Patent attributes
An example system includes prediction workers, training workers, and a parameter server. The prediction workers store a local copy of a machine-learned model and run the mode exclusively in serving mode. The training workers store a local copy of a machine-learned model and a local snapshot and run the local copy exclusively in training mode and compare the local model or state to the snapshot after training to send delta updates to the parameter server after training. The parameter server aggregates received delta updates into a master copy of the model, sends the aggregated updates back to training workers and provides two types of updates; a real-time update based on a comparison of the master model with a local snapshot, and a full update. The real-time update occurs at least an order of magnitude more frequently than the full update and includes a subset of the weights in the model.