Patent attributes
A computer trains a neural network model. (A) Observation vectors are randomly selected from a plurality of observation vectors. (B) A forward and backward propagation of the neural network is executed to compute a gradient vector and a weight vector. (C) A search direction vector is computed. (D) A step size value is computed. (E) An updated weight vector is computed. (F) Based on a predefined progress check frequency value, second observation vectors are randomly selected, a progress check objective function value is computed given the weight vector, the step size value, the search direction vector, and the second observation vectors, and, based on an accuracy test, a mini-batch size value is updated. (G) (A) to (F) are repeated, with the computed updated weight vector serving as the weight vector for the next iteration, until a convergence parameter value indicates that training of the neural network is complete.
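
The loop in (A) to (G) can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not say how the search direction, the step size, or the accuracy test are computed, so the sketch assumes a steepest-descent direction, a fixed step size, and a sufficient-decrease test that doubles the mini-batch size when it fails. A one-layer linear model with squared error stands in for the neural network, and all names (`objective_and_gradient`, `train`, `check_freq`, `tol`, and so on) are hypothetical.

```python
import numpy as np


def objective_and_gradient(w, X, y):
    """Forward and backward pass for a one-layer linear model (stand-in network)."""
    residual = X @ w - y                     # forward pass
    loss = 0.5 * np.mean(residual ** 2)      # mean squared error objective
    grad = X.T @ residual / len(y)           # backward pass: gradient w.r.t. w
    return loss, grad


def train(X, y, batch_size=8, check_freq=5, step=0.1,
          tol=1e-6, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for it in range(1, max_iter + 1):
        # (A) randomly select a mini-batch of observation vectors
        idx = rng.choice(n, size=min(batch_size, n), replace=False)
        # (B) forward and backward propagation on the mini-batch
        _, grad = objective_and_gradient(w, X[idx], y[idx])
        # (C) search direction (assumption: steepest descent)
        d = -grad
        # (D) step size (assumption: fixed value)
        alpha = step
        # (E) updated weight vector
        w_new = w + alpha * d
        # (F) periodic progress check on a second random sample
        if it % check_freq == 0:
            idx2 = rng.choice(n, size=min(batch_size, n), replace=False)
            f_old, _ = objective_and_gradient(w, X[idx2], y[idx2])
            f_new, _ = objective_and_gradient(w + alpha * d, X[idx2], y[idx2])
            # Assumed accuracy test: require a sufficient decrease on the
            # second sample; otherwise double the mini-batch size.
            if f_new > f_old - 1e-4 * alpha * (grad @ grad):
                batch_size = min(2 * batch_size, n)
        # The updated weight vector becomes the next iteration's weight vector
        w = w_new
        # (G) assumed convergence parameter: mini-batch gradient norm
        if np.linalg.norm(grad) < tol:
            break
    return w, batch_size
```

Calling `train(X, y)` on a feature matrix `X` and target vector `y` returns the final weight vector together with the mini-batch size in effect when training stopped. Growing the batch only when the progress check fails is one plausible reading of "based on an accuracy test, the mini-batch size value is updated"; the abstract itself does not fix the rule.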