Patent attributes
Some embodiments provide a method for training parameters of a network. The method receives a network with layers of nodes. Each node of a set of the layers computes an output value based on a set of input values and a set of trained weight values. A first layer of the network includes a first number of filters. The method replaces the first layer with a second layer having a second number of filters that is less than the first number and a third layer, following the second layer, having the first number of filters. Each weight value in the filters of the second and third layers is restricted to a set of allowed quantized weight values. A total number of weight values in the filters of the second and third layers is less than a total number of weight values in the filters of the first layer.