Patent attributes
Some embodiments provide a method for a neural network inference circuit that executes a neural network including multiple computation nodes at multiple layers. Each computation node of a set of the computation nodes includes a dot product of input values and weight values. The method reads a set of encoded weight data for a set of weight values from a memory of the neural network inference circuit. The method decodes the encoded weight data to generate decoded weight data for the set of weight values. The method stores the decoded weight data in a buffer. The method uses the decoded weight data to execute a set of computation nodes. Each computation node of the set of computation nodes includes a dot product between the set of weight values and a different set of input values.