Patent attributes
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing topological scheduling on a machine-learning accelerator having an array of tiles. One of the methods includes performing, at each time step of a plurality of time steps corresponding respectively to columns within each of a plurality of wide columns of the tile array, operations comprising: performing respective multiplications using tiles in a respective tile column for the time step, computing a respective output result for each respective tile column for the time step including computing a sum of results of the multiplications for the tile column, and storing the respective output result for the tile column in a particular output RAM having a location within the same tile column and on a row from which the output result will be read by a subsequent layer of the model.
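The per-time-step operations described above can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the claimed hardware method: it assumes each tile holds a single scalar weight, that every tile row receives one input value, and that the row read by the subsequent layer is supplied as a lookup table. The names `schedule_layer`, `wide_width`, and `read_row` are hypothetical and do not appear in the source.

```python
from typing import Dict, List, Tuple


def schedule_layer(
    weights: List[List[float]],  # weights[row][col]: one scalar per tile (assumption)
    inputs: List[float],         # inputs[row]: one input value per tile row (assumption)
    wide_width: int,             # tile columns per wide column (hypothetical parameter)
    read_row: List[int],         # read_row[col]: row the next layer reads from (hypothetical)
) -> Dict[Tuple[int, int], float]:
    """Simulate one layer; returns output RAM contents keyed by (row, col)."""
    num_rows = len(weights)
    num_cols = len(weights[0])
    if num_cols % wide_width != 0:
        raise ValueError("tile columns must divide evenly into wide columns")
    num_wide = num_cols // wide_width

    output_ram: Dict[Tuple[int, int], float] = {}
    # One time step per column position inside a wide column.
    for t in range(wide_width):
        # Every wide column processes its t-th tile column during this step.
        for w in range(num_wide):
            col = w * wide_width + t
            # Each tile in the column performs one multiplication.
            products = [weights[r][col] * inputs[r] for r in range(num_rows)]
            # The column's output is the sum of its tiles' products, stored in
            # the same tile column on the row the next layer will read from.
            output_ram[(read_row[col], col)] = sum(products)
    return output_ram


example = schedule_layer(
    weights=[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    inputs=[1.0, 2.0],
    wide_width=2,
    read_row=[0, 1, 0, 1],
)
print(example)
```

In this toy run, time step 0 handles tile columns 0 and 2 and time step 1 handles tile columns 1 and 3. Each result is written to an output location in its own tile column, so in this model the subsequent layer can read it from the expected row without moving it to another column.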