Patent attributes
Systems, apparatuses, and methods for implementing a tiling algorithm for a matrix math instruction set are disclosed. A system includes at least a memory, a cache, a processor, and a plurality of compute units. The memory stores a plurality of matrix elements in a linear format, and the processor converts the plurality of matrix elements from the linear format to a tiling format. Each compute unit retrieves a plurality of matrix elements from the memory into the cache. Each compute unit includes a matrix operations unit which loads the plurality of matrix elements of corresponding tile(s) from the cache and performs a matrix operation on the plurality of matrix elements to generate a result in the tiling format. The system generates a classification of a first dataset based on results of the matrix operations.