US Patent 11972349 Flexible compute array utilization in a tensor processor

In one embodiment, a method for machine learning acceleration includes receiving instructions to perform convolution on an input tensor using a filter tensor, determining that the size of a first dimension of the input tensor is less than a processing capacity of each of multiple subarrays of computation units in a tensor processor, selecting a second dimension of the input tensor along which to perform the convolution, selecting, based on the second dimension, one or more dimensions of the filter tensor, generating (1) first instructions for reading, using vector read operations, activation elements in the input tensor organized such that elements with different values in the second dimension are stored contiguously in memory, and (2) second instructions for reading weights of the filter tensor along the selected one or more dimensions, and using the first and second instructions to provide the activation elements and the weights to the subarrays.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11972349 Flexible compute array utilization in a tensor processor

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11972349 Flexible compute array utilization in a tensor processor