US Patent 10896045 Architecture for dense operations in machine learning inference engine

A processing unit of an inference engine for machine learning (ML) includes a first, a second, and a third register, and a matrix multiplication block. The first register receives a first stream of data associated with a first matrix data that is read only once. The second register receives a second stream of data associated with a second matrix data that is read only once. The matrix multiplication block performs a multiplication operation based on data from the first register and the second register resulting in an output matrix. A row associated with the first matrix is maintained while rows associated with the second matrix is fed to the matrix multiplication block to perform a multiplication operation. The process is repeated for each row of the first matrix. The third register receives the output matrix from the matrix multiplication block and stores the output matrix.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 10896045 Architecture for dense operations in machine learning inference engine

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 10896045 Architecture for dense operations in machine learning inference engine