US Patent 11669473 Allreduce enhanced direct memory access functionality

Systems, apparatuses, and methods for performing an allreduce operation on an enhanced direct memory access (DMA) engine are disclosed. A system implements a machine learning application which includes a first kernel and a second kernel. The first kernel corresponds to a first portion of a machine learning model while the second kernel corresponds to a second portion of the machine learning model. The first kernel is invoked on a plurality of compute units and the second kernel is converted into commands executable by an enhanced DMA engine to perform a collective communication operation. The first kernel is executed on the plurality of compute units in parallel with the enhanced DMA engine executing the commands for performing the collective communication operation. As a result, the allreduce operation may be executed in parallel on the enhanced DMA engine to the compute units.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11669473 Allreduce enhanced direct memory access functionality

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11669473 Allreduce enhanced direct memory access functionality