US Patent 11556450 Hybrid data-model parallelism for efficient deep learning

The embodiments herein describe hybrid parallelism techniques where a mix of data and model parallelism techniques are used to split the workload of a layer across an array of processors. When configuring the array, the bandwidth of the processors in one direction may be greater than the bandwidth in the other direction. Each layer is characterized according to whether they are more feature heavy or weight heavy. Depending on this characterization, the workload of an NN layer can be assigned to the array using a hybrid parallelism technique rather than using solely the data parallelism technique or solely the model parallelism technique. For example, if an NN layer is more weight heavy than feature heavy, data parallelism is used in the direction with the greater bandwidth (to minimize the negative impact of weight reduction) while model parallelism is used in the direction with the smaller bandwidth.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 11556450 Hybrid data-model parallelism for efficient deep learning

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 11556450 Hybrid data-model parallelism for efficient deep learning