Patent 11521074 was granted by the United States Patent and Trademark Office in December 2022 and assigned to Carnegie Mellon University.
To improve the throughput and energy efficiency of Deep Neural Networks (DNNs) on customized hardware, lightweight neural networks (NNs) constrain each DNN weight to be a combination of a limited number, k, of powers of 2. In such networks, the multiply-accumulate operation can be replaced with a single shift operation, or with two shifts and an add operation. To provide even more design flexibility, k can be chosen separately for each convolutional filter instead of being fixed across all filters. The present invention formulates the selection of k to be differentiable and describes model training for determining k-based weights on a per-filter basis. The present invention can achieve higher speeds than lightweight NNs with only minimal accuracy degradation, while also achieving higher computational energy efficiency for ASIC implementation.
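The replacement of multiplication by shifts can be illustrated with a minimal sketch. It assumes a simple greedy rounding scheme for expressing a weight as a sum of at most k signed powers of 2; it is not the differentiable per-filter selection of k described in the patent. The function names `quantize_pow2` and `shift_multiply` are hypothetical and chosen only for illustration.

```python
import math

def quantize_pow2(w, k):
    """Greedily approximate w as a sum of at most k signed powers of 2.

    Illustrative assumption: each step rounds the remaining residual to the
    nearest power of 2 in the log domain. Returns a list of (sign, exponent).
    """
    terms = []
    residual = w
    for _ in range(k):
        if residual == 0:
            break
        sign = 1 if residual > 0 else -1
        exp = round(math.log2(abs(residual)))
        terms.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return terms

def shift_multiply(x, terms):
    """Multiply integer x by the quantized weight using only shifts and adds."""
    acc = 0
    for sign, exp in terms:
        # A non-negative exponent is a left shift; a negative one is a right shift.
        shifted = x << exp if exp >= 0 else x >> -exp
        acc += sign * shifted
    return acc

# k = 2: the weight 6 is represented exactly as 2^3 - 2^1 (two shifts and an add).
print(quantize_pow2(6, 2))                      # [(1, 3), (-1, 1)]
print(shift_multiply(5, quantize_pow2(6, 2)))   # 30

# k = 1: the weight 6 is approximated by 2^3 alone (a single shift).
print(shift_multiply(5, quantize_pow2(6, 1)))   # 40
```

With k = 1 the product costs a single shift but the weight is only approximated (8 instead of 6), while k = 2 costs two shifts and an add and here recovers the weight exactly. This is the accuracy-versus-cost trade-off that choosing k per filter is meant to exploit.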