In one embodiment, a device forms a neural network envelope cell that comprises a plurality of convolution-based filters arranged in series or in parallel. The device constructs a convolutional neural network by stacking copies of the envelope cell in series. Using a training dataset of images, the device trains the convolutional neural network to perform image classification by iteratively collecting a variance metric for each filter in each envelope cell, pruning filters with low variance metrics from the convolutional neural network, and appending a new copy of the envelope cell to the convolutional neural network.
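
The iterative collect, prune, and append loop described above can be illustrated with a minimal sketch. It is not the claimed implementation: it uses 1-D kernels on toy signals in place of image convolutions, omits the weight-training step entirely, and assumes that the variance metric is the population variance of each filter's output activations over the dataset. The names `ENVELOPE_CELL`, `convolve_same`, `collect_variances`, and `grow_and_prune`, as well as the pruning threshold, are hypothetical and chosen only for illustration.

```python
import copy
import statistics

# Hypothetical envelope cell: named 1-D kernels standing in for
# convolution-based filters that run in parallel inside the cell.
ENVELOPE_CELL = {
    "edge": [1.0, 0.0, -1.0],
    "identity": [1.0],
    "dead": [0.0, 0.0, 0.0],  # always outputs zero, so its variance is zero
}


def convolve_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * w for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def collect_variances(network, dataset):
    """Return one {filter_name: variance} dict per cell.

    Assumption: the variance metric is the population variance of a
    filter's outputs, pooled over every sample in the dataset.
    """
    metrics = []
    activations = [list(sample) for sample in dataset]
    for cell in network:
        # Apply every filter of this cell to the activations entering the cell.
        outputs = {
            name: [convolve_same(x, kernel) for x in activations]
            for name, kernel in cell.items()
        }
        metrics.append({
            name: statistics.pvariance([v for out in outs for v in out])
            for name, outs in outputs.items()
        })
        if outputs:
            # Cells are stacked in series: the element-wise mean of the
            # parallel filter outputs feeds the next cell.
            names = list(outputs)
            activations = [
                [sum(vals) / len(vals)
                 for vals in zip(*(outputs[n][i] for n in names))]
                for i in range(len(activations))
            ]
    return metrics


def grow_and_prune(network, dataset, threshold, rounds):
    """Repeat: collect variances, prune low-variance filters, append a cell."""
    for _ in range(rounds):
        # A real system would update filter weights here; omitted in this sketch.
        metrics = collect_variances(network, dataset)
        network = [
            {name: kernel for name, kernel in cell.items()
             if cell_metrics[name] >= threshold}
            for cell, cell_metrics in zip(network, metrics)
        ]
        network.append(copy.deepcopy(ENVELOPE_CELL))
    return network
```

Starting from a single cell and running two rounds on a small toy dataset, this sketch removes the zero-output filter from each cell that has been measured, while the most recently appended cell still contains every filter because its variances have not yet been collected.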