Computer-based systems and methods guide the learning of features in the middle layers of a deep neural network. The guidance can be provided by aligning sets of nodes, or entire layers, in the network being trained with sets of nodes in a reference system. This guidance helps the network being trained learn the features learned by the reference system more efficiently, using fewer parameters and with faster training. The guidance also enables training of a new system with a deeper network, i.e., one with more layers; deeper networks tend to perform better than shallow networks. Also, with fewer parameters, the new network has less tendency to overfit the training data.
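
One way such alignment could be expressed is as an extra penalty added to the training objective. The following is a minimal sketch only, not the method of the disclosure: the squared-difference penalty, the names `alignment_penalty`, `guided_loss`, and `node_pairs`, and the `weight` parameter are all assumptions introduced for illustration.

```python
from typing import Sequence, Tuple


def alignment_penalty(
    student_acts: Sequence[float],
    reference_acts: Sequence[float],
    node_pairs: Sequence[Tuple[int, int]],
) -> float:
    """Mean squared difference between aligned node activations.

    Each entry of node_pairs is (student_index, reference_index), pairing a
    middle-layer node in the network being trained with a node in the
    reference system. The squared-difference form is an assumption.
    """
    if not node_pairs:
        return 0.0
    total = 0.0
    for s_idx, r_idx in node_pairs:
        diff = student_acts[s_idx] - reference_acts[r_idx]
        total += diff * diff
    return total / len(node_pairs)


def guided_loss(
    task_loss: float,
    student_acts: Sequence[float],
    reference_acts: Sequence[float],
    node_pairs: Sequence[Tuple[int, int]],
    weight: float = 0.1,  # hypothetical trade-off between task and guidance
) -> float:
    """Task loss plus a weighted alignment penalty on the paired nodes."""
    return task_loss + weight * alignment_penalty(
        student_acts, reference_acts, node_pairs
    )


if __name__ == "__main__":
    # Three middle-layer nodes in the new network, four in the reference.
    student = [0.5, 1.0, -0.25]
    reference = [0.0, 1.0, 0.75, 2.0]
    pairs = [(0, 0), (2, 2)]  # only a subset of nodes is aligned
    print(guided_loss(1.0, student, reference, pairs, weight=0.5))
```

In this sketch the reference system may have more nodes than the network being trained, since only the paired nodes contribute to the penalty; in practice the penalty would be computed per training example and minimized together with the task loss.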