Techniques are described in which a decoder is configured to inverse quantize a first coefficient block and to apply a first inverse transform, which is a non-separable transform, to at least part of the inverse-quantized first coefficient block to generate a second coefficient block. The decoder is further configured to apply a second inverse transform to the second coefficient block to generate a residual video block, the second inverse transform converting the second coefficient block from a frequency domain to a pixel domain. The decoder is further configured to form a decoded video block by summing the residual video block with one or more predictive blocks.
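
The decoding order described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the description: the uniform quantization step, the toy 4x4 kernel (a permutation acting on the flattened top-left 2x2 sub-block), the use of an orthonormal inverse DCT as the second inverse transform, and all function names.

```python
import math

def inverse_quantize(levels, step):
    # Assumed uniform scalar dequantization: coefficient = level * step.
    return [[v * step for v in row] for row in levels]

def inverse_nonseparable(block, kernel, size):
    # First inverse transform: flatten the top-left size x size sub-block,
    # multiply by the transpose of an (assumed orthonormal) kernel, and
    # write the result back. Coefficients outside the sub-block are kept.
    n = size * size
    vec = [block[r][c] for r in range(size) for c in range(size)]
    out = [sum(kernel[k][i] * vec[k] for k in range(n)) for i in range(n)]
    result = [row[:] for row in block]
    for i, v in enumerate(out):
        result[i // size][i % size] = v
    return result

def dct_matrix(n):
    # Orthonormal DCT-II basis; row k is the k-th frequency.
    return [[(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(r) for r in zip(*m)]

def inverse_dct_2d(coeffs):
    # Second inverse transform (assumed separable inverse DCT on a square
    # block): frequency domain -> pixel domain, X = C^T * Y * C.
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def decode_block(levels, step, kernel, sub_size, prediction):
    first = inverse_quantize(levels, step)                   # first coefficient block
    second = inverse_nonseparable(first, kernel, sub_size)   # second coefficient block
    residual = inverse_dct_2d(second)                        # residual video block
    # Decoded block = prediction + residual (residual rounded to integers).
    return [[p + round(r) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]

# Hypothetical toy kernel: swaps the first two entries of the flattened
# 2x2 sub-block, which cannot be written as a row pass plus a column pass.
SWAP_KERNEL = [[0, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1]]

levels = [[0, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
prediction = [[100] * 4 for _ in range(4)]
decoded = decode_block(levels, step=2, kernel=SWAP_KERNEL, sub_size=2,
                       prediction=prediction)
```

In this example the non-separable stage moves the single non-zero coefficient into the DC position, so the inverse DCT yields a flat residual that is added to every predicted sample. A practical decoder would select the kernel and the sub-block size from signaled or derived information rather than hard-coding them.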