An apparatus for video decoding includes processing circuitry. The processing circuitry can be configured to receive a coding block having a width of W pixels and a height of H pixels, and to partition the coding block into sub processing units (SPUs), each having a width equal to the lesser of W and K pixels and a height equal to the lesser of H and K pixels, where K is the dimension of a virtual pipeline data unit (VPDU) having an area of K×K pixels. Each SPU is in turn partitioned into transform units, with each transform unit having a maximum allowable transform unit size of M pixels.
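
The two-level partitioning described above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function names `tile` and `partition_coding_block`, the raster-scan ordering, the default values K = 64 and M = 32, and the reading of M as an upper bound on both the width and the height of a transform unit are all assumptions made for illustration.

```python
from typing import List, Tuple

# A rectangle is (x, y, width, height) in pixels, relative to the coding block.
Rect = Tuple[int, int, int, int]


def tile(x0: int, y0: int, width: int, height: int,
         tile_w: int, tile_h: int) -> List[Rect]:
    """Split a rectangle into tiles of at most tile_w x tile_h pixels.

    Tiles are listed in raster order (an assumption; the source text does
    not state a processing order).
    """
    return [
        (x, y, min(tile_w, x0 + width - x), min(tile_h, y0 + height - y))
        for y in range(y0, y0 + height, tile_h)
        for x in range(x0, x0 + width, tile_w)
    ]


def partition_coding_block(W: int, H: int, K: int = 64,
                           M: int = 32) -> List[Tuple[Rect, List[Rect]]]:
    """Return a list of (SPU, transform units inside that SPU).

    K and M defaults are illustrative values, not values from the source.
    """
    # Each SPU is min(W, K) pixels wide and min(H, K) pixels high.
    spu_w, spu_h = min(W, K), min(H, K)
    result = []
    for spu in tile(0, 0, W, H, spu_w, spu_h):
        sx, sy, sw, sh = spu
        # Assumed interpretation: a transform unit may not exceed M pixels
        # in either dimension, so its size is clamped to M.
        tus = tile(sx, sy, sw, sh, min(sw, M), min(sh, M))
        result.append((spu, tus))
    return result


if __name__ == "__main__":
    for spu, tus in partition_coding_block(128, 32):
        print("SPU", spu, "-> TUs", tus)
```

Under these assumptions, a 128×32 coding block is split into two 64×32 SPUs, and each SPU into two 32×32 transform units, so that no transform unit crosses a K×K VPDU boundary.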