Commercial data processors are available that include a capability of extending their instruction set for a specified application, i.e. of introducing customized functional units in the interest of enhanced processing performance. For such processors there is a need for automatically forming the extensions from high-level application code. A technique is described for selecting maximal-speedup convex subgraphs of the application dataflow graph under micro-architectural constraints.