Systems, methods, devices, instructions, and other examples are described for natural language processing. One example includes accessing general encoder data for natural language processing, where the general encoder data is generated from a general-domain dataset that is not domain-specific. A domain-specific dataset is accessed, and filtered encoder data is generated using a subset of the general encoder data. The filtered encoder data is trained using the domain-specific dataset to generate distilled encoder data, and tuning values for the distilled encoder data are generated to configure task outputs associated with the domain-specific dataset.
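
The sequence of operations above (access general encoder data, filter to a subset, train on the domain-specific dataset, then generate tuning values) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the described examples: the element-wise "layers", the choice of which layers to keep, the use of the general encoder as the training target, the linear task head, and all names and values (`general_encoder`, `domain_inputs`, `domain_labels`, `filter_encoder`, `distill`, `tune`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Layer = List[float]  # toy layer: one multiplicative weight per input feature

def encode(layers: Sequence[Layer], x: Sequence[float]) -> List[float]:
    """Toy encoder: each layer rescales the features element-wise."""
    h = list(x)
    for w in layers:
        h = [wi * hi for wi, hi in zip(w, h)]
    return h

def filter_encoder(general: Sequence[Layer], keep: Sequence[int]) -> List[Layer]:
    """Build filtered encoder data from a subset of the general layers."""
    return [list(general[i]) for i in keep]

def distill_loss(student, teacher, data) -> float:
    """Squared difference between student and teacher encodings."""
    total = 0.0
    for x in data:
        s, t = encode(student, x), encode(teacher, x)
        total += sum((si - ti) ** 2 for si, ti in zip(s, t))
    return total

def distill(student, teacher, data, lr=0.05, epochs=200) -> List[Layer]:
    """Train the filtered layers on domain inputs (assumed objective:
    match the general encoder's outputs) by plain gradient descent."""
    student = [list(w) for w in student]
    for _ in range(epochs):
        for x in data:
            s, t = encode(student, x), encode(teacher, x)
            for j, xj in enumerate(x):
                err = s[j] - t[j]
                grads = []
                for l in range(len(student)):
                    others = 1.0  # product of the other layers' weights
                    for m, w in enumerate(student):
                        if m != l:
                            others *= w[j]
                    grads.append(2.0 * err * xj * others)
                for l, g in enumerate(grads):
                    student[l][j] -= lr * g
    return student

def tune(encoder, data, labels, lr=0.05, epochs=200) -> Tuple[List[float], float]:
    """Generate tuning values (a linear task head) with the encoder frozen."""
    weights, bias = [0.0] * len(data[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            h = encode(encoder, x)
            err = sum(w * hi for w, hi in zip(weights, h)) + bias - y
            weights = [w - lr * 2.0 * err * hi for w, hi in zip(weights, h)]
            bias -= lr * 2.0 * err
    return weights, bias

# Hypothetical general encoder data (four layers, two features).
general_encoder = [[1.1, 0.9], [0.9, 1.1], [1.2, 0.8], [0.8, 1.2]]
# Hypothetical domain-specific dataset: inputs plus task labels.
domain_inputs = [[1.0, 0.5], [0.5, 1.0], [1.5, 1.0]]
domain_labels = [1.0, 0.0, 1.0]

filtered = filter_encoder(general_encoder, keep=[0, 2])      # subset of layers
distilled = distill(filtered, general_encoder, domain_inputs)  # domain training
head_weights, head_bias = tune(distilled, domain_inputs, domain_labels)

print("loss before:", distill_loss(filtered, general_encoder, domain_inputs))
print("loss after: ", distill_loss(distilled, general_encoder, domain_inputs))
```

In this sketch the filtered encoder starts from copied general weights rather than random values, so the domain-specific training only has to compensate for the removed layers; whether the described examples initialize or train the filtered encoder data this way is not stated in the text.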