Patent attributes
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for applying transforms during machine learning pre-processing. In some implementations, an instruction to create a model is obtained. A determination is made as to whether the instruction specifies a transform. In response to determining that the instruction specifies a transform, a determination is made as to whether the transform requires statistics on the training data. The training data is accessed. In response to determining that the transform requires statistics on the training data, transformed training data is generated from both the training data and the statistics. A model is generated with the transformed training data. A representation of the transform and the statistics is stored as metadata for the model.
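
The flow described above can be illustrated with a minimal sketch. It is not the claimed implementation: the instruction format (a plain dict), the `NEEDS_STATS` registry, the transform names `standardize` and `log1p`, and the one-feature least-squares "model" are all hypothetical and chosen only to keep the example self-contained. The sketch shows the two determinations (is a transform specified, and does it need training-data statistics), the generation of transformed training data, and the storage of the transform and its statistics as model metadata so the same transform can be replayed at prediction time.

```python
import math
from dataclasses import dataclass, field
from statistics import mean, pstdev

# Hypothetical registry: transform name -> does it need training-data statistics?
NEEDS_STATS = {"standardize": True, "log1p": False}


def compute_stats(values):
    """Collect the statistics a stats-dependent transform needs."""
    return {"mean": mean(values), "std": pstdev(values)}


def apply_transform(name, values, stats=None):
    """Apply a named transform, using stored statistics when required."""
    if name == "standardize":
        std = stats["std"] or 1.0  # guard against a constant column
        return [(v - stats["mean"]) / std for v in values]
    if name == "log1p":
        return [math.log1p(v) for v in values]
    raise ValueError(f"unknown transform: {name}")


def fit_line(xs, ys):
    """Ordinary least squares for one feature (stand-in for model training)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


@dataclass
class Model:
    slope: float
    intercept: float
    metadata: dict = field(default_factory=dict)

    def predict(self, raw_x):
        # Replay the stored transform with the stored training statistics.
        name = self.metadata.get("transform")
        x = raw_x
        if name is not None:
            x = apply_transform(name, [raw_x], self.metadata.get("statistics"))[0]
        return self.slope * x + self.intercept


def create_model(instruction, xs, ys):
    """Handle a (hypothetical) create-model instruction end to end."""
    transform = instruction.get("transform")  # does it specify a transform?
    stats = None
    if transform is not None:
        if NEEDS_STATS[transform]:  # does the transform need statistics?
            stats = compute_stats(xs)
        xs = apply_transform(transform, xs, stats)
    slope, intercept = fit_line(xs, ys)
    # Store a representation of the transform and statistics with the model.
    return Model(slope, intercept, {"transform": transform, "statistics": stats})
```

As a usage example, `create_model({"transform": "standardize"}, [1.0, 2.0, 3.0], [2.0, 4.0, 6.0])` returns a model whose metadata records the transform name and the training mean and standard deviation. Calling `predict` on a raw input then applies the same standardization before scoring, which is the reason for keeping the statistics alongside the model rather than recomputing them from prediction inputs.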