Performing an operation comprising: transforming an input dataset to a predefined format; extracting, from the transformed dataset, a plurality of features describing the transformed dataset; and generating, by a machine learning (ML) algorithm executing on a processor and based on an ML model, a plurality of rules for modifying the transformed dataset to conform with a first data model.
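
The three steps of the operation (transform, extract features, generate rules) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the source: the predefined format is taken to be a column-oriented table of strings, the features are two per-column ratios, the "ML model" is a stand-in nearest-centroid classifier with hand-set centroids rather than a trained model, and the names TARGET_MODEL, Rule, transform, extract_features, predict_type, and generate_rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical "first data model": column name -> expected type.
TARGET_MODEL = {"age": "int", "name": "str"}

# Stand-in for a trained ML model (assumption): class centroids in the
# feature space (numeric_ratio, missing_ratio).
CENTROIDS = {"int": (1.0, 0.0), "str": (0.0, 0.0), "sparse": (0.0, 1.0)}


@dataclass(frozen=True)
class Rule:
    """One modification to apply to a column of the transformed dataset."""
    column: str
    action: str


def transform(records: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Step 1: convert row-oriented input to the assumed predefined format,
    a column-oriented dict of stripped strings ('' marks a missing value)."""
    columns = sorted({key for rec in records for key in rec})
    return {
        col: ["" if rec.get(col) is None else str(rec[col]).strip()
              for rec in records]
        for col in columns
    }


def extract_features(table: dict[str, list[str]]) -> dict[str, tuple[float, float]]:
    """Step 2: describe each column by (numeric_ratio, missing_ratio)."""
    features = {}
    for col, values in table.items():
        n = len(values) or 1  # avoid division by zero on empty columns
        numeric = sum(v.lstrip("-").isdigit() for v in values)
        missing = sum(v == "" for v in values)
        features[col] = (numeric / n, missing / n)
    return features


def predict_type(feature: tuple[float, float]) -> str:
    """Nearest-centroid prediction of a column's type label."""
    return min(
        CENTROIDS,
        key=lambda label: sum((a - b) ** 2
                              for a, b in zip(feature, CENTROIDS[label])),
    )


def generate_rules(features: dict[str, tuple[float, float]],
                   target_model: dict[str, str]) -> list[Rule]:
    """Step 3: emit rules that would bring the table in line with the model."""
    rules = []
    for col, feat in features.items():
        predicted = predict_type(feat)
        if col not in target_model:
            rules.append(Rule(col, "drop"))
        elif predicted != target_model[col]:
            rules.append(Rule(col, "flag_for_review"))
        elif predicted == "int":
            # Values are stored as strings in the predefined format.
            rules.append(Rule(col, "cast_to_int"))
    return rules


records = [
    {"age": 31, "name": "Ada", "tmp": None},
    {"age": "42", "name": "Bob", "tmp": None},
]
table = transform(records)
rules = generate_rules(extract_features(table), TARGET_MODEL)
print(rules)
```

In this sketch the rules are only generated, not applied; applying them to the transformed dataset would be a separate step. A real system would replace the hand-set centroids with a model trained on labeled examples.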