Patent 11295229 was granted by the United States Patent and Trademark Office and assigned to Amazon in April 2022.
An approximate count of a subset of records of a data set is obtained using one or more transformation functions. The subset comprises records which contain a first value of one input variable, a second value of another input variable, and a particular value of a target variable. Using the approximate count, an approximate correlation metric for a multidimensional feature and the target variable is obtained. Based on the correlation metric, the multidimensional feature is included in a candidate feature set to be used to train a machine learning model.
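The flow described above can be illustrated with a minimal sketch. The abstract does not name the transformation functions or the correlation metric, so the choices here are assumptions made purely for illustration: a count-min sketch (salted hash functions) stands in for the transformation functions, and mutual information stands in for the correlation metric. The names `CountMinSketch`, `approx_mutual_information`, `THRESHOLD`, and the feature labels `x1` and `x2` are hypothetical and do not come from the patent.

```python
import hashlib
import math


class CountMinSketch:
    """Fixed-size table of counters; estimates never undercount a key."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            repr(key).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += 1

    def estimate(self, key):
        # The smallest counter across rows is the least inflated by collisions.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


def approx_mutual_information(sketch, pair_values, target_values):
    """Approximate mutual information (in nats) between the combined
    feature (first input, second input) and the target, using only
    approximate counts read from the sketch."""
    joint = {
        (pair, t): sketch.estimate((pair[0], pair[1], t))
        for pair in pair_values
        for t in target_values
    }
    total = sum(joint.values())
    pair_totals = {p: sum(joint[(p, t)] for t in target_values) for p in pair_values}
    target_totals = {t: sum(joint[(p, t)] for p in pair_values) for t in target_values}
    mi = 0.0
    for (p, t), count in joint.items():
        if count > 0:
            mi += (count / total) * math.log(
                count * total / (pair_totals[p] * target_totals[t])
            )
    return mi


# Toy data set (assumption): the target is the XOR of the two inputs, so
# neither input alone predicts it, but the combined feature does.
records = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)] * 25

sketch = CountMinSketch()
for record in records:
    sketch.add(record)

pair_values = sorted({(a, b) for a, b, _ in records})
target_values = sorted({y for _, _, y in records})
mi = approx_mutual_information(sketch, pair_values, target_values)

THRESHOLD = 0.1  # hypothetical cut-off for keeping a candidate feature
candidate_features = []
if mi >= THRESHOLD:
    candidate_features.append(("x1", "x2"))
```

On this toy data, and provided no hash collisions inflate the counts, the metric works out to ln 2 (about 0.693 nats), so the combined feature clears the hypothetical threshold and is added to the candidate set. A real system would likely process many feature combinations over a data set too large for exact counting, which is the case where approximate counts become worthwhile.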