Patent attributes
Methods, systems, and devices for automated feature selection and model generation are described. A device (e.g., a server, user device, or database) may perform model generation for an underlying dataset and a specified outcome variable. The device may determine relevance measurements (e.g., stump R-squared values) for a set of identified features of the dataset and may reduce the set of features based on these relevance measurements (e.g., according to a double-box procedure). Using this reduced set of features, the device may perform a least absolute shrinkage and selection operator (LASSO) regression procedure to sort the features. The device may then determine a set of nested linear models, where each successive model of the set includes an additional feature of the sorted features, and may select a “best” linear model for model generation based on this set of models and a model quality criterion (e.g., an Akaike information criterion (AIC)).
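
By way of illustration only, the sequence of steps described above may be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the description: a plain R-squared threshold (`r2_min`) stands in for the double-box procedure, whose details are not given here; features are sorted by the order in which they enter a LASSO path computed by coordinate descent; the AIC is computed in its Gaussian form up to an additive constant; and the data are synthetic. All function and parameter names are hypothetical.

```python
import math
import random

def standardize(col):
    m = sum(col) / len(col)
    s = math.sqrt(sum((c - m) ** 2 for c in col) / len(col))
    return [(c - m) / s for c in col]

def stump_r2(x, y):
    """R^2 of the best single-split regression stump of y on x."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    tot = sum(v for _, v in pairs)
    tot_sq = sum(v * v for _, v in pairs)
    sst = tot_sq - tot * tot / n
    best, s, sq = sst, 0.0, 0.0
    for i in range(1, n):
        v = pairs[i - 1][1]
        s, sq = s + v, sq + v * v
        if pairs[i - 1][0] < pairs[i][0]:  # split only between distinct values
            sse = (sq - s * s / i) + (tot_sq - sq - (tot - s) ** 2 / (n - i))
            best = min(best, sse)
    return 1.0 - best / sst

def lasso_cd(cols, y, lam, beta, sweeps=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1, standardized cols."""
    n = len(y)
    resid = [yi - sum(b * c[i] for b, c in zip(beta, cols)) for i, yi in enumerate(y)]
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + beta[j]
            new = math.copysign(max(abs(rho) - lam, 0.0), rho)  # soft threshold
            if new != beta[j]:
                resid = [ri - ci * (new - beta[j]) for ci, ri in zip(c, resid)]
                beta[j] = new
    return beta

def lasso_order(cols, y, n_lams=30, ratio=0.8):
    """Positions of cols in order of entry along a decreasing-penalty path."""
    n, ybar = len(y), sum(y) / len(y)
    yc = [v - ybar for v in y]
    z = [standardize(c) for c in cols]
    lam = max(abs(sum(a * b for a, b in zip(c, yc))) / n for c in z)
    beta, order = [0.0] * len(z), []
    for _ in range(n_lams):
        lam *= ratio
        beta = lasso_cd(z, yc, lam, beta)  # warm start from the previous penalty
        new = [j for j in range(len(z)) if beta[j] != 0.0 and j not in order]
        order += sorted(new, key=lambda j: -abs(beta[j]))
    return order + [j for j in range(len(z)) if j not in order]

def ols_rss(cols, y):
    """Residual sum of squares of an OLS fit of y on cols plus an intercept."""
    design = [[1.0] * len(y)] + cols
    k = len(design)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(p * q for p, q in zip(design[r], design[c])) for c in range(k)]
         + [sum(p * q for p, q in zip(design[r], y))] for r in range(k)]
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * d[i] for b, d in zip(coef, design))) ** 2
               for i, yi in enumerate(y))

def select_model(features, y, r2_min=0.05):
    r2 = [stump_r2(f, y) for f in features]
    kept = [j for j, r in enumerate(r2) if r >= r2_min]  # stand-in for double-box
    if not kept:
        raise ValueError("no feature passed the relevance screen")
    order = [kept[j] for j in lasso_order([features[j] for j in kept], y)]
    n = len(y)
    # One nested model per prefix of the sorted features; AIC up to a constant.
    aics = [n * math.log(ols_rss([features[j] for j in order[:m]], y) / n) + 2 * (m + 1)
            for m in range(1, len(order) + 1)]
    best = min(range(len(aics)), key=aics.__getitem__) + 1
    return r2, order, order[:best], aics

# Synthetic data (an assumption): features 0 and 1 drive y, feature 2 is noise.
random.seed(0)
n_obs = 200
x = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(3)]
y = [3 * a + 2 * b + random.gauss(0, 0.1) for a, b in zip(x[0], x[1])]
r2, order, selected, aics = select_model(x, y)
print("stump R^2:", r2, "sorted:", order, "selected:", selected)
```

In this sketch the screening step is deliberately cheap, since each stump R-squared value needs only one sort and one pass per feature, so the more expensive LASSO and nested-model fits run on the reduced set only. Other orderings (for example, by coefficient magnitude at a single penalty) or other quality criteria could be substituted without changing the overall flow.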