Patent attributes
A database including various datasets and metadata associated with each respective dataset is provided. These datasets were used to train predictive models. The database stores a performance value associated with the model trained with each dataset. When provided with a new dataset, a server can determine various metadata for the new dataset. Using the metadata, the server can search the database and retrieve datasets which have similar metadata values. The server can narrow the search based on the performance value associated with the dataset. Based on the retrieved datasets, the server can recommend at least one sampling technique. The sampling technique can be determined based on the one or more sampling techniques that were used in association with the retrieved datasets.