U.S. Patent 11200514 was granted by the United States Patent and Trademark Office in December 2021 and assigned to SAS.
Unclassified observations are classified. Similarity values are computed for each unclassified observation and for each target variable value. A confidence value is computed for each unclassified observation using the similarity values. A high-confidence threshold value and a low-confidence threshold value are computed from the confidence values. For each observation, when the confidence value is greater than the high-confidence threshold value, the observation is added to a training dataset and, when the confidence value is greater than the low-confidence threshold value and less than the high-confidence threshold value, the observation is added to the training dataset based on a comparison between a random value drawn from a uniform distribution and an inclusion percentage value. A classification model is trained with the training dataset and classified observations. The trained classification model is executed with the unclassified observations to determine a label assignment.
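The selection step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the confidence or the thresholds are derived, so `confidence_from_similarities` (largest similarity share) and `confidence_thresholds` (quantiles of the confidence values) are hypothetical choices, as are all function and parameter names. The direction of the random comparison (include when the draw is below the inclusion percentage) is also an assumption.

```python
import random


def confidence_from_similarities(similarities):
    """Assumed rule: confidence is the largest similarity's share of the total.

    `similarities` holds one similarity value per target variable value.
    """
    total = sum(similarities)
    return max(similarities) / total if total > 0 else 0.0


def confidence_thresholds(confidences, high_q=0.75, low_q=0.25):
    """Assumed rule: thresholds are nearest-rank quantiles of the confidences."""
    ordered = sorted(confidences)

    def quantile(q):
        idx = int(round(q * (len(ordered) - 1)))
        return ordered[min(len(ordered) - 1, max(0, idx))]

    return quantile(high_q), quantile(low_q)


def select_observations(confidences, high, low, inclusion_pct, rng):
    """Return indices of unclassified observations to add to the training set."""
    selected = []
    for i, c in enumerate(confidences):
        if c > high:
            # High confidence: always added.
            selected.append(i)
        elif low < c < high:
            # Middle band: added only if a uniform draw falls below the
            # inclusion percentage (expressed here as a fraction in [0, 1]).
            if rng.random() < inclusion_pct:
                selected.append(i)
        # Otherwise (at or below the low threshold, or exactly equal to the
        # high threshold, which the abstract does not cover): not added.
    return selected


if __name__ == "__main__":
    rng = random.Random(0)
    conf = [0.1, 0.2, 0.5, 0.6, 0.9, 0.95]
    print(select_observations(conf, high=0.8, low=0.3, inclusion_pct=0.5, rng=rng))
```

Under these assumptions, the selected observations (with their tentative labels) would be combined with the classified observations to train the classification model, which is then run on the unclassified observations to produce the final label assignment.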