Patent attributes
Data is classified using corrected semi-supervised data. Cluster centers are defined for unclassified observations, and a class is determined for each cluster. A distance value is computed between a classified observation and each cluster center. When the class of the classified observation is not the class determined for the cluster center having the minimum distance, a first distance value is selected as that minimum distance, and a second distance value is selected as the distance value computed to the cluster center having the class of the classified observation. A ratio value is computed between the second distance value and the first distance value. When the computed ratio value satisfies a label correction threshold, the class of the classified observation is changed to the class determined for the cluster center having the minimum distance value. A classification matrix is defined using the corrected observations to determine the class for the unclassified observations.
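
The label correction step described above can be illustrated with a minimal sketch. It is not the claimed implementation, and several details are assumptions the abstract does not specify: Euclidean distance is used as the distance value, the ratio is taken as the second distance divided by the first, "satisfies" is read as "greater than or equal to" the threshold, and when several cluster centers share the observation's class the nearest of them supplies the second distance. The names `euclidean` and `correct_labels` are hypothetical, and the cluster centers and their classes are taken as given inputs rather than computed.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance (assumed metric; the abstract names none)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correct_labels(
    points: Sequence[Sequence[float]],
    labels: Sequence[str],
    centers: Sequence[Sequence[float]],
    center_classes: Sequence[str],
    threshold: float,
) -> List[str]:
    """Return a copy of `labels` with suspect labels replaced."""
    corrected = list(labels)
    for i, (point, label) in enumerate(zip(points, labels)):
        # Distance from the classified observation to every cluster center.
        dists = [euclidean(point, c) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        if center_classes[nearest] == label:
            continue  # label agrees with the nearest cluster: keep it
        # Distances to centers that carry the observation's own class.
        own = [d for d, cls in zip(dists, center_classes) if cls == label]
        if not own:
            continue  # assumption: no comparison possible, leave unchanged
        first = dists[nearest]   # minimum distance
        second = min(own)        # distance to the own-class center
        # Guard against division by zero when the point sits on a center.
        ratio = math.inf if first == 0.0 else second / first
        if ratio >= threshold:   # assumed reading of "satisfies"
            corrected[i] = center_classes[nearest]
    return corrected


centers = [(0.0, 0.0), (10.0, 0.0)]
center_classes = ["a", "b"]
points = [(1.0, 0.0), (4.0, 0.0), (9.0, 0.0)]
labels = ["b", "b", "b"]
result = correct_labels(points, labels, centers, center_classes, threshold=2.0)
print(result)  # ['a', 'b', 'b']
```

In this toy run, the first observation is nine times farther from its own-class center than from the nearest center, so its label is changed. The second observation has a ratio of 1.5, below the assumed threshold of 2.0, and keeps its label. The third already agrees with its nearest cluster. The final step of building a classification matrix from the corrected observations is not sketched here because the abstract gives no detail about it.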