Patent attributes
A system, method, and computer program include a communications interface configured to receive a set of industry reports from multiple industry sources, and circuitry configured to compare one or more attributes of at least two trade lines to identify whether the at least two trade lines are duplicates. The circuitry characterizes, as a binary indication, whether the comparison indicates that the one or more attributes match, displays a representation of the binary indication, and receives a user-identified indication of whether the at least two trade lines are duplicates. The circuitry trains a classifier, records the indication of whether the at least two trade lines are duplicates, removes at least one of the at least two trade lines from the set of industry reports, and runs the classifier. Subsequently, a supervised machine learning classifier is trained to fit the training data and is evaluated for accuracy on the testing data.
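The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The attribute names, the sample trade lines, the labelled pairs standing in for user-identified indications, and the choice of a perceptron as the supervised classifier are all assumptions made for illustration; the abstract names none of them.

```python
# Hypothetical attribute names; the source does not say which attributes are compared.
ATTRIBUTES = ("creditor", "account_number", "open_date", "balance")


def match_vector(a: dict, b: dict) -> list[int]:
    """Binary indication per attribute: 1 if both trade lines agree, else 0."""
    return [int(a.get(key) == b.get(key)) for key in ATTRIBUTES]


def predict(model, x) -> int:
    """Return 1 (duplicate) if the weighted sum of match bits is positive."""
    weights, bias = model
    return int(sum(w * xi for w, xi in zip(weights, x)) + bias > 0)


def train_perceptron(X, y, epochs: int = 100):
    """Fit a perceptron, used here as a stand-in for any supervised classifier."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(X, y):
            error = target - predict(model, x)
            if error:
                weights, bias = model
                model = ([w + error * xi for w, xi in zip(weights, x)], bias + error)
                mistakes += 1
        if mistakes == 0:  # every training pair is classified correctly
            break
    return model


def accuracy(model, X, y) -> float:
    """Fraction of pairs whose predicted label equals the recorded label."""
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)


def deduplicate(trade_lines, model):
    """Keep a trade line only if the classifier finds no duplicate among those kept."""
    kept = []
    for line in trade_lines:
        if not any(predict(model, match_vector(line, k)) for k in kept):
            kept.append(line)
    return kept


# Assumed user-identified indications (1 = duplicate) for displayed match vectors.
train_X = [[1, 1, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0], [1, 0, 1, 1],
           [0, 1, 1, 1], [0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 1]]
train_y = [1, 1, 1, 0, 0, 0, 0, 0]
test_X = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 1, 0]]
test_y = [1, 0, 0, 0]

model = train_perceptron(train_X, train_y)
train_accuracy = accuracy(model, train_X, train_y)
test_accuracy = accuracy(model, test_X, test_y)

# Hypothetical trade lines drawn from two industry reports.
a = {"creditor": "Acme Bank", "account_number": "1234",
     "open_date": "2015-03-01", "balance": 500}
a_copy = dict(a)
c = {"creditor": "Zenith Credit", "account_number": "9876",
     "open_date": "2018-07-15", "balance": 1200}

cleaned = deduplicate([a, a_copy, c], model)
print(train_accuracy, test_accuracy, len(cleaned))
```

Representing each pair of trade lines as a vector of per-attribute match bits keeps the classifier input the same as what the user sees when labelling. Any supervised classifier that accepts binary features could replace the perceptron in this sketch.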