Patent 10445062 was granted and assigned to Oracle in October 2019 by the United States Patent and Trademark Office.
The present disclosure relates to techniques for analyzing data from multiple data sources to determine how similar the datasets are. Knowing how similar two datasets are can guide downstream processing of those datasets for different uses. A graphical interface may be provided to display detailed results, including a similarity prediction, a data similarity prediction, a column order similarity prediction, a document type similarity prediction, a prediction of overlapping or related columns, and an orphaned column prediction (e.g., a left orphaned column or a right orphaned column). Detecting similarities can also make it possible to reuse data transformations previously generated for the analyzed datasets.
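The kinds of results listed above can be illustrated with a small sketch. It is not the patented method: it assumes each dataset is a plain mapping from column name to a list of string values, scores column pairs with Jaccard similarity over their distinct values, and pairs columns greedily. The names `jaccard`, `compare_datasets`, and the `threshold` parameter are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumption: a dataset is a mapping of column name -> list of cell values.
Dataset = Dict[str, List[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two value sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_datasets(left: Dataset, right: Dataset, threshold: float = 0.5) -> dict:
    """Greedily pair each left column with its most similar unmatched right column.

    Pairs scoring below `threshold` are rejected, so the columns involved
    are reported as left or right orphans.
    """
    matches: List[Tuple[str, str, float]] = []
    matched_right: Set[str] = set()

    for left_name, left_values in left.items():
        best_name: Optional[str] = None
        best_score = 0.0
        for right_name, right_values in right.items():
            if right_name in matched_right:
                continue  # each right column may be paired at most once
            score = jaccard(set(left_values), set(right_values))
            if score > best_score:
                best_name, best_score = right_name, score
        if best_name is not None and best_score >= threshold:
            matches.append((left_name, best_name, best_score))
            matched_right.add(best_name)

    matched_left = {left_name for left_name, _, _ in matches}
    widest = max(len(left), len(right))
    return {
        # Overlapping or related columns, with their similarity scores.
        "matches": matches,
        # Columns present on one side with no counterpart on the other.
        "left_orphans": [c for c in left if c not in matched_left],
        "right_orphans": [c for c in right if c not in matched_right],
        # Crude overall score: summed match scores over the wider column count.
        "overall": sum(s for _, _, s in matches) / widest if widest else 1.0,
    }
```

Calling `compare_datasets(left, right)` on two such mappings returns the matched column pairs, the orphaned columns on each side, and a single overall score. Greedy pairing keeps the sketch short; a real system would more likely use an optimal assignment and richer signals such as column order or document type, which this example does not model.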