Patent attributes
Implementations of the present disclosure include methods, systems, and computer-readable storage media for: receiving a first data set and a second data set, each including structured data in a plurality of columns; for each of the first data set and the second data set, inputting each column into an encoder specific to a column type of the respective column, the encoder providing encoded data for the first data set and the second data set, respectively; providing a first multi-dimensional vector based on the encoded data of the first data set; providing a second multi-dimensional vector based on the encoded data of the second data set; and outputting the first multi-dimensional vector and the second multi-dimensional vector to a loss function, the loss function processing the first multi-dimensional vector and the second multi-dimensional vector to provide an output, the output representing matched data points between the first data set and the second data set.
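The flow recited above (type-specific column encoders, one vector per side, and a loss function whose output indicates matches) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the disclosure: the column types `"string"` and `"numeric"`, the hashed character-trigram encoder, the log-scaled numeric encoder, the row-level vectors, the cosine-distance stand-in for the loss function, and all function names and sample records are hypothetical. A learned encoder and a trained loss would normally take their place.

```python
import math
import zlib

DIM = 16  # hypothetical width of each string encoding


def encode_string(value, dim=DIM):
    """Hypothetical string encoder: hash character trigrams into counts."""
    text = f"  {value.lower()} "
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        # crc32 is deterministic across runs, unlike the built-in hash().
        bucket = zlib.crc32(text[i:i + 3].encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec


def encode_number(value):
    """Hypothetical numeric encoder: one signed, log-scaled feature."""
    return [math.copysign(math.log1p(abs(value)), value)]


# One encoder per column type (the type names are assumptions).
ENCODERS = {"string": encode_string, "numeric": encode_number}


def encode_row(row, column_types):
    """Send each column to the encoder for its type, then concatenate."""
    vec = []
    for column, col_type in column_types.items():
        vec.extend(ENCODERS[col_type](row[column]))
    return vec


def cosine_distance_loss(u, v):
    """Stand-in loss: 1 - cosine similarity; lower suggests a closer match."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def match(first, second, types_first, types_second):
    """Pair each row of the first set with the lowest-loss row of the second."""
    vecs_first = [encode_row(r, types_first) for r in first]
    vecs_second = [encode_row(r, types_second) for r in second]
    pairs = []
    for i, u in enumerate(vecs_first):
        losses = [cosine_distance_loss(u, v) for v in vecs_second]
        j = min(range(len(losses)), key=losses.__getitem__)
        pairs.append((i, j, losses[j]))
    return pairs


# Hypothetical sample data: column names differ, column types line up.
first = [{"payer": "ACME Corp", "amount": 1200.0},
         {"payer": "Globex GmbH", "amount": 75.5}]
second = [{"customer": "globex gmbh", "total": 75.5},
          {"customer": "Acme Corp", "total": 1200.0}]
types_first = {"payer": "string", "amount": "numeric"}
types_second = {"customer": "string", "total": "numeric"}

for i, j, loss in match(first, second, types_first, types_second):
    print(f"first[{i}] <-> second[{j}] (loss {loss:.3f})")
```

In this sketch the two column-type mappings must list corresponding columns in the same order, so that both sides produce vectors of equal length. The sample names differ only in letter case, which the string encoder normalizes, so each record pairs with its counterpart at a loss of approximately zero.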