Patent attributes
In accordance with one embodiment of the present disclosure, a method for determining the similarity between a first data set and a second data set is provided. The method includes performing an entropy analysis on the first and second data sets to produce a first entropy result, wherein the first data set comprises data representative of a first one or more computer files of known content and the second data set comprises data representative of a one or more computer files of unknown content; analyzing the first entropy result; and if the first entropy result is within a predetermined threshold, identifying the second data set as substantially related to the first data set.