Patent attributes
A computer-implemented method includes identifying duplicate items of data in a dataset on which a computation task is to be performed, wherein the identifying comprises segmenting the dataset into multiple segments and performing a deduplication operation on each of the multiple segments, and removing the duplicate items of data from the computation task. Such a method also includes performing the computation task on the remaining items of data in the dataset, wherein the remaining items of data comprise unique items of data in the dataset, and aggregating the results of the computation task with memoized computation results corresponding to the duplicate items of data to generate a complete computation result for the dataset. Further, such a method includes outputting the complete computation result for the dataset to a user.
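
The steps of the method described above can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`split_into_segments`, `run_task_with_dedup`), the fixed `segment_size`, and the per-item `task` callable are all hypothetical, chosen only for illustration. The sketch further assumes that items are hashable, that the task is deterministic per item, and that one memo table is shared across segments, none of which the abstract specifies.

```python
from typing import Any, Callable, Dict, Hashable, List, Sequence, Tuple


def split_into_segments(dataset: Sequence[Hashable], segment_size: int) -> List[Sequence[Hashable]]:
    """Split the dataset into consecutive segments of at most segment_size items."""
    if segment_size < 1:
        raise ValueError("segment_size must be at least 1")
    return [dataset[i:i + segment_size] for i in range(0, len(dataset), segment_size)]


def run_task_with_dedup(
    dataset: Sequence[Hashable],
    task: Callable[[Hashable], Any],
    segment_size: int = 4,
) -> Tuple[List[Any], int]:
    """Run `task` once per unique item and reuse memoized results for duplicates.

    Returns the complete per-item results (in dataset order) and the number
    of times `task` was actually invoked.
    """
    memo: Dict[Hashable, Any] = {}  # memoized computation results (assumed shared across segments)
    task_calls = 0

    for segment in split_into_segments(dataset, segment_size):
        # Deduplicate within the segment; dict.fromkeys keeps first-seen order.
        unique_items = list(dict.fromkeys(segment))
        for item in unique_items:
            # Items already computed in an earlier segment are also skipped.
            if item not in memo:
                memo[item] = task(item)
                task_calls += 1

    # Aggregate: every position, duplicate or not, reads its result from the memo.
    results = [memo[item] for item in dataset]
    return results, task_calls


if __name__ == "__main__":
    data = [3, 1, 3, 2, 1, 1, 4, 2]
    results, calls = run_task_with_dedup(data, lambda x: x * x, segment_size=4)
    print(results, calls)  # prints: [9, 1, 9, 4, 1, 1, 16, 4] 4
```

In this example the dataset holds eight items but only four distinct values, so the task runs four times and the remaining four results are filled in from the memo during aggregation.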