Patent attributes
Systems and methods are disclosed to implement a bounded group by query system that computes approximate time-sliced statistics for groups of records in a dataset according to a group by query. In embodiments, a single pass scan of the dataset is performed to accumulate exact results for a maximum number of groups in a result grouping structure (RGS) and approximate results for additional groups in an approximate result grouping structure (ARGS). RGSs and ARGSs are accumulated by a set of accumulator nodes and provided to an aggregator node, which combines the received structures to generate exact or approximate statistical results for at least a subset of the groups in the dataset. Advantageously, the disclosed query system is able to produce approximate results for at least some of the groups in a single pass of the dataset using size-bounded data structures, without predetermining the actual number of groups in the dataset.