US Patent 10572478 Multiple query optimization in SQL-on-Hadoop systems

To reduce the overall computation time of a batch of queries, multiple query optimization in SQL-on-Hadoop systems groups multiple MapReduce jobs converted from queries into a single one, thus avoiding redundant computations by taking sharing opportunities of data scan, map function and map output. SQL-on-Hadoop converts a query into a DAG of MapReduce jobs and each map function is a part of query plan composed of a sequence of relational operators. As each map function is a part of query plan which is usually complex and heavy, disclosed method creates a cost model to simulate the computation time which takes both I/O cost for reading/writing input file and intermediate data and CPU cost for the computation of map function into consideration. A heuristic algorithm is disclosed to find near-optimal integrated query plan for each group based on an observation that each query plan is locally optimal.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 10572478 Multiple query optimization in SQL-on-Hadoop systems

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 10572478 Multiple query optimization in SQL-on-Hadoop systems