Patent attributes
XML, a language for semi-structured documents, has emerged as the core of the web services architecture and plays a crucial role in messaging systems, databases, and document processing. However, XML processing has a reputation for poor performance. A number of optimizations have been developed to address this problem from different perspectives, but none has been entirely satisfactory. Parallel XML parsing takes advantage of the growing prevalence of multicore architectures in all sectors of the computer market and yields significant performance improvements. The design consists of an initial preparsing phase that determines the structure of the XML document (or other data document), followed by a full, parallel parse. The results of the preparsing phase are used to help partition the XML document for data-parallel processing. The parallel parsing phase is, for example, a modification of the libxml2 XML parser, which demonstrates that the approach applies to real-world, production-quality parsers. Empirical study shows that the parallel XML parsing algorithm significantly improves XML parsing performance and scales well.
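The two-phase design described above (a lightweight preparse that recovers document structure, then a data-parallel full parse over the resulting partitions) can be illustrated with a minimal sketch. This is not the libxml2-based implementation; it uses Python's standard `xml.etree.ElementTree` as a stand-in for the full parser, and the names `preparse`, `partition`, `parse_chunk`, and `parallel_parse` are hypothetical. It assumes a simplified input: no comments, CDATA sections, or namespace prefixes, no `>` inside attribute values, and no significant text directly between the root's children. Root attributes are ignored.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Matches start, end, and self-closing tags; skips <?...?> and <!...> constructs.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


def preparse(text):
    """Phase 1: scan tags only and return (start, end) offsets of each child of the root."""
    spans, depth, start = [], 0, None
    for m in TAG.finditer(text):
        closing = m.group(1) == "/"
        self_closing = m.group(3) == "/"
        if closing:
            depth -= 1
            if depth == 1:              # a child of the root just closed
                spans.append((start, m.end()))
        elif self_closing:
            if depth == 1:              # an empty child of the root
                spans.append((m.start(), m.end()))
        else:
            if depth == 1:              # a child of the root just opened
                start = m.start()
            depth += 1
    return spans


def partition(spans, workers):
    """Group subtree spans into at most `workers` contiguous chunks."""
    size = max(1, -(-len(spans) // workers))  # ceiling division
    return [spans[i:i + size] for i in range(0, len(spans), size)]


def parse_chunk(text, chunk):
    """Phase 2 (per worker): fully parse each subtree in the chunk."""
    return [ET.fromstring(text[s:e]) for s, e in chunk]


def parallel_parse(text, workers=4):
    """Preparse, partition, parse chunks concurrently, and reassemble in document order."""
    spans = preparse(text)
    root = ET.Element(TAG.search(text).group(2))  # root attributes ignored in this sketch
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in submission order, so document order is preserved.
        for subtrees in pool.map(lambda c: parse_chunk(text, c), partition(spans, workers)):
            root.extend(subtrees)
    return root


if __name__ == "__main__":
    doc = '<?xml version="1.0"?><root><a id="1"><b/></a><c>x</c><d/></root>'
    tree = parallel_parse(doc, workers=2)
    print([child.tag for child in tree])
```

The sketch only shows the shape of the approach: the preparse touches tags without building a tree, and its output determines where the document can be split safely. In CPython, threads are unlikely to yield a real speedup because of the global interpreter lock, so a process pool or a native parser would be needed to observe the performance gains the text describes.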