Patent attributes
One embodiment of the present invention provides a system that facilitates disassembling a structure tree containing structure information for a document. During operation, the system assigns unique identifiers to nodes in the structure tree. The system also selectively labels each node in the structure tree with a unique pathname from the root of the structure tree, wherein the pathname specifies the position of the node in the structure tree. Next, the system merges nodes from the structure tree into components of the document, which contain content items for the document, instead of storing the structure tree separately from the components. In this way, the components can be incorporated into or extracted from the document without losing associated structure information for the document.
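The three operations described above (assigning identifiers, labeling nodes with pathnames, and merging nodes into components) can be illustrated with a minimal sketch. It is not the claimed implementation. The names `Node`, `disassemble`, and the `component` field are hypothetical, the pathname is assumed to be a slash-separated list of child indices, every node is labeled rather than a selected subset, and each component is assumed to receive a record for every node on the path from the root to its content.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    """Hypothetical structure-tree node; not taken from the source."""
    tag: str
    children: List["Node"] = field(default_factory=list)
    component: Optional[str] = None  # component holding this node's content item
    node_id: Optional[int] = None
    path: Optional[str] = None


def disassemble(root: Node) -> Dict[str, List[dict]]:
    """Assign ids and pathnames, then copy node records into components."""
    ids = count(1)
    components: Dict[str, List[dict]] = {}

    def visit(node: Node, path: str) -> Set[str]:
        node.node_id = next(ids)   # unique identifier, assigned in pre-order
        node.path = path           # position of the node relative to the root
        record = {"id": node.node_id, "path": path, "tag": node.tag}

        # Collect every component that holds content under this node.
        owners: Set[str] = set()
        if node.component is not None:
            owners.add(node.component)
        for index, child in enumerate(node.children):
            owners |= visit(child, f"{path}/{index}")

        # Store the node inside each such component (assumed merge rule).
        for owner in owners:
            components.setdefault(owner, []).append(record)
        return owners

    visit(root, "0")
    return components


# Example document: two paragraphs on different pages and one figure.
root = Node("Document", [
    Node("Section", [
        Node("P", component="page1"),
        Node("P", component="page2"),
    ]),
    Node("Figure", component="page2"),
])
components = disassemble(root)
```

Under these assumptions, `components["page1"]` carries records for the `Document`, `Section`, and first `P` nodes, so the page could be extracted and its ancestry in the structure tree rebuilt from the stored pathnames alone.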