Patent attributes
A method for processing a data stream to identify a structure of the data stream includes receiving the data stream including a sequence of characters, retrieving a set of rules for encoding characters into at least one token, and parsing the data stream. Parsing includes generating a plurality of tokens according to the set of rules, each token representing a corresponding portion of the sequence of characters. Parsing also includes forming a sequence of tokens from the plurality of tokens and assigning at least one attribute value to a token, the attribute value describing the portion of the sequence of characters that the token represents. The sequence of tokens is assigned to a cluster by determining that the sequence of tokens matches a pattern by which the cluster is characterized. The sequence of tokens is merged with the cluster. A representation of the cluster is output.
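The steps recited above (tokenizing by rule, attaching attribute values, matching a token sequence against a cluster pattern, merging, and outputting a representation) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the rule set `RULES`, the token kinds, the use of token length as the attribute value, the choice to treat the pattern as the exact tuple of non-whitespace token kinds, and the creation of a new cluster when no existing pattern matches.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule set: (token kind, regex). Rules are tried in order and
# the first match wins; the last rule is a catch-all for any single character.
RULES = [
    ("NUM", re.compile(r"\d+")),
    ("WORD", re.compile(r"[A-Za-z]+")),
    ("SPACE", re.compile(r"\s+")),
    ("SYM", re.compile(r".", re.DOTALL)),
]


@dataclass(frozen=True)
class Token:
    kind: str    # which rule produced the token
    text: str    # the portion of the character sequence it represents
    length: int  # illustrative attribute value describing that portion


def tokenize(stream: str) -> list[Token]:
    """Encode a character sequence into a sequence of tokens using RULES."""
    tokens = []
    pos = 0
    while pos < len(stream):
        for kind, rx in RULES:
            m = rx.match(stream, pos)
            if m:
                tokens.append(Token(kind, m.group(), len(m.group())))
                pos = m.end()
                break
    return tokens


@dataclass
class Cluster:
    pattern: tuple[str, ...]
    members: list[list[Token]] = field(default_factory=list)


def assign_to_cluster(clusters: dict[tuple[str, ...], Cluster],
                      tokens: list[Token]) -> Cluster:
    """Merge a token sequence into the cluster whose pattern it matches.

    Assumption: the pattern is the exact tuple of non-whitespace token kinds,
    and a new cluster is created when no existing pattern matches.
    """
    pattern = tuple(t.kind for t in tokens if t.kind != "SPACE")
    cluster = clusters.setdefault(pattern, Cluster(pattern))
    cluster.members.append(tokens)
    return cluster


def describe(cluster: Cluster) -> str:
    """Return a simple textual representation of a cluster."""
    return f"{' '.join(cluster.pattern)} ({len(cluster.members)} sequences)"


if __name__ == "__main__":
    clusters: dict[tuple[str, ...], Cluster] = {}
    for line in ["error 42: disk full", "error 7: disk busy", "ok"]:
        assign_to_cluster(clusters, tokenize(line))
    for cluster in clusters.values():
        print(describe(cluster))
```

In this sketch the two "error" lines yield the same sequence of token kinds and are merged into one cluster, while "ok" forms a cluster of its own. A looser matching criterion, such as tolerating optional tokens or comparing attribute values, would fit the same outline.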