Patent attributes
A system and method for classifying text includes a pre-processor, a knowledge base, and a statistical engine. The pre-processor identifies concepts in the text and creates a structured text object that contains the concepts. The structured text object is then passed to the statistical engine, which applies statistical information stored in nodes of the knowledge base to the structured text object to calculate a set of match scores, each match score representing the relevance of the text to an associated one of a plurality of predefined categories. The pre-processor may be implemented as an interpreter that selects and executes a script containing language- and scenario-specific instructions for performing linguistic and semantic analysis of the text.
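The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: every name in it (StructuredText, LEXICON, KNOWLEDGE_BASE, SCRIPTS, preprocess_en_support, classify) is hypothetical, the concept lexicon and category weights are invented for illustration, and the weighted sum stands in for whatever statistical model the knowledge base nodes actually encode, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class StructuredText:
    """Structured text object: concepts found in the text, with counts."""
    concepts: Dict[str, int] = field(default_factory=dict)


# Hypothetical concept lexicon: surface phrase -> concept label.
LEXICON = {
    "refund": "REFUND",
    "money back": "REFUND",
    "password": "PASSWORD",
    "log in": "LOGIN",
    "login": "LOGIN",
}

# Hypothetical knowledge base: one node per predefined category, each
# holding per-concept weights (a stand-in for "statistical information").
KNOWLEDGE_BASE = {
    "billing": {"REFUND": 2.0},
    "account_access": {"PASSWORD": 1.5, "LOGIN": 1.0},
}


def preprocess_en_support(text: str) -> StructuredText:
    """Assumed script for English support requests: phrase matching only."""
    lowered = text.lower()
    structured = StructuredText()
    for phrase, concept in LEXICON.items():
        hits = lowered.count(phrase)
        if hits:
            structured.concepts[concept] = structured.concepts.get(concept, 0) + hits
    return structured


# The "interpreter" is reduced to a lookup keyed by (language, scenario).
SCRIPTS: Dict[Tuple[str, str], Callable[[str], StructuredText]] = {
    ("en", "support"): preprocess_en_support,
}


def classify(text: str, language: str, scenario: str) -> Dict[str, float]:
    """Select a script, build the structured text object, score each category."""
    script = SCRIPTS[(language, scenario)]
    structured = script(text)
    # Match score per category: weighted sum over the concepts found.
    return {
        category: sum(weights.get(c, 0.0) * n for c, n in structured.concepts.items())
        for category, weights in KNOWLEDGE_BASE.items()
    }


scores = classify("I forgot my password and cannot log in.", "en", "support")
print(scores)
```

Here the highest match score falls on the hypothetical "account_access" category, since the text contains the PASSWORD and LOGIN concepts and none tied to "billing". Keeping the pre-processing scripts in a registry separate from the scoring step mirrors the separation the abstract draws between the pre-processor and the statistical engine.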