Patent attributes
The subject disclosure presents a natural language processing engine that analyzes an input sentence comprising one or more clauses and generates a plurality of semantic structures for the sentence and its component clauses. The engine statistically parses the input sentence to generate a syntactic structure, examines the syntactic structure of phrases and subordinate clauses within the input sentence, and generates tuples, each representing a subject, verb, object, indirect object, supplement, type, etc. Each part of a tuple is a reference to an entity in an external knowledge base. Disclosed operations include linking a plurality of entities identified in the syntactic structure with corresponding entities found in the external knowledge base, performing co-reference resolution, filtering the references from mentioned entities to external entities by semantic relations, and exporting the set of output tuples.
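By way of illustration only, the tuple structure and the linking, co-reference, filtering, and exporting operations summarized above might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: it skips the statistical parsing step and takes clauses whose roles are already labeled. The knowledge base contents, the identifier format (`kb:...`), the names `SemanticTuple`, `link_mention`, `passes_filter`, and `extract_tuples`, the most-recent-subject pronoun rule, and the subject constraint table are all hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional

# Hypothetical toy knowledge base: lowercase surface form -> external entity ID.
TOY_KB: Dict[str, str] = {
    "alice": "kb:Person/Alice",
    "bob": "kb:Person/Bob",
    "book": "kb:Object/Book",
    "give": "kb:Action/Give",
}

PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

# Hypothetical semantic constraint: verb ID -> required prefix of the subject ID.
SUBJECT_CONSTRAINTS: Dict[str, str] = {"kb:Action/Give": "kb:Person/"}


@dataclass(frozen=True)
class SemanticTuple:
    """One output tuple; every role holds a knowledge base ID or None."""
    subject: Optional[str]
    verb: Optional[str]
    object: Optional[str] = None
    indirect_object: Optional[str] = None
    supplement: Optional[str] = None
    type: str = "statement"


def link_mention(mention: Optional[str], last_entity: Optional[str]) -> Optional[str]:
    """Link a mention to the toy knowledge base.

    Pronouns are resolved naively to the most recently linked subject,
    which stands in for a real co-reference resolver.
    """
    if mention is None:
        return None
    key = mention.lower()
    if key in PRONOUNS:
        return last_entity
    return TOY_KB.get(key)


def passes_filter(t: SemanticTuple) -> bool:
    """Keep tuples with a linked subject and verb that satisfy the constraint table."""
    if t.subject is None or t.verb is None:
        return False
    required = SUBJECT_CONSTRAINTS.get(t.verb)
    return required is None or t.subject.startswith(required)


def extract_tuples(clauses: List[Dict[str, str]]) -> List[dict]:
    """Link, resolve, filter, and export tuples for pre-labeled clauses."""
    exported: List[dict] = []
    last_subject: Optional[str] = None
    for clause in clauses:
        subject = link_mention(clause.get("subject"), last_subject)
        candidate = SemanticTuple(
            subject=subject,
            verb=link_mention(clause.get("verb"), None),
            object=link_mention(clause.get("object"), last_subject),
            indirect_object=link_mention(clause.get("indirect_object"), last_subject),
            supplement=link_mention(clause.get("supplement"), None),
            type=clause.get("type", "statement"),
        )
        if subject is not None:
            last_subject = subject  # remembered for pronouns in later clauses
        if passes_filter(candidate):
            exported.append(asdict(candidate))
    return exported


if __name__ == "__main__":
    clauses = [
        {"subject": "Alice", "verb": "give", "object": "book", "indirect_object": "Bob"},
        {"subject": "She", "verb": "give", "object": "book"},
        {"subject": "book", "verb": "give", "object": "Bob"},
    ]
    for row in extract_tuples(clauses):
        print(row)
```

In this sketch the pronoun "She" in the second clause resolves to the entity linked for "Alice", and the third clause is dropped because its subject does not satisfy the hypothetical constraint on the verb. A practical system would replace each stand-in with a parser, an entity linker, and a co-reference resolver of its own choosing.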