Patent attributes
Methods and systems, including computer programs encoded on computer-readable storage media, for identifying quotations occurring in resources. A method includes identifying first and second quotations that occur in particular resources in a set of resources, each particular resource being classified as a quotation-related resource; determining, for each of the first and second quotations, a number of occurrences of the quotation in the set and a number of different resources in the set in which the quotation occurs; determining that the first quotation and the second quotation are (i) semantically related and (ii) not identical; selecting a representative quotation from among the first quotation and the second quotation; and storing the representative quotation, the number of occurrences of the representative quotation, and the number of different resources in which the representative quotation occurs, in association with an entity to which the representative quotation is attributed.
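The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say how semantic relatedness is determined or how the representative quotation is chosen, so the sketch assumes a character-level similarity ratio from the standard library's `difflib` as a stand-in for semantic relatedness, and assumes the representative is the variant with more occurrences (ties broken by the number of distinct resources). All names (`QuotationStats`, `collect_stats`, `related_but_not_identical`, `select_representative`, `store_representative`) and the 0.8 threshold are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class QuotationStats:
    text: str
    occurrences: int      # total occurrences across the set of resources
    resource_count: int   # number of different resources containing the quotation


def collect_stats(resources):
    """Count occurrences and distinct resources for each quotation.

    `resources` maps a resource id to the list of quotation strings found in
    that resource; every resource is assumed to be already classified as
    quotation-related.
    """
    occurrences = defaultdict(int)
    containing = defaultdict(set)
    for resource_id, quotations in resources.items():
        for quotation in quotations:
            occurrences[quotation] += 1
            containing[quotation].add(resource_id)
    return {
        q: QuotationStats(q, occurrences[q], len(containing[q]))
        for q in occurrences
    }


def related_but_not_identical(a, b, threshold=0.8):
    # Assumption: string similarity stands in for semantic relatedness.
    if a == b:
        return False
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def select_representative(first, second):
    # Assumed rule: prefer more occurrences, then more distinct resources.
    return max(first, second, key=lambda s: (s.occurrences, s.resource_count))


def store_representative(store, entity, first, second):
    """Store the representative quotation and its counts under `entity`."""
    if related_but_not_identical(first.text, second.text):
        store[entity] = select_representative(first, second)
    return store
```

As a usage illustration with made-up data: if one variant of a quotation appears three times across two resources and a slightly different variant appears once in a third resource, `collect_stats` yields the counts for each, and `store_representative` keeps the more frequent variant, together with its two counts, under the entity to which it is attributed.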