Patent attributes
A system and method are disclosed for analyzing documents, such as posts, online reviews, and comments, by topic in order to determine the general sentiment of users. Topics and their corresponding sentiment polarities are extracted from the documents, each of which is regarded as a series of topics. The sentiment for a topic is represented by a quadruple (k, so, h, i), where k is the topic, so is the sentiment opinion, h is the holder of the comment or post, and i is the document. A quintuple (k, sup, p, n, ne) describes each topic and its corresponding sentiments and is stored in a set S, where sup indicates the frequency of the topic, and p (positive), n (negative), and ne (neutral) are the different types of user opinions. In the quintuple set S, every topic is related to three kinds of sentiment opinion (positive, negative, and neutral), which enables determination of the popular topics in the documents as well as the users' sentiment polarities.
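The relationship between the quadruples and the quintuple set S can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It assumes that sup is the number of quadruples mentioning a topic, that p, n, and ne are per-topic counts of each opinion type, and that so takes one of the string labels "positive", "negative", or "neutral". The names Opinion, TopicSummary, and build_summary_set are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from typing import Iterable, NamedTuple

# Assumed label values for the sentiment opinion "so" (not given in the source).
LABELS = ("positive", "negative", "neutral")


class Opinion(NamedTuple):
    """Quadruple (k, so, h, i) for one topic mention."""
    k: str   # topic
    so: str  # sentiment opinion, one of LABELS (assumption)
    h: str   # holder of the comment or post
    i: str   # document identifier


class TopicSummary(NamedTuple):
    """Quintuple (k, sup, p, n, ne) for one topic."""
    k: str    # topic
    sup: int  # frequency of the topic (assumed: number of quadruples)
    p: int    # count of positive opinions (assumed to be a count)
    n: int    # count of negative opinions
    ne: int   # count of neutral opinions


def build_summary_set(opinions: Iterable[Opinion]) -> list[TopicSummary]:
    """Aggregate quadruples into quintuples, most frequent topic first."""
    counts = defaultdict(Counter)
    for op in opinions:
        if op.so not in LABELS:
            raise ValueError(f"unknown sentiment label: {op.so!r}")
        counts[op.k][op.so] += 1
    summaries = [
        TopicSummary(k, sum(c.values()), c["positive"], c["negative"], c["neutral"])
        for k, c in counts.items()
    ]
    # Sorting by sup puts the most popular topics at the front of S.
    return sorted(summaries, key=lambda s: s.sup, reverse=True)


# Hypothetical input: four topic mentions from three documents.
quadruples = [
    Opinion("battery", "positive", "user1", "doc1"),
    Opinion("battery", "negative", "user2", "doc2"),
    Opinion("screen", "neutral", "user1", "doc1"),
    Opinion("battery", "positive", "user3", "doc3"),
]
S = build_summary_set(quadruples)
```

Under these assumptions, the first entries of S are the popular topics, and comparing p, n, and ne within an entry gives the users' sentiment polarity for that topic.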