Patent attributes
A method and device for n-gram identification and extraction are disclosed. The method includes identifying at least one n-gram from a sentence input by a user, based on a confidence score associated with each of the at least one n-gram. The method further includes determining a direction context entropy coefficient for each of the at least one n-gram. The method includes iteratively expanding one or more of the at least one n-gram, by the smallest n-gram unit at each iteration and in a predefined direction in the sentence, to generate at least one expanded n-gram, based on an associated direction context entropy coefficient. The method further includes extracting, at each expanding iteration, one or more of the at least one expanded n-gram based on an associated confidence score. The method includes grouping semantically linked n-grams from the one or more of the at least one expanded n-gram.
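
The expand-and-extract loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how the confidence score or the direction context entropy coefficient is computed, so everything concrete here is an assumption made for illustration: the toy reference corpus `CORPUS`, the function names `confidence`, `context_entropy`, and `expand_ngram`, the two thresholds, the use of a single token as the smallest n-gram unit, and the choice of Shannon entropy over neighbouring tokens as the coefficient. The grouping of semantically linked n-grams is left out because the abstract gives no criterion for it.

```python
import math
from collections import Counter

# Hypothetical reference corpus; the abstract does not say where statistics come from.
CORPUS = [
    "new york city is large".split(),
    "new york city never sleeps".split(),
    "i love new york city".split(),
    "new york times is a paper".split(),
]

def find_occurrences(ngram, sent):
    """Return the start indices at which ngram occurs in the token list sent."""
    n = len(ngram)
    return [i for i in range(len(sent) - n + 1) if tuple(sent[i:i + n]) == ngram]

def confidence(ngram, corpus):
    """Assumed confidence score: share of corpus sentences containing the n-gram."""
    hits = sum(1 for sent in corpus if find_occurrences(ngram, sent))
    return hits / len(corpus)

def context_entropy(ngram, corpus, direction):
    """Assumed coefficient: Shannon entropy (bits) of the tokens adjacent to the
    n-gram on the given side. Low entropy means a predictable neighbour."""
    counts = Counter()
    for sent in corpus:
        for i in find_occurrences(ngram, sent):
            j = i - 1 if direction == "left" else i + len(ngram)
            if 0 <= j < len(sent):
                counts[sent[j]] += 1
    total = sum(counts.values())
    if total == 0:
        return math.inf  # no evidence in the corpus, so never expand
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def expand_ngram(tokens, start, end, corpus, direction="right",
                 max_entropy=0.9, min_confidence=0.5):
    """Grow tokens[start:end] one token per iteration in a fixed direction while
    the context entropy stays at or below max_entropy; collect every expanded
    n-gram whose confidence reaches min_confidence."""
    extracted = []
    while True:
        ngram = tuple(tokens[start:end])
        if context_entropy(ngram, corpus, direction) > max_entropy:
            break
        if direction == "right":
            if end >= len(tokens):
                break
            end += 1
        else:
            if start <= 0:
                break
            start -= 1
        expanded = tuple(tokens[start:end])
        if confidence(expanded, corpus) >= min_confidence:
            extracted.append(expanded)
    return extracted

tokens = "she moved to new york city last year".split()
seed = tokens.index("new")
print(expand_ngram(tokens, seed, seed + 1, CORPUS))
```

With this toy corpus, the seed "new" grows to the right as long as the following token is predictable, and the loop stops once the right-hand context of the current n-gram becomes too varied. A different scoring model or threshold would change which expanded n-grams are extracted.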