Patent attributes
A method for context-aware data mining of a text document includes receiving a list of words parsed and preprocessed from an input query; computing a related distributed embedding representation for each word in the list of words using a word embedding model of the text document being queried; aggregating the related distributed embedding representations of all words in the list of words to represent the input query with a single embedding, by using one of an average of all the related distributed embedding representations or a maximum of all the related distributed embedding representations; retrieving a ranked list of document segments of N lines that are similar to the aggregated word embedding representation of the query, where N is a positive integer provided by a user; and returning the list of retrieved segments to the user.
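
The steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration, because the abstract does not specify them: the hand-written table `TOY_EMBEDDINGS` stands in for a word embedding model of the document being queried, segments are taken as non-overlapping windows of N lines, each segment is embedded with the same aggregation as the query, and similarity is measured with cosine similarity. The names `embed_query`, `segment_lines`, `cosine`, and `retrieve` are hypothetical.

```python
import numpy as np

# Hypothetical toy embedding table; the abstract assumes a word embedding
# model of the queried document, which is not reproduced here.
TOY_EMBEDDINGS = {
    "pump": np.array([1.0, 0.0, 0.0]),
    "failure": np.array([0.9, 0.1, 0.0]),
    "valve": np.array([0.8, 0.2, 0.0]),
    "invoice": np.array([0.0, 1.0, 0.0]),
    "payment": np.array([0.0, 0.9, 0.1]),
    "weather": np.array([0.0, 0.0, 1.0]),
    "rain": np.array([0.0, 0.1, 0.9]),
}

DOC_LINES = [
    "pump failure",
    "valve failure",
    "invoice payment",
    "payment invoice",
    "weather rain",
    "rain rain",
]


def embed_query(words, embeddings, mode="average"):
    """Aggregate per-word embeddings into a single vector (average or maximum)."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no word is in the embedding vocabulary")
    stacked = np.stack(vectors)
    if mode == "average":
        return stacked.mean(axis=0)
    if mode == "maximum":
        return stacked.max(axis=0)  # element-wise maximum over the words
    raise ValueError(f"unknown aggregation mode: {mode}")


def segment_lines(lines, n):
    """Split the document into segments of n lines (assumed non-overlapping)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return ["\n".join(lines[i:i + n]) for i in range(0, len(lines), n)]


def cosine(a, b):
    """Cosine similarity (an assumed choice of similarity measure)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def retrieve(query_words, lines, embeddings, n, mode="average", top_k=3):
    """Return up to top_k (score, segment) pairs, most similar first."""
    query_vec = embed_query(query_words, embeddings, mode)
    scored = []
    for segment in segment_lines(lines, n):
        try:
            segment_vec = embed_query(segment.lower().split(), embeddings, mode)
        except ValueError:
            continue  # skip segments with no known words
        scored.append((cosine(query_vec, segment_vec), segment))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Usage: the query words are assumed to be already parsed and preprocessed.
for score, segment in retrieve(["pump", "valve"], DOC_LINES, TOY_EMBEDDINGS, n=2):
    print(f"{score:.3f}", segment.replace("\n", " | "))
```

In this sketch the user-supplied N maps to the `n` argument, and switching `mode` between `"average"` and `"maximum"` selects between the two aggregation options named in the abstract.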