Patent attributes
There is provided a method and system for generating a data dictionary for searching data items stored in an information resource. In one embodiment, the system generates a list of synonyms for keywords entered in search queries to the system. A keyword and a synonym form a token pair. Token pairs are evaluated according to a bidirectional divergence value calculated for the distributions of search results, wherein the searches are based on the tokens of each pair. Token pairs are then selected based on the divergence value. The selected token pairs are compiled into a data dictionary. In one embodiment, the data dictionary is a synonym dictionary used to expand user search queries to find matching items.
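The evaluation and selection steps described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the bidirectional divergence is taken to be the sum of the two Kullback-Leibler divergences between the result distributions, each search result is represented only by a category label, a small additive smoothing constant avoids division by zero, and a pair is kept when its divergence falls at or below a threshold. The names `to_distribution`, `bidirectional_divergence`, `build_dictionary`, `TOY_INDEX`, and `toy_search`, as well as the threshold value, are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def to_distribution(categories, support, smoothing=1e-6):
    """Convert a list of result categories into a smoothed probability distribution."""
    counts = Counter(categories)
    total = len(categories) + smoothing * len(support)
    return {c: (counts[c] + smoothing) / total for c in support}


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) over a shared support."""
    return sum(p[c] * math.log(p[c] / q[c]) for c in p)


def bidirectional_divergence(results_a, results_b):
    """Assumed form of the bidirectional value: KL(p || q) + KL(q || p)."""
    if not results_a or not results_b:
        return math.inf  # no results for one token: treat the pair as unrelated
    support = sorted(set(results_a) | set(results_b))
    p = to_distribution(results_a, support)
    q = to_distribution(results_b, support)
    return kl(p, q) + kl(q, p)


def build_dictionary(token_pairs, search, threshold=0.5):
    """Keep the (keyword, synonym) pairs whose result distributions are close.

    `search` is any callable that maps a token to a list of category labels.
    The threshold and the "lower is better" rule are assumptions of this sketch.
    """
    dictionary = {}
    for keyword, synonym in token_pairs:
        divergence = bidirectional_divergence(search(keyword), search(synonym))
        if divergence <= threshold:
            dictionary.setdefault(keyword, []).append(synonym)
    return dictionary


# Hypothetical toy data: category labels of the results returned for each token.
TOY_INDEX = {
    "couch": ["furniture"] * 8 + ["home"] * 2,
    "sofa": ["furniture"] * 7 + ["home"] * 3,
    "bench": ["sports"] * 9 + ["furniture"] * 1,
}


def toy_search(token):
    """Stand-in for a real search back end."""
    return TOY_INDEX.get(token, [])


synonym_dictionary = build_dictionary(
    [("couch", "sofa"), ("couch", "bench")], toy_search
)
```

In this toy data, "couch" and "sofa" return results with similar category distributions, so the pair is kept, while "couch" and "bench" diverge strongly and the pair is dropped. A real system would likely use a richer result representation and a tuned threshold.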