A method of searching a plurality of data files, wherein each data file includes a plurality of features. The method: determines a plurality of feature groups, wherein each feature group includes n features and n is an integer of 2 or more; expresses each data file as a file vector, wherein each component of the vector indicates the frequency of a feature group within the data file, wherein the n features which constitute a feature group do not have to be located adjacent to one another; expresses a search query using the feature groups as a vector; and searches the plurality of data files by comparing the search query expressed as a vector with the file vectors.