Patent attributes
A harvester is disclosed for harvesting metadata of managed objects (files and directories) across file systems which are generally not interoperable in an enterprise environment. Harvested metadata may include 1) file system attributes such as size, owner, recency; 2) content-specific attributes such as the presence or absence of various keywords (or combinations of keywords) within documents as well as concepts comprised of natural language entities; 3) synthetic attributes such as mathematical checksums or hashes of file contents; and 4) high-level semantic attributes that serve to classify and categorize files and documents. The classification itself can trigger an action in compliance with a policy rule. Harvested metadata are stored in a metadata repository to facilitate the automated or semi-automated application of policies.