Patent attributes
Disclosed embodiments provide techniques for detecting log file manipulation. Log file terms are identified in a set of known good log files. A frequency metric is computed for the log file terms, and one or more clusters are formed that represent the terms and their corresponding frequency metric values within the set of known good log files. New log files are then obtained from an operational computer system. The frequency metric for those terms in the new log files is computed and checked against the established clusters. A score is computed based on how similar the new log files are to the set of known good log files, by comparing the frequency metric for terms in the new log files with the data in the previously formed cluster(s). In response to the score exceeding a predetermined threshold, one or more mitigation actions are taken.
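
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: the frequency metric is taken to be plain relative term frequency, each "cluster" is simplified to a per-term mean and standard deviation over the known good logs (a stand-in for a real clustering step), and the score is computed as a deviation from the baseline so that a higher value means less similarity. The names `term_frequencies`, `build_baseline`, `deviation_score`, `check_log`, `min_std`, and the threshold value are all hypothetical.

```python
import math
import re
from collections import Counter


def term_frequencies(log_text):
    """Relative frequency of each alphabetic term in one log (assumed metric)."""
    terms = re.findall(r"[a-z_]+", log_text.lower())
    if not terms:
        return {}
    counts = Counter(terms)
    return {term: count / len(terms) for term, count in counts.items()}


def build_baseline(good_logs):
    """Summarize known good logs as per-term (mean, std) of the frequency metric.

    Assumption: one summary per term stands in for the cluster(s).
    """
    profiles = [term_frequencies(text) for text in good_logs]
    vocabulary = set().union(*profiles)
    baseline = {}
    for term in vocabulary:
        values = [profile.get(term, 0.0) for profile in profiles]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        baseline[term] = (mean, math.sqrt(variance))
    return baseline


def deviation_score(log_text, baseline, min_std=0.01):
    """Average absolute z-score of the new log's term frequencies.

    Only terms seen in the known good logs are compared; min_std keeps
    the division stable when a term's frequency never varied.
    """
    if not baseline:
        return 0.0
    profile = term_frequencies(log_text)
    total = 0.0
    for term, (mean, std) in baseline.items():
        total += abs(profile.get(term, 0.0) - mean) / max(std, min_std)
    return total / len(baseline)


def check_log(log_text, baseline, threshold=3.0):
    """Return (score, flagged); flagged means a mitigation action is warranted."""
    score = deviation_score(log_text, baseline)
    return score, score > threshold
```

In use, `build_baseline` would be run once over the known good log files, and `check_log` would be called on each new log file; a flagged result would then trigger whatever mitigation action the deployment defines, such as raising an alert. The threshold of 3.0 is arbitrary and would need tuning against real data.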