Patent attributes
Script and command line exploitation detection is described. Initially, an exploitation detection system collects data describing scripts and command lines launched by various computing devices. The exploitation detection system clusters the scripts and command lines based on a measure of similarity, namely a Bilingual Evaluation Understudy (BLEU) score. Given the clusters and the data describing the scripts and command lines, the exploitation detection system generates encodings of the scripts and command lines for input to a machine learning model, e.g., an autoencoder. From this model, the exploitation detection system receives, for each script or command line, a measure of how unlikely it is that the corresponding process would launch that script or command line. The exploitation detection system then ranks the scripts and command lines according to this measure of unlikeliness. In this way, the exploitation detection system can display indications of the scripts and command lines that are least likely to have been launched by their respective processes.
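The clustering and ranking steps can be illustrated with a minimal Python sketch. It is not the described system's implementation. Several details are assumptions made only for illustration: the simplified BLEU variant (unigrams and bigrams with add-one smoothing), the symmetric `similarity` wrapper, the greedy single-pass `cluster` routine, the 0.5 threshold, and the sample command lines. The autoencoder is not sketched; the unlikeliness values passed to `rank_by_unlikeliness` are hypothetical placeholders standing in for model output.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU (assumption: add-one smoothed precisions)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    # Brevity penalty for candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision)


def similarity(a, b):
    """BLEU is asymmetric, so take the minimum of both directions (assumption)."""
    return min(bleu(a, b), bleu(b, a))


def cluster(commands, threshold=0.5):
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    clusters = []
    for cmd in commands:
        for members in clusters:
            if similarity(cmd, members[0]) >= threshold:
                members.append(cmd)
                break
        else:
            clusters.append([cmd])
    return clusters


def rank_by_unlikeliness(scored):
    """Sort (command, unlikeliness) pairs so the most unlikely come first."""
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical command lines, for illustration only.
    commands = [
        "powershell -nop -w hidden -enc AAAA",
        "powershell -nop -w hidden -enc BBBB",
        "cmd /c whoami",
    ]
    for group in cluster(commands):
        print(group)
    # Hypothetical unlikeliness scores standing in for model output.
    print(rank_by_unlikeliness(list(zip(commands, [0.2, 0.9, 0.4]))))
```

Comparing each command only against the first member of each cluster keeps the sketch short; a production system would more likely use a full pairwise similarity matrix or a standard clustering algorithm.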