Patent attributes
A system includes a production server, a backup server, a telemetry analyzer, a memory, and a hardware processor. The telemetry analyzer takes snapshots of various performance metrics of the production server. The memory stores a log of previous disasters that occurred on the production server. The log includes a snapshot of the production server performance metrics from the time each disaster occurred. The memory also stores recovery scripts for each logged disaster. Each script provides instructions for resolving the linked disaster. The hardware processor uses a machine learning architecture to train an autoencoder. The trained autoencoder receives new snapshots from the telemetry analyzer and generates a reconstruction of the new snapshots. The hardware processor then determines a threshold for distinguishing between server disasters and minor anomalies. This distinction is made by comparing the difference between the reconstruction of the new snapshots and the new snapshots with the threshold.