Patent attributes
The present disclosure involves systems, software, and computer-implemented methods for distributed monitoring in clusters with self-healing. One example method includes determining, by a monitoring agent of a first node of a cluster, a self-monitoring check to perform for the first node. The first node is one of multiple nodes included in the cluster. In response to receiving a successful status for the self-monitoring check, a registry in the first node is updated with the successful status. The registry includes node statuses for each node in the cluster. In response to receiving an unsuccessful status for the self-monitoring check, the monitoring agent performs at least one corrective action on the first node and updates the registry in the first node with a result of the at least one corrective action. The registry is broadcast to each of the other nodes in the cluster as an updated registry.
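
The example method can be illustrated with a minimal sketch in Python. It is not the disclosed implementation: the names `Node`, `MonitoringAgent`, `run_once`, and `receive_registry`, the status strings "ok", "healed", and "failed", and the policy of replacing a peer's registry wholesale on broadcast are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical cluster node that keeps its own copy of the registry."""
    name: str
    registry: Dict[str, str] = field(default_factory=dict)  # node name -> status
    peers: List["Node"] = field(default_factory=list)       # the other nodes

    def receive_registry(self, updated: Dict[str, str]) -> None:
        # Assumed merge policy: adopt a copy of the sender's registry as-is.
        self.registry = dict(updated)


class MonitoringAgent:
    """Hypothetical agent that checks, heals, and reports for one node."""

    def __init__(
        self,
        node: Node,
        check: Callable[[], bool],
        corrective_action: Callable[[], bool],
    ) -> None:
        self.node = node
        self.check = check                          # self-monitoring check
        self.corrective_action = corrective_action  # returns True if it worked

    def run_once(self) -> str:
        if self.check():
            # Successful check: record the successful status.
            status = "ok"
        else:
            # Unsuccessful check: attempt a corrective action, record its result.
            status = "healed" if self.corrective_action() else "failed"
        self.node.registry[self.node.name] = status

        # Broadcast the updated registry to each of the other nodes.
        for peer in self.node.peers:
            peer.receive_registry(self.node.registry)
        return status
```

In this sketch, one agent would run per node and call `run_once` on a schedule; the check and the corrective action are passed in as callables so that any probe (for example, a process liveness test paired with a restart) could be substituted.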