Patent attributes
A method for monitoring and managing a cloud-based computing system is provided. The method includes sending a first data stream to a first pod of a first worker node of the cloud-based computing system. First logs, generated by the first pod while processing the first data stream, are received from the first pod of the first worker node. A first age of the first pod is determined. In response to the first age being less than a first age threshold, a first failure chance and a first failure timeline are determined for the first pod based on the first logs. In response to the first failure chance being greater than a first failure threshold, a first report that includes the first failure chance, the first failure timeline, and a first template for the first pod is sent to a primary node of the cloud-based computing system.
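The two-stage gating described above (an age check, then a failure-chance check) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in introduced for illustration: the abstract does not specify how the failure chance or failure timeline is computed, so `toy_predict` is only a placeholder (the share of log lines containing "ERROR", with a fixed timeline), and `FailureReport`, `evaluate_pod`, the units, and the threshold values are likewise assumptions. Sending the report to the primary node is left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class FailureReport:
    """Hypothetical report payload: the three items the abstract lists."""
    failure_chance: float
    failure_timeline_s: float
    pod_template: str


def evaluate_pod(
    logs: Sequence[str],
    pod_age_s: float,
    age_threshold_s: float,
    failure_threshold: float,
    pod_template: str,
    predict: Callable[[Sequence[str]], Tuple[float, float]],
) -> Optional[FailureReport]:
    """Return a report only if both conditions from the abstract hold."""
    # Stage 1: only pods younger than the age threshold are assessed.
    if pod_age_s >= age_threshold_s:
        return None
    # Stage 2: derive failure chance and timeline from the logs.
    chance, timeline_s = predict(logs)
    # A report is produced only when the chance exceeds the threshold.
    if chance <= failure_threshold:
        return None
    return FailureReport(chance, timeline_s, pod_template)


def toy_predict(logs: Sequence[str]) -> Tuple[float, float]:
    """Placeholder predictor (assumption): share of lines containing 'ERROR'."""
    if not logs:
        return 0.0, float("inf")
    errors = sum(1 for line in logs if "ERROR" in line)
    return errors / len(logs), 600.0  # fixed 600 s timeline, purely illustrative


if __name__ == "__main__":
    sample_logs = ["ok", "ERROR disk", "ERROR net", "ok"]
    report = evaluate_pod(sample_logs, 100.0, 3600.0, 0.4, "pod-template-a", toy_predict)
    print(report)
```

In this sketch a pod at or above the age threshold is skipped entirely, and a failure chance exactly equal to the threshold does not trigger a report, matching the "less than" and "greater than" wording of the abstract.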