Information about interactions between microservices is gathered. Failures of two or more microservices are detected. For each microservice failure, a microservice restoration time is determined. An expected total cost of downtime is determined for each microservice. Based on the expected total cost of downtime determined for each microservice, an order in which to restore the microservices is determined.
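The steps above can be illustrated with a minimal sketch. The abstract does not state how the expected total cost is computed or how the order is derived from it, so everything here is an assumption made for illustration: the cost is taken as a per-minute cost rate multiplied by the restoration time, the rate includes the services that call the failed one (taken from the gathered interaction information), and services are restored in descending order of that cost. The names `FailedService`, `expected_downtime_cost`, and `restoration_order`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailedService:
    name: str
    restore_minutes: float  # estimated restoration time (assumed unit: minutes)
    cost_per_minute: float  # assumed cost rate while this service is down


def expected_downtime_cost(svc, dependents, cost_rates):
    """Assumed cost model: (own rate + rates of calling services) * restoration time."""
    rate = svc.cost_per_minute + sum(cost_rates[d] for d in dependents.get(svc.name, []))
    return rate * svc.restore_minutes


def restoration_order(failed, dependents, cost_rates):
    """Return service names sorted by expected total downtime cost, highest first."""
    costs = {s.name: expected_downtime_cost(s, dependents, cost_rates) for s in failed}
    ordered = sorted(failed, key=lambda s: costs[s.name], reverse=True)
    return [s.name for s in ordered], costs


# Hypothetical interaction data: which services call each failed service.
dependents = {"auth": ["checkout", "profile"], "search": []}
# Hypothetical per-minute cost rates for the services that depend on the failed ones.
cost_rates = {"checkout": 50.0, "profile": 5.0}

# Two detected failures with their estimated restoration times.
failed = [
    FailedService("search", restore_minutes=30.0, cost_per_minute=20.0),
    FailedService("auth", restore_minutes=10.0, cost_per_minute=10.0),
]

order, costs = restoration_order(failed, dependents, cost_rates)
print(order, costs)
```

In this sketch, `auth` is restored first because its outage also blocks `checkout` and `profile`, which raises its expected cost above that of `search` despite the shorter restoration time. Other orderings, such as ranking by cost rate per unit of restoration time, would fit the same description equally well.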