Patent attributes
Generally described, one or more aspects of the present application relate to scaling a cluster of compute capacity used to execute containerized applications or tasks. For example, a waiting area can be maintained to store tasks that have been requested to be executed in a cluster but cannot be accommodated because the cluster lacks sufficient compute capacity to execute them. The scaling of the cluster can be performed based on the characteristics of the tasks in the waiting area, such that the cost associated with adding too much compute capacity to the cluster can be reduced, while also reducing the time it takes to reach the desired level of compute capacity that can accommodate all of the requested tasks.
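As a purely illustrative sketch of how the characteristics of waiting tasks might drive a scaling decision, the following Python example estimates how many new instances would be needed to accommodate every task in a waiting area. The names `Task`, `InstanceType`, and `instances_needed`, the use of CPU units and memory as the relevant task characteristics, and the first-fit-decreasing packing heuristic are all assumptions made for this example; the text above does not specify any particular algorithm or data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    """A waiting task (hypothetical model): resources it requests."""
    cpu: int     # CPU units requested
    memory: int  # memory requested, in MiB


@dataclass(frozen=True)
class InstanceType:
    """Capacity offered by one new instance (hypothetical model)."""
    cpu: int
    memory: int


def instances_needed(waiting: List[Task], itype: InstanceType) -> int:
    """Estimate the number of new instances needed for all waiting tasks.

    Uses a first-fit-decreasing heuristic (an assumption of this sketch):
    place the largest tasks first, reusing an already-planned instance
    whenever the task fits, and planning a new instance otherwise.
    """
    free = []  # remaining [cpu, memory] on each planned instance
    ordered = sorted(waiting, key=lambda t: (t.cpu, t.memory), reverse=True)
    for task in ordered:
        if task.cpu > itype.cpu or task.memory > itype.memory:
            raise ValueError("task does not fit on the chosen instance type")
        for slot in free:
            if slot[0] >= task.cpu and slot[1] >= task.memory:
                slot[0] -= task.cpu
                slot[1] -= task.memory
                break
        else:
            # No planned instance has room, so plan one more.
            free.append([itype.cpu - task.cpu, itype.memory - task.memory])
    return len(free)


waiting_area = [
    Task(cpu=1024, memory=2048),
    Task(cpu=512, memory=1024),
    Task(cpu=2048, memory=4096),
    Task(cpu=512, memory=1024),
]
needed = instances_needed(waiting_area, InstanceType(cpu=2048, memory=4096))
```

Sizing the scale-out from the aggregate requirements of the waiting tasks, rather than adding a fixed increment of capacity per scaling step, is one way to avoid both over-provisioning and repeated incremental scaling rounds.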