Patent attributes
Predictive big data capacity planning is described. An example includes instructions for receiving workload data and computing operation data related to workload processing for a customer in a computing infrastructure, the computing infrastructure including one or more clusters, the one or more clusters including one or more data nodes; analyzing the received data to identify relationship information between the workload data and the computing operation data; performing predictive analytics to identify a significant value that relates to performance variations in workload performance or usage pattern characteristics for data growth scale factors in the computing infrastructure; generating a knowledge base based at least in part on the predictive analytics; training a machine learning model based at least in part on the knowledge base; and utilizing the trained machine learning model to generate a computing infrastructure configuration recommendation for the customer.
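The sequence of steps in the abstract (receive data, analyze relationships, run predictive analytics, build a knowledge base, train a model, recommend a configuration) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Observation` fields and their units, the use of Pearson correlation as the relationship measure, the use of a least-squares line as a stand-in for the machine learning model, and the sample figures themselves.

```python
from dataclasses import dataclass
from math import ceil, sqrt


@dataclass
class Observation:
    """One hypothetical sample pairing workload data with operation data."""
    data_tb: float      # workload data volume (assumed unit: terabytes)
    data_nodes: int     # data nodes in the cluster during the run
    runtime_min: float  # observed processing time (assumed unit: minutes)


def pearson(xs, ys):
    """Pearson correlation, used here as a simple relationship measure."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def build_knowledge_base(observations):
    """Relate per-node data load to runtime and record how strongly they move together."""
    load = [o.data_tb / o.data_nodes for o in observations]
    runtime = [o.runtime_min for o in observations]
    return {
        "load_per_node": load,
        "runtime_min": runtime,
        "correlation": pearson(load, runtime),
    }


def train(kb):
    """Stand-in for model training: fit runtime as a line in per-node load."""
    return fit_line(kb["load_per_node"], kb["runtime_min"])


def recommend_nodes(model, projected_tb, target_runtime_min):
    """Smallest node count whose predicted runtime stays within the target."""
    slope, intercept = model
    if slope <= 0 or target_runtime_min <= intercept:
        raise ValueError("model cannot meet the target by adding nodes")
    max_load_per_node = (target_runtime_min - intercept) / slope
    return ceil(projected_tb / max_load_per_node)


# Made-up sample data for one customer; not from the source.
observations = [
    Observation(data_tb=10, data_nodes=5, runtime_min=25),
    Observation(data_tb=20, data_nodes=5, runtime_min=45),
    Observation(data_tb=40, data_nodes=8, runtime_min=55),
    Observation(data_tb=60, data_nodes=10, runtime_min=65),
]

kb = build_knowledge_base(observations)
model = train(kb)
# Recommendation for projected growth to 120 TB with a 50-minute target.
nodes = recommend_nodes(model, projected_tb=120, target_runtime_min=50)
print(nodes)  # 27
```

In this sketch the fitted slope plays the role of the "significant value": it expresses how much runtime changes as the data held per node grows. A real system would likely use richer features and a more capable model.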