Examples described herein provide a computer-implemented method that includes training a machine learning model. The model is trained by generating a set of training queries using at least one of a query workload and relationships between tables in a database, building a query graph for each of the set of training queries, computing, for each training query of the set of training queries, a selectivity based at least in part on the query graph, and building, based at least in part on the set of training queries, an initial join result distribution as a collection of query graphs.