Techniques are provided for machine learning-based query optimization for federated databases. An exemplary method comprises obtaining a query to be processed in a federated database; generating at least one predictive data movement instruction to move data to a target data source when the target data source satisfies one or more of predefined efficiency criteria with respect to a query type of the query and predefined capacity criteria at an expected execution time of the query; and generating a query execution plan for the query by calculating a cost of execution for a plurality of potential target data sources and selecting a target data source for the query based on the calculated cost of execution. The federated database optionally employs a dynamic federated query schema.
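The two generating steps of the exemplary method can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the disclosure: the `DataSource` fields, the thresholds `EFFICIENCY_MIN` and `CAPACITY_MIN`, and the cost formula are assumptions chosen only to make the example concrete. In particular, the predicted free capacity is supplied as a plain number, whereas the described techniques would obtain such a value from a machine learning model.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical description of one data source in the federation."""
    name: str
    efficiency: dict       # query type -> assumed efficiency score in (0, 1]
    free_capacity: float   # assumed predicted free capacity (0..1) at execution time
    has_data: bool         # True if the queried data already resides on this source
    transfer_cost: float   # assumed cost of moving the data to this source


# Assumed thresholds standing in for the "predefined" criteria.
EFFICIENCY_MIN = 0.7
CAPACITY_MIN = 0.3


def movement_instructions(query_type, sources):
    """Name the sources that data should be moved to ahead of the query.

    A source qualifies when it lacks the data and meets one or more of the
    efficiency criterion (for this query type) and the capacity criterion.
    """
    return [
        s.name
        for s in sources
        if not s.has_data
        and (
            s.efficiency.get(query_type, 0.0) >= EFFICIENCY_MIN
            or s.free_capacity >= CAPACITY_MIN
        )
    ]


def execution_cost(query_type, source, base_cost=100.0):
    """Illustrative cost model: compute cost scaled by load, plus any transfer cost."""
    eff = max(source.efficiency.get(query_type, 0.1), 0.01)
    load_penalty = 1.0 + (1.0 - source.free_capacity)
    move = 0.0 if source.has_data else source.transfer_cost
    return (base_cost / eff) * load_penalty + move


def plan_query(query_type, sources):
    """Cost every candidate source and select the cheapest one as the target."""
    costs = {s.name: execution_cost(query_type, s) for s in sources}
    target = min(costs, key=costs.get)
    return {"target": target, "costs": costs}


# Example federation with two hypothetical sources.
sources = [
    DataSource("warehouse", {"aggregate": 0.9}, 0.5, False, 20.0),
    DataSource("oltp", {"aggregate": 0.4}, 0.2, True, 0.0),
]

moves = movement_instructions("aggregate", sources)
plan = plan_query("aggregate", sources)
print(moves, plan["target"])
```

In this example the hypothetical `warehouse` source is selected even though the data must first be moved there, because its higher efficiency for aggregate queries and its lighter predicted load outweigh the assumed transfer cost.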