Techniques are described for configuring and training an ensemble predictor that estimates the click probability of content on search engine results pages. In one aspect, a first-stage machine learning algorithm, such as a neural network, is trained using a first training data set. The output of the trained first-stage algorithm may be coupled to a second-stage machine learning algorithm to form an ensemble predictor. In another aspect, the ensemble predictor is trained using a second training data set, with the output of the first-stage algorithm used to initialize a priori settings of the second-stage algorithm.
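
By way of a non-limiting illustration, the two-stage arrangement might be sketched as follows. The sketch rests on several assumptions not stated above: a plain logistic regression stands in for the first-stage neural network, the second stage is also a logistic regression, and "initializing a priori settings" is interpreted as setting the second stage's starting weights so that, before any second-stage training, the ensemble reproduces the first-stage prediction. All function names, features, and the synthetic click data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, weights, lr=0.1, epochs=50):
    # Stochastic gradient descent on log loss, starting from `weights`.
    w = list(weights)
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w


def stage1_logit(w1, x1):
    # Output of the (already trained) first stage, before the sigmoid.
    return sum(wi * xi for wi, xi in zip(w1, x1))


def ensemble_features(w1, x1, x2):
    # Second-stage input: first-stage output followed by second-stage features.
    return [stage1_logit(w1, x1)] + list(x2)


def initial_stage2_weights(n_extra):
    # Assumed form of the "a priori settings": weight 1 on the first-stage
    # output and 0 elsewhere, so the untrained ensemble equals stage 1.
    return [1.0] + [0.0] * n_extra


def predict(w1, w2, x1, x2):
    z = sum(wi * xi for wi, xi in zip(w2, ensemble_features(w1, x1, x2)))
    return sigmoid(z)


# Synthetic, hypothetical click data: x1 = [bias, a], x2 = [b].
rng = random.Random(0)


def make_example():
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    clicked = 1 if rng.random() < sigmoid(2.0 * a - 1.0 + 1.5 * b) else 0
    return [1.0, a], [b], clicked


first_set = [make_example() for _ in range(300)]
second_set = [make_example() for _ in range(300)]

# Stage 1 is trained on the first training data set only.
w1 = train_logistic(
    [x1 for x1, _, _ in first_set],
    [y for _, _, y in first_set],
    [0.0, 0.0],
)

# The ensemble is trained on the second data set, starting from the
# stage-1-derived initialization rather than from zeros.
w2 = train_logistic(
    [ensemble_features(w1, x1, x2) for x1, x2, _ in second_set],
    [y for _, _, y in second_set],
    initial_stage2_weights(1),
)

print(predict(w1, w2, [1.0, 0.4], [0.8]))
```

Under this interpretation, the second training pass only has to learn a correction to the first-stage estimate. Other readings of "a priori settings", such as the prior mean of a Bayesian second stage, would call for a different second-stage model.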