Patent attributes
Techniques are described herein for generating adaptive recommendations in response to a content request. The system herein detects abrupt changes and leverages the seasonality of a reward function. A collection of contextual models are utilized, each one learning about one of the unique reward stationary states. A short-term memory model is used to detect reward shifts toward stationary periods that have not occurred in the past. In this case, a new base bandit instance is initialized. In order to perform the change point detection, at each step every model gets assigned a score indicating how likely the last observation is to come from a corresponding stationary period represented by a respective model. A model is selected based on the scores. The model provides a recommendation and the system can monitor clickstream data to identify the reward for providing the recommendation.