Patent attributes
Methods and systems are described for providing hierarchical demand forecasting for state space reduction. Using a hierarchical architecture, a base model may be trained to capture shared structure in a first data set, so that inferences can then be drawn from smaller sets of data that are representative of the “whole picture.” For example, training data may be sampled, prepared, and used to train a base model in a first stage. In a next stage, one or more downstream models may be trained on the structure and on the samples of uncensored demand generated by the base model to produce forecasts for items and locations, including items and locations for which there may be little or no historical data. Downstream models that would otherwise require a large amount of training data can thus be generated on demand using less training data, training time, processing power, and memory.
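
The two-stage arrangement described above may be illustrated with a minimal sketch. It is not the claimed method: a pooled mean stands in for the base model, and a simple shrinkage average stands in for a downstream model. The record layout, the rule that a sale equal to stock is censored, the function names, and the `prior_weight` parameter are all assumptions introduced for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (item, location, units_sold, units_in_stock).
# Assumption: a sale equal to the stock level is censored, because the
# true demand may have been higher than what could be sold.
HISTORY = [
    ("apple", "north", 8, 10),
    ("apple", "north", 10, 10),  # stockout: censored
    ("apple", "south", 6, 12),
    ("pear", "north", 4, 9),
    ("pear", "south", 5, 5),     # stockout: censored
]


def train_base_model(history):
    """Stage 1: pool every uncensored observation into one shared estimate."""
    uncensored = [sold for _, _, sold, stock in history if sold < stock]
    return {"pooled_mean": mean(uncensored)}


def sample_uncensored_demand(base, history):
    """Use the base model to lift censored sales to at least the pooled mean."""
    demand = []
    for item, location, sold, stock in history:
        value = max(sold, base["pooled_mean"]) if sold >= stock else sold
        demand.append((item, location, float(value)))
    return demand


def train_downstream_model(base, demand, prior_weight=2.0):
    """Stage 2: per (item, location) mean, shrunk toward the base estimate."""
    groups = defaultdict(list)
    for item, location, value in demand:
        groups[(item, location)].append(value)
    prior = base["pooled_mean"]

    def forecast(item, location):
        # With no history the forecast falls back to the base estimate.
        observed = groups.get((item, location), [])
        return (prior_weight * prior + sum(observed)) / (prior_weight + len(observed))

    return forecast


base = train_base_model(HISTORY)
demand = sample_uncensored_demand(base, HISTORY)
forecast = train_downstream_model(base, demand)

print(forecast("apple", "north"))  # item and location with history
print(forecast("kiwi", "east"))    # item and location with no history
```

In this sketch the downstream model needs only a few observations per item and location because the shared structure is supplied by the base estimate, which is the kind of reduction in training data that the abstract refers to.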