Patent attributes
Density estimation and/or manifold learning are described, for example, for computer vision, medical image analysis, and text document clustering. In various embodiments, a density forest is trained using unlabeled data to estimate the data distribution. In embodiments, the density forest comprises a plurality of random decision trees, each accumulating portions of the training data into clusters at its leaves. In embodiments, probability distributions representing the clusters at the leaves of each tree are aggregated to form a forest density, which is an estimate of the probability density function from which the unlabeled data may have been generated. A mapping engine may use the clusters at the leaves of the density forest to estimate a mapping function which maps the unlabeled data to a lower dimensional space whilst preserving relative distances or other relationships between the unlabeled data points. A sampling engine may use the density forest to randomly sample data from the forest density.
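By way of illustration only, the density estimation and sampling described above might be sketched as follows. This is a minimal sketch under several assumptions that the abstract does not state: the data is one-dimensional, each leaf cluster is represented by a single Gaussian, split thresholds are chosen by an information gain based on Gaussian differential entropy, and each leaf Gaussian is truncated to the leaf's interval so that every tree density integrates to one. All function names and parameters (`build_tree`, `forest_density`, `sample`, `min_leaf`, `n_candidates` and so on) are hypothetical, and the mapping engine is not covered.

```python
import math
import random

def fit_gaussian(xs):
    """Return (mean, std) of xs; the variance floor is an assumed safeguard."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, math.sqrt(max(var, 1e-6))

def gaussian_entropy(xs):
    """Differential entropy of a 1-D Gaussian fitted to xs."""
    _, sd = fit_gaussian(xs)
    return 0.5 * math.log(2.0 * math.pi * math.e * sd * sd)

def build_tree(xs, n_total, rng, lo=-math.inf, hi=math.inf, depth=0,
               max_depth=4, min_leaf=5, n_candidates=10):
    """Grow one random tree over the interval [lo, hi) using unlabeled data."""
    best = None
    if depth < max_depth and len(xs) >= 2 * min_leaf:
        parent_h = gaussian_entropy(xs)
        for _ in range(n_candidates):
            t = rng.choice(xs)  # random candidate threshold
            left = [x for x in xs if x < t]
            right = [x for x in xs if x >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent_h - (len(left) * gaussian_entropy(left)
                               + len(right) * gaussian_entropy(right)) / len(xs)
            if best is None or gain > best[0]:
                best = (gain, t, left, right)
    if best is None:
        # Leaf: store the cluster as a Gaussian plus its share of the data.
        mu, sd = fit_gaussian(xs)
        return {"mu": mu, "sd": sd, "w": len(xs) / n_total, "lo": lo, "hi": hi}
    _, t, left, right = best
    kw = dict(depth=depth + 1, max_depth=max_depth,
              min_leaf=min_leaf, n_candidates=n_candidates)
    return {"t": t,
            "left": build_tree(left, n_total, rng, lo=lo, hi=t, **kw),
            "right": build_tree(right, n_total, rng, lo=t, hi=hi, **kw)}

def train_forest(xs, n_trees=5, seed=0):
    """Train several trees; they differ through their random thresholds."""
    rng = random.Random(seed)
    return [build_tree(xs, len(xs), rng) for _ in range(n_trees)]

def norm_cdf(x, mu, sd):
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def leaf_mass(leaf):
    """Probability mass of the leaf Gaussian inside the leaf interval."""
    mass = (norm_cdf(leaf["hi"], leaf["mu"], leaf["sd"])
            - norm_cdf(leaf["lo"], leaf["mu"], leaf["sd"]))
    return max(mass, 1e-12)  # guard against division by zero

def tree_density(tree, x):
    """Density of one tree at x: leaf weight times truncated leaf Gaussian."""
    node = tree
    while "t" in node:
        node = node["left"] if x < node["t"] else node["right"]
    z = (x - node["mu"]) / node["sd"]
    pdf = math.exp(-0.5 * z * z) / (node["sd"] * math.sqrt(2.0 * math.pi))
    return node["w"] * pdf / leaf_mass(node)

def forest_density(forest, x):
    """Forest density: the average of the individual tree densities."""
    return sum(tree_density(t, x) for t in forest) / len(forest)

def leaves(node):
    if "t" not in node:
        return [node]
    return leaves(node["left"]) + leaves(node["right"])

def sample(forest, rng):
    """Draw one point: pick a tree, pick a leaf by weight, sample its Gaussian."""
    tree = rng.choice(forest)
    ls = leaves(tree)
    leaf = rng.choices(ls, weights=[l["w"] for l in ls])[0]
    while True:  # rejection sampling keeps the draw inside the leaf interval
        x = rng.gauss(leaf["mu"], leaf["sd"])
        if leaf["lo"] <= x < leaf["hi"]:
            return x

# Usage on synthetic unlabeled data drawn from two Gaussian clusters.
rng = random.Random(1)
data = ([rng.gauss(-2.0, 0.5) for _ in range(200)]
        + [rng.gauss(3.0, 1.0) for _ in range(200)])
forest = train_forest(data)
print(forest_density(forest, -2.0), forest_density(forest, 0.5))
print([round(sample(forest, rng), 2) for _ in range(5)])
```

In this sketch the truncation step is what allows each tree density to be normalized in closed form. In higher dimensions the corresponding normalization would generally need to be approximated, which is outside the scope of the example.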