US Patent 8229876 Expediting K-means cluster analysis data mining using subsample elimination preprocessing

The efficiency of data mining clustering techniques is improved by preprocessing a sample of data points taken from a complete data set to provide seeds for the centroid calculations on the complete data set. A uniform sample of data points is selected from a set of multi-dimensional data, and a centroid analysis on that sample determines the seed values for the cluster determination calculation. The number of seeds calculated corresponds to the number of data clusters expected in the set of multi-dimensional data points. Seed values are determined using subsample elimination techniques.
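The abstract does not spell out the subsample elimination procedure, so the Python sketch below is only an illustration of the general idea, not the patented method. It draws a uniform sample, clusters several subsamples of it, scores each candidate set of centroids against the whole sample, and discards all but the best-scoring set, which becomes the seeds. The function names (lloyd, seeds_from_sample), the parameters (sample_size, n_subsamples), and the "keep the lowest-cost candidate" rule are all assumptions introduced here.

```python
import numpy as np

def lloyd(points, centroids, iters=20):
    """Plain Lloyd (K-means) iterations from the given starting centroids.

    Returns the final centroids and the total squared distance (cost).
    """
    centroids = np.array(centroids, dtype=float)  # float copy, safe to update
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.min(axis=1).sum()

def seeds_from_sample(data, k, sample_size=500, n_subsamples=5, rng=None):
    """Hypothetical seed selection: an assumed stand-in for the patent's
    subsample elimination step. Assumes data holds far more than k points.
    """
    rng = np.random.default_rng(rng)
    # Uniform sample of the complete data set, drawn without replacement.
    n = min(sample_size, len(data))
    sample = data[rng.choice(len(data), size=n, replace=False)]

    best, best_cost = None, np.inf
    for _ in range(n_subsamples):
        # Cluster a random half of the sample from k random starting points.
        m = max(k, len(sample) // 2)
        sub = sample[rng.choice(len(sample), size=m, replace=False)]
        init = sub[rng.choice(len(sub), size=k, replace=False)]
        candidate, _ = lloyd(sub, init)
        # Score the candidate on the whole sample; weaker ones are eliminated.
        _, cost = lloyd(sample, candidate, iters=0)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best
```

With k set to the expected number of clusters, the seeds would then start the full calculation, for example lloyd(data, seeds_from_sample(data, k)). The expensive pass over the complete data set begins from centroids that already fit the sample, which is where the claimed saving would come from.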
