US Patent 11983171 Using multiple trained models to reduce data labeling efforts

A method of labeling a dataset includes inputting a testing set comprising a plurality of input data samples into a plurality of pre-trained machine learning models to generate a set of embeddings output by the plurality of pre-trained machine learning models. The method further includes performing an iterative cluster labeling algorithm that includes generating a plurality of clusterings from the set of embeddings, analyzing the plurality of clusterings to identify a target embedding with a highest cluster quality, analyzing the target embedding to determine a compactness for each of the plurality of clusterings of the target embedding, and identifying a target cluster among the plurality of clusterings of the target embedding based on the compactness. The method further includes assigning pseudo-labels to the subset of the plurality of input data samples that are members of the target cluster.
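The iterative cluster labeling loop described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation; it only mirrors the steps named above under several assumptions that do not come from the patent text: a tiny deterministic k-means stands in for the clustering step, "cluster quality" is taken as one minus the ratio of within-cluster spread to overall spread, "compactness" is the mean distance of members to their centroid (lower is tighter), and each round's pseudo-label is simply the round index. The function names and the `k`, `rounds`, and `min_size` parameters are hypothetical.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def compactness(points):
    """Assumed metric: mean distance to the centroid (lower = more compact)."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)

def kmeans(points, k, iters=20):
    """Tiny k-means with farthest-point initialisation; returns cluster ids."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    assign = []
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = centroid(members)
    return assign

def quality(points, assign):
    """Assumed quality score: 1 - (within-cluster spread / overall spread)."""
    groups = {}
    for p, a in zip(points, assign):
        groups.setdefault(a, []).append(p)
    within = sum(compactness(g) * len(g) for g in groups.values()) / len(points)
    total = compactness(points)
    return 1.0 - within / total if total > 0 else 0.0

def iterative_cluster_labeling(embeddings, k=2, rounds=1, min_size=2):
    """embeddings: {model_name: [vector per sample]}, same sample order in each."""
    n = len(next(iter(embeddings.values())))
    remaining = list(range(n))      # indices of samples still unlabeled
    pseudo_labels = {}
    for label in range(rounds):
        if len(remaining) < k:
            break
        # One clustering per embedding; keep the embedding with the best score.
        best = None
        for name, vectors in embeddings.items():
            points = [vectors[i] for i in remaining]
            assign = kmeans(points, k)
            score = quality(points, assign)
            if best is None or score > best[0]:
                best = (score, name, assign)
        _, name, assign = best
        clusters = {}
        for i, a in zip(remaining, assign):
            clusters.setdefault(a, []).append(i)
        # Ignore degenerate clusters, then pick the most compact one.
        candidates = [m for m in clusters.values() if len(m) >= min_size]
        if not candidates:
            break
        target = min(candidates,
                     key=lambda m: compactness([embeddings[name][i] for i in m]))
        for i in target:
            pseudo_labels[i] = label
        remaining = [i for i in remaining if i not in target]
    return pseudo_labels
```

In this sketch the members of the target cluster are removed from the pool after each round, so later rounds may select a different embedding for the remaining samples. A real system would likely use established clustering and scoring routines and would confirm each pseudo-label before trusting it.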

