Patent attributes
A system and method are provided for machine learning (ML) quality assurance. The method trains a plurality of agent ML annotation model software applications. Each agent annotation model is trained with a corresponding subset of annotated raw data images, each including annotation marks forming a boundary surrounding a first shape. A baseline ML annotation model is trained with all the subsets of annotated raw data images. The method accepts an evaluation dataset with unannotated images including the first shape, which is provided to the agent models and the baseline model. In response to the evaluation dataset, the agent and baseline models infer predicted images including annotation marks forming a boundary surrounding the first shape. The predicted images of the baseline model are compared to the predicted images of each agent model to determine agent model quality and to identify problematic raw data images for retraining.
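
The comparison step described above can be illustrated with a minimal sketch. The abstract does not state how predicted images are compared, so everything here is an assumption made for illustration: boundaries are represented as axis-aligned boxes, agreement is measured with intersection over union (IoU), and the names `iou`, `compare_agent_to_baseline`, and the `threshold` parameter are hypothetical. The sketch only flags evaluation images on which an agent model disagrees with the baseline model; how such disagreements are traced back to raw data images is not covered.

```python
from statistics import mean

# Assumed representation of a predicted boundary: (x_min, y_min, x_max, y_max).
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def compare_agent_to_baseline(
    baseline: dict[str, Box],
    agent: dict[str, Box],
    threshold: float = 0.5,  # hypothetical agreement cutoff
) -> tuple[float, list[str]]:
    """Return (mean IoU, image ids where the agent falls below the threshold).

    Only images predicted by both models are scored.
    """
    scores = {
        image_id: iou(baseline[image_id], agent[image_id])
        for image_id in baseline
        if image_id in agent
    }
    flagged = sorted(i for i, score in scores.items() if score < threshold)
    quality = mean(scores.values()) if scores else 0.0
    return quality, flagged


if __name__ == "__main__":
    baseline_predictions = {"img_1": (0, 0, 10, 10), "img_2": (0, 0, 10, 10)}
    agent_predictions = {"img_1": (0, 0, 10, 10), "img_2": (5, 5, 15, 15)}
    quality, flagged = compare_agent_to_baseline(
        baseline_predictions, agent_predictions
    )
    print(f"mean IoU: {quality:.3f}, flagged: {flagged}")
```

Running the comparison once per agent model would give one quality score per agent, in line with the per-agent comparison the abstract describes. IoU is used here only because it is a common agreement measure for boundary annotations; any other boundary comparison could be substituted.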