Patent attributes
A machine learning apparatus according to an embodiment includes: a feature extractor configured to extract features from an object region of an image; a label processor configured to create sentence label embeddings from a sentence label corresponding to the object region; a first training data creator configured to extract first sub-features from a plurality of first sub-regions created by partitioning the object region, add the sentence label embeddings to the extracted first sub-features, and add the resulting first sub-features to the features of the object region; a second training data creator configured to extract a plurality of second sub-regions along a bounding surface of the object region, create an attention matrix from the second sub-regions, and create training data by applying the attention matrix to the features of the object region; and a trainer configured to train an object detection model using the training data.
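The data flow described above can be illustrated with a minimal sketch in plain Python. It is not the claimed implementation, and every name in it is hypothetical. The abstract does not specify the feature extractor, the embedding method, the partitioning scheme, or the form of the attention matrix, so the sketch assumes: per-channel mean and standard deviation as features, a deterministic pseudo-random vector per token as the sentence label embedding, a regular grid for the first sub-regions, four edge strips of the bounding box for the second sub-regions, element-wise addition wherever the abstract says "add", and a row-wise softmax over feature-dimension similarities as the attention matrix. The trainer is omitted.

```python
import math
import random

def extract_features(region):
    """Toy feature extractor (assumption): per-channel mean and std of an H x W x C region."""
    pixels = [px for row in region for px in row]
    channels = len(pixels[0])
    means = [sum(px[c] for px in pixels) / len(pixels) for c in range(channels)]
    stds = [math.sqrt(sum((px[c] - means[c]) ** 2 for px in pixels) / len(pixels))
            for c in range(channels)]
    return means + stds

def embed_sentence_label(label, dim):
    """Toy sentence label embedding (assumption): mean of deterministic per-token vectors."""
    tokens = label.lower().split()
    total = [0.0] * dim
    for token in tokens:
        rng = random.Random(sum(ord(ch) for ch in token))  # seed derived from the token
        for k in range(dim):
            total[k] += rng.gauss(0.0, 1.0)
    return [v / max(len(tokens), 1) for v in total]

def crop(region, top, bottom, left, right):
    """Return the sub-region covering rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in region[top:bottom]]

def first_training_data(region, label, grid=2):
    """Grid sub-features plus label embedding, added element-wise to the region features."""
    region_feat = extract_features(region)
    label_emb = embed_sentence_label(label, len(region_feat))
    h, w = len(region), len(region[0])
    rows = []
    for i in range(grid):
        for j in range(grid):
            sub = crop(region, i * h // grid, (i + 1) * h // grid,
                       j * w // grid, (j + 1) * w // grid)
            sub_feat = extract_features(sub)
            rows.append([r + s + e for r, s, e in zip(region_feat, sub_feat, label_emb)])
    return rows  # grid * grid rows, each of length D

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def second_training_data(region, border=2):
    """Attention matrix built from edge strips, then applied to the region features."""
    region_feat = extract_features(region)
    h, w = len(region), len(region[0])
    strips = [crop(region, 0, border, 0, w), crop(region, h - border, h, 0, w),
              crop(region, 0, h, 0, border), crop(region, 0, h, w - border, w)]
    strip_feats = [extract_features(s) for s in strips]  # 4 x D
    dim = len(region_feat)
    scale = math.sqrt(len(strip_feats))
    # D x D matrix: scaled similarity between feature dimensions, softmax per row.
    attention = [softmax([sum(f[a] * f[b] for f in strip_feats) / scale
                          for b in range(dim)]) for a in range(dim)]
    attended = [sum(attention[a][b] * region_feat[b] for b in range(dim))
                for a in range(dim)]
    return attended, attention
```

For an 8 x 8 RGB object region and the label "a red car", first_training_data returns four feature rows of length six, and second_training_data returns a six-element attended feature vector together with a 6 x 6 attention matrix whose rows each sum to one. In a real system the toy extractor and embedding would be replaced by learned networks, and both outputs would be passed to the trainer of the object detection model.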