US Patent 10592767 Interpretable counting in visual question answering

Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score determines how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 10592767 Interpretable counting in visual question answering

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 10592767 Interpretable counting in visual question answering