Patent attributes
An object detection system may generate, from an input image, regions of interest (ROIs) that can be processed by a wide range of object detectors. According to the techniques described herein, an image is processed by a lightweight neural network (e.g., a heatmap network) that outputs object-center and object-scale heatmaps. The heatmaps are processed to define ROIs that are likely to include objects. Overlapping ROIs are then merged to reduce the aggregate size of the ROIs, and the merged ROIs are downscaled to a reduced set of pre-defined resolutions. Fully-convolutional, high-accuracy object detectors may then operate on the downscaled ROIs to output accurate detections at a fraction of the computational cost. For example, such detectors may operate on a subset of the image (e.g., images cropped based on the ROIs) rather than on the entire image, thereby reducing the computations that would otherwise be performed over the entire image.
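By way of a non-limiting illustration, the post-processing steps described above (defining ROIs from the heatmaps, merging overlapping ROIs, and selecting a pre-defined resolution) may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the 0.5 confidence threshold, the stride of 32, the interpretation of the scale heatmap as an object side length in pixels, the union-box merge rule, and the preset resolutions (32, 64, 128) are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

# (x0, y0, x1, y1) in image pixels; x1 and y1 are exclusive.
Box = Tuple[int, int, int, int]


def rois_from_heatmaps(center: Sequence[Sequence[float]],
                       scale: Sequence[Sequence[float]],
                       threshold: float = 0.5,
                       stride: int = 32) -> List[Box]:
    """Emit a square ROI around each heatmap cell whose center score passes the threshold.

    Assumption: scale[r][c] holds the expected object side length in pixels.
    """
    rois = []
    for r, row in enumerate(center):
        for c, score in enumerate(row):
            if score >= threshold:
                half = scale[r][c] / 2
                cx, cy = c * stride, r * stride  # heatmap cell -> image pixel
                rois.append((int(cx - half), int(cy - half),
                             int(cx + half), int(cy + half)))
    return rois


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share a region of non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_overlapping(rois: Sequence[Box]) -> List[Box]:
    """Replace overlapping boxes by their bounding union until none overlap.

    Assumption: a simple union-box rule; other merge criteria are possible.
    """
    merged = list(rois)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlaps(merged[i], merged[j]):
                    a, b = merged[i], merged[j]
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


def pick_resolution(roi: Box, presets: Sequence[int] = (32, 64, 128)) -> int:
    """Pick the largest preset side that does not exceed the ROI's longer side.

    Assumption: ROIs are only ever downscaled; an ROI smaller than every
    preset falls back to the smallest preset.
    """
    side = max(roi[2] - roi[0], roi[3] - roi[1])
    fitting = [p for p in sorted(presets) if p <= side]
    return fitting[-1] if fitting else min(presets)


# Hypothetical 4x4 heatmaps for a 128x128 image (stride 32).
center = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
]
scale = [[40] * 4 for _ in range(4)]

rois = rois_from_heatmaps(center, scale)
merged = merge_overlapping(rois)
resolutions = [pick_resolution(roi) for roi in merged]
```

In this example, the two neighboring high-confidence cells yield overlapping ROIs that are merged into a single box, while the third ROI remains separate. A downstream detector would then run only on the two resized crops rather than on the full image.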