An electronic device and method capture multiple images of a scene of real world at a several zoom levels, the scene of real world containing text of one or more sizes. Then the electronic device and method extract from each of the multiple images, one or more text regions, followed by analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of the multiple images. When an attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, the version of the first text region is provided as input to OCR.