Patent attributes
The present invention relates generally to the field of video-camera systems, such as video conferencing systems, and more particularly to video camera targeting systems that locate and acquire targets using an input characterizing a target and a machine-classification system to assist in target acquisition in response to that input. In some embodiments, the characterization and classification are employed together with one or more inputs of other modalities, such as gesture control. In one example of the system in operation, an operator makes a pointing gesture toward an object and simultaneously speaks a sentence identifying the object being pointed to. At least one term of the sentence is presumed to be associated with a machine-sensible characteristic by which the object can be identified. The system captures and processes the voice and gesture inputs and re-positions a pan-tilt-zoom (PTZ) video camera to focus on the object that best matches both the characteristics and the gesture. Thus, the PTZ camera is aimed based upon the inputs the system receives and the system's ability to locate the target with its sensors.
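The selection step described above, choosing the object that best matches both the spoken characteristics and the pointing gesture, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the names `Candidate`, `select_target`, and `pan_tilt_for` are hypothetical, the camera is assumed to sit at the origin of a frame with x to the right, y up, and z forward, the gesture is assumed to arrive as a ray (origin and direction), and the two modalities are combined by multiplying a gesture score by a characteristic score so that a candidate must agree with both inputs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A detected object (hypothetical structure, for illustration only)."""
    name: str
    position: tuple        # (x, y, z) in a camera-centred frame, arbitrary units
    attributes: frozenset  # machine-sensible characteristics, e.g. {"red", "mug"}


def angle_between(u, v):
    """Angle in radians between two 3-D vectors; pi if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return math.pi
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def select_target(candidates, spoken_terms, origin, direction,
                  max_angle=math.radians(30)):
    """Return the candidate that best matches both speech and gesture, or None.

    Assumed scoring: the gesture score falls linearly from 1 (on the pointing
    ray) to 0 (at max_angle off the ray); the characteristic score is the
    fraction of spoken terms found in the candidate's attributes; the two
    scores are multiplied.
    """
    best, best_score = None, 0.0
    for cand in candidates:
        to_object = tuple(p - o for p, o in zip(cand.position, origin))
        angle = angle_between(direction, to_object)
        gesture_score = max(0.0, 1.0 - angle / max_angle)
        if spoken_terms:
            attr_score = len(spoken_terms & cand.attributes) / len(spoken_terms)
        else:
            attr_score = 0.0
        score = gesture_score * attr_score
        if score > best_score:
            best, best_score = cand, score
    return best


def pan_tilt_for(position):
    """Pan and tilt angles in degrees that aim a camera at the origin at position."""
    x, y, z = position
    pan = math.degrees(math.atan2(x, z))
    tilt = math.degrees(math.atan2(y, math.hypot(x, z)))
    return pan, tilt


# Usage: two mugs, operator at the camera origin pointing toward the red one.
candidates = [
    Candidate("red mug", (1.0, 0.0, 1.0), frozenset({"red", "mug"})),
    Candidate("blue mug", (-1.0, 0.0, 1.0), frozenset({"blue", "mug"})),
]
target = select_target(candidates, frozenset({"red", "mug"}),
                       origin=(0.0, 0.0, 0.0), direction=(1.0, 0.0, 1.0))
pan, tilt = pan_tilt_for(target.position)
```

In this sketch the multiplication means that a candidate far from the pointing ray, or one sharing no spoken term, scores zero and is never selected. A weighted sum would be an equally plausible choice where one modality should be able to compensate for a weak reading from the other.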