Example implementations include a method, apparatus, and computer-readable medium for controlling a camera. The method includes receiving a video sequence of a scene. The method includes determining one or more items of scene description metadata for the scene from the video sequence. The method includes identifying one or more scene object types in the scene based on the scene description metadata. The method includes determining one or more rules based on one or both of the scene description metadata and the scene object types, each rule configured to generate an event based on a detected object following a rule-specific pattern of behavior. The method includes applying the one or more rules to operation of the camera.
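The flow from scene description metadata to object types to rules to events can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes the scene description metadata has already been extracted from the video sequence as simple string tags, and every name in it (`Detection`, `Rule`, `OBJECT_TYPES_BY_SCENE`, the rule names, and the 60-second dwell threshold) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Detection:
    """A detected object with a few illustrative (hypothetical) attributes."""
    object_type: str
    dwell_seconds: float
    in_restricted_zone: bool


@dataclass(frozen=True)
class Rule:
    """A rule that fires an event when a detection matches its behavior pattern."""
    name: str
    object_type: str
    matches: Callable[[Detection], bool]


# Hypothetical lookup from a scene metadata tag to the object types expected there.
OBJECT_TYPES_BY_SCENE: Dict[str, Set[str]] = {
    "parking_lot": {"vehicle", "person"},
    "entrance": {"person"},
}


def identify_object_types(scene_metadata: List[str]) -> Set[str]:
    """Identify scene object types from the scene description metadata."""
    types: Set[str] = set()
    for tag in scene_metadata:
        types |= OBJECT_TYPES_BY_SCENE.get(tag, set())
    return types


def determine_rules(scene_metadata: List[str], object_types: Set[str]) -> List[Rule]:
    """Select rules using the metadata, the object types, or both."""
    rules: List[Rule] = []
    if "person" in object_types:
        # Assumed pattern: a person staying in view longer than 60 seconds.
        rules.append(Rule("loitering", "person", lambda d: d.dwell_seconds > 60.0))
    if "vehicle" in object_types and "parking_lot" in scene_metadata:
        # Assumed pattern: a vehicle located inside a restricted zone.
        rules.append(
            Rule("vehicle_in_restricted_zone", "vehicle", lambda d: d.in_restricted_zone)
        )
    return rules


def apply_rules(rules: List[Rule], detections: List[Detection]) -> List[str]:
    """Return the name of each rule triggered, in detection order."""
    events: List[str] = []
    for detection in detections:
        for rule in rules:
            if rule.object_type == detection.object_type and rule.matches(detection):
                events.append(rule.name)
    return events


scene_metadata = ["parking_lot"]  # assumed output of the metadata-determination step
object_types = identify_object_types(scene_metadata)
rules = determine_rules(scene_metadata, object_types)
detections = [
    Detection("person", 90.0, False),
    Detection("vehicle", 5.0, True),
    Detection("person", 10.0, False),
]
events = apply_rules(rules, detections)
print(events)
```

In this sketch the rule set depends on both inputs: the loitering rule is selected from the object types alone, while the restricted-zone rule requires a particular metadata tag as well as a matching object type.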