In one embodiment, a method includes receiving video and activity data for a video conference at a network device, automatically processing the video at the network device based on the activity data, and transmitting the edited video from the network device. Processing comprises identifying active locations in the video and editing the video so that each active location is displayed before activity starts at that location and so that the video switches between the active locations. An apparatus and logic are also disclosed herein.
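
The editing step can be illustrated with a minimal sketch. It assumes the activity data arrives as a list of (location, start, end) intervals in seconds and that a fixed lead-in is enough to show a location before its activity begins. The names `Activity`, `Cut`, `build_edit_list`, and `lead_in` are hypothetical and chosen for illustration only; the source does not specify a data format or an algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    location: str  # hypothetical label for an active location in the conference
    start: float   # seconds from the start of the recording
    end: float


@dataclass(frozen=True)
class Cut:
    location: str
    show_from: float
    show_until: float


def build_edit_list(activities, lead_in=1.0):
    """Build cuts that show each active location `lead_in` seconds before
    its activity starts and that switch between locations.

    Assumption: a fixed lead-in is used; the source does not say how far
    ahead of the activity a location should be displayed.
    """
    cuts = []
    for act in sorted(activities, key=lambda a: a.start):
        show_from = max(0.0, act.start - lead_in)
        if cuts:
            prev = cuts[-1]
            if act.location == prev.location:
                # Same location is already on screen: extend the current cut.
                cuts[-1] = Cut(prev.location, prev.show_from,
                               max(prev.show_until, act.end))
                continue
            # Never switch earlier than the moment the previous cut began.
            show_from = max(show_from, prev.show_from)
            # Hold the previous location on screen until the switch point.
            cuts[-1] = Cut(prev.location, prev.show_from, show_from)
        cuts.append(Cut(act.location, show_from, act.end))
    return cuts


if __name__ == "__main__":
    demo = [
        Activity("room_a", 0.0, 10.0),
        Activity("room_b", 12.0, 20.0),
        Activity("room_a", 19.5, 30.0),
    ]
    for cut in build_edit_list(demo, lead_in=1.0):
        print(cut)
```

In this sketch, a switch to the next location is placed `lead_in` seconds ahead of that location's activity, and the previous location stays on screen until the switch. A real implementation would also need to handle simultaneous activity at several locations, which the sketch resolves simply by ordering on start time.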