A virtual collaboration system receives input video data including a participant. The system analyzes the input video data to identify a gesture or a movement made by the participant. The system selects an overlay image as a function of the gesture or the movement made by the participant, incorporates the overlay image into the input video data, thereby generating output video data that includes the overlay image, and transmits the output video data to one or more participant devices.
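
The per-frame flow described above (analyze, select, incorporate, transmit) might be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: frames are modeled as small grayscale grids, and the names `Overlay`, `OVERLAYS`, `select_overlay`, `incorporate`, and `process_frame`, along with the gesture labels, are all hypothetical. The gesture detector and the transport to participant devices are passed in as callables because the abstract does not specify either.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumption: a frame is a grid of grayscale pixel values (rows of ints).
Frame = List[List[int]]


@dataclass
class Overlay:
    """Hypothetical overlay image plus the position at which it is drawn."""
    pixels: Frame
    top: int
    left: int


# Hypothetical lookup table: gesture label -> overlay image.
OVERLAYS: Dict[str, Overlay] = {
    "thumbs_up": Overlay(pixels=[[255, 255], [255, 255]], top=0, left=0),
    "wave": Overlay(pixels=[[128]], top=1, left=1),
}


def select_overlay(gesture: Optional[str]) -> Optional[Overlay]:
    """Select an overlay as a function of the detected gesture, if any."""
    if gesture is None:
        return None
    return OVERLAYS.get(gesture)


def incorporate(frame: Frame, overlay: Overlay) -> Frame:
    """Return a copy of the frame with the overlay pixels written into it."""
    out = [row[:] for row in frame]  # leave the input frame untouched
    for dy, row in enumerate(overlay.pixels):
        for dx, value in enumerate(row):
            y, x = overlay.top + dy, overlay.left + dx
            # Skip overlay pixels that fall outside the frame.
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = value
    return out


def process_frame(
    frame: Frame,
    detect_gesture: Callable[[Frame], Optional[str]],
    send: Callable[[Frame], None],
) -> Frame:
    """Analyze one input frame, add an overlay if warranted, and transmit it."""
    gesture = detect_gesture(frame)      # analyze the input video data
    overlay = select_overlay(gesture)    # choose overlay from the gesture
    output = incorporate(frame, overlay) if overlay is not None else frame
    send(output)                         # transmit to participant devices
    return output
```

As a usage example, calling `process_frame(frame, lambda f: "thumbs_up", sent.append)` with a 3x3 frame of zeros and an empty list `sent` stands in for a real detector and a real network transport. In practice the detector would likely be a pose or hand-tracking model and `send` would hand the frame to a video encoder, but both choices are outside what the abstract states.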