A computer-interfaced camera system identifies and tracks groups of socially interrelated people. The system can be used, for example, to track people as they wait in a checkout line or at a service counter. In a preferred implementation, each recorded camera frame is segmented into foreground regions, each of which may contain several people. The foreground regions are further segmented into individuals using temporal segmentation analysis. Once an individual person is detected, an appearance model based on color and edge density is used, in conjunction with a mean-shift tracker, to recover the person's trajectory. Groups of people are then determined by analyzing inter-person distances over time.
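The final step, determining groups from inter-person distances over time, can be illustrated with a minimal sketch. It is not the method claimed here, only one plausible reading: two people are linked when their tracked positions stay within a distance threshold for a minimum number of shared frames, and groups are the connected sets of linked people. The function name `group_people`, the parameters `max_dist` and `min_frames`, and the trajectory layout are all assumptions made for illustration.

```python
from itertools import combinations
from math import hypot


def group_people(tracks, max_dist=1.0, min_frames=3):
    """Cluster people whose trajectories stay close over time.

    tracks: {person_id: {frame_index: (x, y)}} (hypothetical layout).
    max_dist, min_frames: illustrative thresholds, not values from the source.
    Returns a sorted list of groups, each a sorted list of person ids.
    """
    # Union-find forest: every person starts in their own group.
    parent = {pid: pid for pid in tracks}

    def find(p):
        # Follow parent links to the root, halving the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(tracks, 2):
        # Only frames in which both people were tracked can be compared.
        shared = tracks[a].keys() & tracks[b].keys()
        close = sum(
            1
            for f in shared
            if hypot(tracks[a][f][0] - tracks[b][f][0],
                     tracks[a][f][1] - tracks[b][f][1]) <= max_dist
        )
        # Link the pair if they were close in enough shared frames.
        if close >= min_frames:
            parent[find(a)] = find(b)

    groups = {}
    for pid in tracks:
        groups.setdefault(find(pid), set()).add(pid)
    return sorted(sorted(g) for g in groups.values())
```

For example, two shoppers who move along a checkout line side by side would be merged into one group, while a third person standing elsewhere in the store would remain alone. Counting close frames rather than testing a single frame keeps people who merely pass each other from being grouped.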