A computer vision system for tracking multiple people in relatively unconstrained environments is described. Tracking is performed at three levels of abstraction: regions, people, and groups. A novel, adaptive background subtraction method that combines color and gradient information is used to cope with shadows and unreliable color cues. People are tracked through mutual occlusions as they form groups and separate from one another. Strong use is made of color information to disambiguate occlusion and to provide qualitative estimates of depth ordering and position during occlusion. Simple interactions with objects can also be detected. The system is tested using both indoor and outdoor sequences. It is robust and should provide a useful mechanism for bootstrapping and reinitialization of tracking using more specific but less robust human models.
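As an illustration of how color and gradient cues might be combined in an adaptive background model, the following is a minimal sketch and not the method used in the system described here. It assumes a per-pixel running-mean model of brightness-normalized chromaticity and of gradient magnitude, and flags a pixel as foreground when either cue departs from the model. The names `BackgroundModel`, `alpha`, `chroma_thresh`, and `grad_thresh`, and the threshold values, are hypothetical choices made for the example.

```python
import math


def chromaticity(pixel):
    """Return brightness-normalized (r, g) for an (R, G, B) pixel."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # assumption: treat black as neutral
    return (r / total, g / total)


def gradient_magnitudes(frame):
    """Central-difference gradient magnitude of the grayscale image."""
    gray = [[sum(p) / 3.0 for p in row] for row in frame]
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Border pixels reuse their own value as the missing neighbor.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


class BackgroundModel:
    """Hypothetical running-mean background model (illustrative only)."""

    def __init__(self, first_frame, alpha=0.05,
                 chroma_thresh=0.05, grad_thresh=30.0):
        self.alpha = alpha                  # adaptation rate (assumed value)
        self.chroma_thresh = chroma_thresh  # assumed threshold
        self.grad_thresh = grad_thresh      # assumed threshold
        self.chroma = [[chromaticity(p) for p in row] for row in first_frame]
        self.grad = gradient_magnitudes(first_frame)

    def apply(self, frame):
        """Return a boolean foreground mask and adapt background pixels."""
        grad = gradient_magnitudes(frame)
        a = self.alpha
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, pixel in enumerate(row):
                c = chromaticity(pixel)
                bc = self.chroma[y][x]
                chroma_diff = math.hypot(c[0] - bc[0], c[1] - bc[1])
                grad_diff = abs(grad[y][x] - self.grad[y][x])
                # Foreground if either cue disagrees with the model.
                fg = (chroma_diff > self.chroma_thresh
                      or grad_diff > self.grad_thresh)
                if not fg:
                    # Adapt the model only where the pixel looks like background.
                    self.chroma[y][x] = ((1 - a) * bc[0] + a * c[0],
                                         (1 - a) * bc[1] + a * c[1])
                    self.grad[y][x] = (1 - a) * self.grad[y][x] + a * grad[y][x]
                mask_row.append(fg)
            mask.append(mask_row)
        return mask
```

In this simplified setting, a uniform shadow darkens a pixel without changing its chromaticity or the local gradient, so it is not flagged, whereas a differently colored object is. Where an object shares the background's chromaticity, only the gradient cue can respond, and it does so mainly along the object's edges.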