This chapter describes the most common features and definitions drawn from the sociological sciences that are used to detect and track groups of interacting people. Reliable algorithms for these tasks are attracting increasing interest, especially in security and video surveillance, where answering the question “who is present in a scene, and with whom are they interacting?” is of the utmost importance. Other domains also need robust solutions to these problems, for example activity recognition, social robotics, and automatic behavior analysis. The success of detection and tracking algorithms depends on how the features are engineered. In this context, the sociological literature provides a set of well-established assumptions and constraints that help build more reliable and plausible features and detection algorithms. In this chapter we describe existing features in two categories: low-level features, which capture the spatial properties of each person in a scene (person position and head/body orientation), and high-level features, which aggregate or build on the low-level features to implement sociological and biological definitions (frustum of visual attention). We then show how these features are used by popular group detection methods, such as game-theoretic and probabilistic approaches. Finally, we analyze a tracking model that can be integrated with these features and the described detection methods. The experimental part provides a comprehensive comparison of the performance of different group detection and tracking algorithms on standard, publicly available benchmarks.