Video data of human head poses and facial expressions, collected indoors across diverse daily and work environments (e.g., office, meeting room, home, dormitory, corridor). Each participant records one video, framed at approximately head-and-shoulder size. The recorded content consists of head movements (up, down, left, right) and mouth actions (open, close), combined into various pose–action sequences. Lighting conditions cover common scenarios such as normal light, low light, and backlight; in all of them, facial details remain clearly visible.
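
The description does not specify how the pose–action sequences are annotated or ordered. As a minimal sketch, assuming each step of a sequence pairs one head movement with one mouth action, the possible pairs can be enumerated as follows. The names `HEAD_MOVEMENTS`, `MOUTH_ACTIONS`, and `pose_action_pairs` are hypothetical and not part of the dataset.

```python
from itertools import product

# Label sets taken from the description; the identifiers are illustrative only.
HEAD_MOVEMENTS = ["up", "down", "left", "right"]
MOUTH_ACTIONS = ["open", "close"]


def pose_action_pairs():
    """Return every (head movement, mouth action) pair as a list of tuples."""
    return list(product(HEAD_MOVEMENTS, MOUTH_ACTIONS))


pairs = pose_action_pairs()
print(len(pairs))  # 8 pairs: 4 head movements x 2 mouth actions
```

A recorded sequence could then be represented as an ordered list of such pairs, although the actual labeling format may differ.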