Behavior representation based on motion features models behavior using human motion information such as optical flow, target motion trajectories, and spatio-temporal feature points.
Optical flow is the most widely used motion feature. Its advantage is that it reflects the direction, speed, and acceleration of a moving object, and computing it requires no assumption about background motion. The figure below shows how the magnitude of optical flow within a subdivision mesh changes during a certain behavior. The main disadvantages of optical flow are that it tends to be computationally heavy and is susceptible to noise.
Obtaining a target motion trajectory requires reliable target tracking, but tracking human targets in thermal infrared images is very difficult: first, because of the weak quality of the images themselves, and second, because of the randomness of human motion and the diversity of behaviors and scenes. Analyzing trajectory characteristics is also often very complicated and requires additional prior knowledge.
By comparison, spatio-temporal feature points such as 3D Harris corners, the 3D scale-invariant feature transform (3D SIFT), and cube gradient descriptors have attracted more attention from researchers because they are more practical. Their advantage is that extraction is insensitive to factors such as changes in lighting conditions and alignment deviations in the target sequence. Their disadvantage is that the extracted feature points correspond only to local body regions with pronounced motion changes and cannot fully reflect the appearance information of human body movement. Recognition performance therefore suffers to some extent, and the computational cost is relatively high.
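The mesh-wise optical flow magnitude described above can be illustrated with a minimal sketch. It assumes a dense flow field is already available as a grid of per-pixel (dx, dy) displacements. The function name `grid_flow_magnitude`, the even split of the frame into cells, and the use of the mean magnitude per cell are illustrative assumptions, not the method used in the source.

```python
import math

def grid_flow_magnitude(flow, rows, cols):
    """Mean optical-flow magnitude in each cell of a rows x cols mesh.

    flow: H x W nested sequence of (dx, dy) displacements (assumed input;
    how the flow itself is estimated is outside this sketch).
    """
    h = len(flow)
    w = len(flow[0]) if h else 0
    if not (1 <= rows <= h and 1 <= cols <= w):
        raise ValueError("mesh must be no finer than the flow field")

    # Integer cell boundaries; cells differ by at most one pixel in size
    # when H or W is not divisible by the mesh dimensions.
    row_edges = [i * h // rows for i in range(rows + 1)]
    col_edges = [j * w // cols for j in range(cols + 1)]

    grid = []
    for i in range(rows):
        row = []
        for j in range(cols):
            total, count = 0.0, 0
            for y in range(row_edges[i], row_edges[i + 1]):
                for x in range(col_edges[j], col_edges[j + 1]):
                    dx, dy = flow[y][x]
                    total += math.hypot(dx, dy)  # per-pixel flow magnitude
                    count += 1
            row.append(total / count)
        grid.append(row)
    return grid

# Synthetic 4 x 6 field: the left half moves by (3, 4), the right half is static.
flow = [[(3.0, 4.0)] * 3 + [(0.0, 0.0)] * 3 for _ in range(4)]
print(grid_flow_magnitude(flow, 2, 2))  # [[5.0, 0.0], [5.0, 0.0]]
```

Applying such a function to every frame of a sequence gives one magnitude value per mesh cell per frame, which is the kind of time-varying quantity the figure refers to. Averaging within a cell also dampens isolated noisy flow vectors to some degree, at the cost of spatial detail.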