This paper describes one-shot-learning gesture recognition systems developed on the ChaLearn Gesture Dataset. We use RGB and depth images and combine appearance descriptors (Histograms of Oriented Gradients) with motion descriptors (Histograms of Optical Flow) for parallel temporal segmentation and recognition. The Quadratic-Chi distance family is used to measure differences between histograms and to capture cross-bin relationships. We also propose a new algorithm for trimming videos, which removes all unimportant frames. Both of our methods outperform other published methods and help narrow the gap between human performance and algorithms on this task.
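To illustrate how a Quadratic-Chi distance captures cross-bin relationships, here is a minimal sketch in plain Python that follows the usual definition of the Quadratic-Chi family. The function name `quadratic_chi`, the bin-similarity matrix `a`, and the default exponent `m=0.5` are assumptions made for illustration; the abstract does not state which similarity matrix or exponent the authors use.

```python
import math

def quadratic_chi(p, q, a, m=0.5):
    """Quadratic-Chi distance between two nonnegative histograms p and q.

    a: nonnegative, symmetric bin-similarity matrix (list of lists);
       a[i][j] says how similar bin i is to bin j (assumed, not from the paper).
    m: normalization exponent in [0, 1) (0.5 is an illustrative default).
    """
    n = len(p)
    # Normalizer for bin i: (sum over c of (p[c] + q[c]) * a[c][i]) ** m
    z = [sum((p[c] + q[c]) * a[c][i] for c in range(n)) ** m for i in range(n)]
    # Normalized bin differences, using the convention 0/0 = 0.
    d = [(p[i] - q[i]) / z[i] if z[i] != 0 else 0.0 for i in range(n)]
    # Quadratic form over the similarity matrix couples different bins.
    total = sum(d[i] * a[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(total, 0.0))

# Two histograms whose mass sits in different bins.
p, q = [1.0, 0.0], [0.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]   # bins treated as unrelated
all_ones = [[1.0, 1.0], [1.0, 1.0]]   # bins treated as fully similar

d_bin_to_bin = quadratic_chi(p, q, identity)
d_cross_bin = quadratic_chi(p, q, all_ones)
```

With the identity matrix the distance reduces to a bin-to-bin chi-squared-like measure, so the two histograms look far apart; with a matrix that marks the two bins as fully similar, the same histograms are at distance zero. Intermediate similarity values give intermediate distances, which is the cross-bin behavior the abstract refers to.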
- Changes to previous version:
Initial Announcement on mloss.org.