COMMENTS
SUMMARY
The paper aims to (1) present an easy-to-implement gesture recognition algorithm, intended especially for UI prototypers; (2) empirically compare it to more advanced algorithms; and (3) give insight into which user interface gestures are best.
The algorithm was built with several guidelines in mind. For example, it must be resilient to variations in sampling and rotation, and it should require no advanced mathematics. The authors describe the algorithm in four steps.
First, the gesture's points are resampled into a fixed number of equally spaced points, which makes the data independent of the sampling rate of any particular hardware. Second, the points are rotated to align them with the template gesture. A brute-force search could try all possible rotations and keep the one with the best alignment, but the authors claim that rotating the gesture so that its indicative angle (the angle between the gesture's centroid and its first point) is at zero gives a close approximation of the best alignment. Third, the gesture is scaled to a reference square and translated to a reference point.
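The three preprocessing steps can be sketched in Python as follows. This is a minimal illustration, not the authors' reference code: the function names, the default point count `n=64`, the square size `size=250.0`, and the guard against zero-width strokes are all assumptions made for the example.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def resample(points, n=64):
    """Step 1: resample a stroke to n points spaced equally along its path."""
    total = sum(math.dist(points[i - 1], points[i]) for i in range(1, len(points)))
    interval = total / (n - 1)
    pts = list(points)
    out = [pts[0]]
    acc = 0.0
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q starts the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:  # rounding can leave the result one point short
        out.append(pts[-1])
    return out[:n]

def rotate_to_zero(points):
    """Step 2: rotate about the centroid so the indicative angle becomes zero."""
    cx, cy = centroid(points)
    theta = math.atan2(cy - points[0][1], cx - points[0][0])
    c, s = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in points]

def scale_and_translate(points, size=250.0):
    """Step 3: scale to a size x size square, then move the centroid to (0, 0)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    w = (max(xs) - min(xs)) or 1.0  # assumed guard for degenerate strokes
    h = (max(ys) - min(ys)) or 1.0
    scaled = [(x * size / w, y * size / h) for x, y in points]
    cx, cy = centroid(scaled)
    return [(x - cx, y - cy) for x, y in scaled]

def normalize(points, n=64, size=250.0):
    return scale_and_translate(rotate_to_zero(resample(points, n)), size)
```

Scaling is non-uniform (width and height are stretched independently), which is why the aspect ratio of the original stroke is lost after this step.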
Step four performs the actual recognition. The candidate gesture is compared to each stored template by computing the average distance between corresponding points (the path distance). The template with the smallest path distance is the recognition result, and that minimum distance is then converted into a score. One limitation of the algorithm is that it cannot distinguish gestures whose identities depend on specific orientations, aspect ratios, or locations.
The authors conducted an evaluation using 4800 gestures collected from 10 subjects. The comparison was made with two popular recognizers: the Rubine classifier and Dynamic Time Warping (DTW). DTW and $1 were found to be very accurate; Rubine was comparatively less successful.
In the future, the authors plan to study how easy their algorithm is to program. Further empirical analysis may help in making better algorithmic choices.
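The matching step can be sketched as below, assuming the candidate and every template have already been normalized to the same number of points. The names `path_distance` and `recognize` and the dictionary of templates are illustrative. The score divides the distance by half the diagonal of the reference square, so a perfect match scores 1. The sketch omits any fine-tuning search over rotation angles.

```python
import math

def path_distance(a, b):
    """Average Euclidean distance between corresponding points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(candidate, templates, size=250.0):
    """Return (best template name, score), where 1.0 is a perfect match."""
    best_name, best_d = None, math.inf
    for name, tpl in templates.items():
        d = path_distance(candidate, tpl)
        if d < best_d:
            best_name, best_d = name, d
    half_diagonal = 0.5 * math.sqrt(2.0) * size
    return best_name, 1.0 - best_d / half_diagonal
```

For example, with `templates = {"line": [(0, 0), (1, 0)], "diag": [(0, 0), (1, 1)]}` and `size=1.0`, the candidate `[(0, 0), (1, 0.1)]` is closer to `"line"` and is returned with a score just below 1.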
DISCUSSION
The obvious advantage of this algorithm is how simply it can be implemented, without any advanced mathematics. It uses a simple classifier based on the average distance between spatial coordinates. This simplicity might be a disadvantage too. As the authors admit, it fails to differentiate on the basis of features like aspect ratio. Rotation invariance, which is discussed as an advantage, could also prove to be a disadvantage: the system might not be able to differentiate between UP and DOWN arrows. One possible remedy is to put a threshold on how much rotation may be applied during alignment, or to add a penalty that grows with the amount of rotation required to align.
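A rough sketch of that remedy is given below. It is not part of the paper: it assumes the indicative angles of the candidate and the template are recorded before rotation, and the names `angle_diff`, `penalized_distance`, `max_rotation`, and `weight` are hypothetical, as are their default values.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def penalized_distance(path_dist, cand_angle, tpl_angle,
                       max_rotation=math.pi / 4, weight=0.0):
    """Reject matches needing too much rotation; otherwise add a linear penalty."""
    rotation = angle_diff(cand_angle, tpl_angle)
    if rotation > max_rotation:
        return math.inf  # hard threshold: treat as no match
    return path_dist + weight * rotation
```

With this in place, an UP arrow and a DOWN arrow, whose indicative angles differ by about pi, would no longer match each other, while small rotations would only raise the distance slightly.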
