CPSC 689 Sketch Recognition Fall '08

Monday, November 17, 2008

Fluid Sketches: Continuous Recognition and Morphing of Simple Hand Drawn Shapes

-Arvo, Novins

COMMENT

SUMMARY

In this paper a new sketching interface is described in which in which raw geometrical strokes are continuously morphed into ideal strokes. The recognition is performed by using least square fits or recognition. The main aim of the authors is to provide immediate and useful feedback. The authors claim that such type of feedback allows users to be sloppier and still gets their sketches recognised correctly.

The authors formulated a family of differential equations that determine how a user drawn shape changes with time due to continuous recognition and morphing. This parametric ODE is referred to as the fluid sketching equation. The shape of the morphing curve may be influenced by many factors in the equation. At each step the algorithm finds the best match within a family of shapes. Different techniques are used for matching to different shapes. For example to fit the curve to a circle a linear least squares method is used, and a relaxation technique is used for fitting the curve to a rectangle.

The system was tested with users who were are graduate students. Each user was asked to reproduce a previously drawn sketch using conventional sketch system and the fluid interface system. The users favoured the conventional system to draw accurately placed shapes but favoured the fluid sketching interface strongly when they had to draw approximately placed shapes. Users also felt the need to have editing features which are currently absent in the system.

DISCUSSION

The main contribution of this paper is the evaluation of an immediate feedback sketch system. Several advantages and shortcomings of the fluid sketching system are studied. One of the problems with the evaluation is that the system was studied on just two types of recognizable shapes - the box and the circle. It remains to be seen how the system would behave with multiple shapes in the domain. System might have to match the drawn stroke to each shape in the domain as each pixel of ink is drawn. This might be computationally expensive.

Sketch Recognition User Interfaces: Guidelines for Design and Development
-Alvardo

COMMENT

SUMMARY

In this paper the author presents a sketch recognition based tool for creating PowerPoint diagrams. The author has also performed evaluation of the prototype using several techniques and established design guidelines for creating SkRUIs. The author also evaluates the utility of several techniques used in iterative design of traditional user interfaces, for development of SkRUIs.

The prototype of their application supports drawing naturally on a separate window. These diagrams are then recognised by the SketchREAD recognizer and the recognized diagrams are then imported to the PowerPoint slides. The recognition is done only when the user completes sketching or when the focus is shifted from the sketching window. The system is not capable of determining automatically whether the user has finished sketching and relies on explicit feedback from the user. The system also supports a number of editing features like move and delete. It was found that providing an explicit modal switch between edit and ink gestures confused the users who often forgot to switch modes. Therefore an online edit mode was developed. The user hovered the pen over the drawn ink diagram and a subsequent cursor change indicated that the system was in edit mode. During formative evaluation users expressed the desire to add annotation symbols without them being recognised. A combo box was provided to indicate whether the recognition was ON.

The system was evaluated by testing it on users who were all graduate students. The users were asked to perform three prototypical diagram creation tasks. After completion of these tasks feedback was taken, based on which several design guidelines were established.

(1) Recognition results should be displayed only after sketching is done; (2) Provide explicit indication between free sketching and recognition. (3) Muti-domains should only be used when system is robust enough; (4)Pen based editing should be used. Sketching and editing should have clearly distinguishable gestures. (5)Large buttons should be used for pen based interface (6)Pen should respond in real time.

DISCUSSION

The paper largely deals with UI issues in Sketch based interfaces. The paper established several important hueristics for SkRUIs. It also points out many shorcomings of using traditional iterative design techniques. As achnowledged by the author modal switches are prone to confuse user. Recognized and un-recognized modes confused the users in the study. An automatic method should be developed to identify modes on the basis of implicit cues, as was done in the case of differentiating between editing and drawing gestures.

Interactive Learning of Structural Shape Descriptions from Automatically Generated Near-miss Examples
-Hammond, Davis

COMMENT

SUMMARY

Structural shape descriptions either provided explicitly by the user or generated automatically by the computer are often over or under constrained. This paper describes a method to debug over and under constrained shapes in LADDER descriptions using a novel active learning technique that generates its own near missed example shapes.

LADDER based systems require the domain designer to provide shape descriptions. An intuitive way to provide description would be the draw and have the computer understand automatically. However these descriptions are often imperfect because of the inability of the computer to understand the intent of the user. The authors developed a visual debugger, that first asks the system to draw a positive example. After this the system generates near-miss examples (one additional or one less constrain) to be classified by the user as negative and positive. One the basis of user classification it removes unintended constraints and adds required constraints. But for this the system first needs to generate near miss examples. An initial set of true constraints is captured. This list is kept small and relevant using a set of heuristics. Each time a positive classification is encountered, the system removes from the list any constraint that is not true of the system. For under-constrained figures we determine a set of constraints which are not in the description. We add the negation of each constraint one by one. If the user gives a negative classification, the constraint is added. Thus the shape description is incrementally perfected.

DISCUSSION

Describing shapes by drawing them is very important from an HCI perspective. This paper provides a method fro enabling users to do this accurately. I was concerned about the size of the initial list of constraints until the authors describe a way to prune this list to include only the relevant ones.

The system also omits disjunctive constraints. A complex shape could easily consist of a Boolean combination of constraints rather then being described by individual constraints. For example two shapes which are mirror images of each other and are laterally asymmetric might need disjunctive constraints to describe them.

Purely from a UI perspective, would it be better to provide a group of shapes (say 10-15) to be classified by the user at once, rather then presenting it one by one.

Tuesday, September 23, 2008

MergeCF

-Wolin, Paulson, Hammond

COMMENTS

SUMMARY

MergeCF uses curvature and speed data to find an initial set of corners. It then eliminates false positives by removing similar corners, merging like stroke segments together, and examining stroke segment's direction values.
After producing an initial fit, that algorithm first checks for corners that are together in close proximity and removes the corner with smallest curvature. It then tends to remove the extraneous points that overfit the stroke. It assumes that the corners surrounding the smallest stroke are likely to be false positives. The smallest segment is found, and it is checked if it can be merged with any of its neighboring segments. It is then merged with the neighboring segment that has the least primitive error when combining the two segments.
MergeCF was tested on 501 unistroke symbols. It had an 'all or nothing' accuracy of 66.7 percent which outperforms the Sezgin and Kim algorithms.

DISCUSSION

The algorithm is a big improvement over the current benchmark algorithms. However, I think there might be better ways to remove false positives. I am using a different approach to remove false positives and it seems to work rather accurately. The exact accuracy has not been calculated yet.

Monday, September 22, 2008

Early Processing for Sketch Understanding

-Sezgin, Stahovich

COMMENTS

SUMMARY

The algorithm described in this paper uses both speed and curvature data to detect corners in a stroke. The underlying idea of using speed is that at corners stroke speed reaches a local minima. Thus corners are typically located where curvature reaches a maximum and speed reaches a minimum. The author uses a technique called average based filtering to eliminate any false positives. Only those extrema are considered where speed and curvature data lie beyond a threshold. False positives are also tackled with hybrid generation scheme. Hybrid generation occurs in three steps - computing vertex certainties, generating a set of hybrid fits and selecting the best fit. The initial fit is the intersection of the corners generated by speed and curvature data. The error is computed as an average of sum of squares of the distances to the fit from each point in the stroke. Additional fits are generated by adding highest scoring (least error) curvature and speed candiates not already in the fit. The algorithm can also be used to produce a fit for curved portion of the stroke. If the ratio of total path distance and euclidean distance between two points is significantly higher than one, that indicates a curved stroke segemnt between these points. Curves are approximated using Bezier curves using two end points and two control points. The approximation and identification of the shapes was correct 96% of the time when tested with ten figures.

DISCUSSION

The algorithm serves as benchmark beacuse it first introduced the idea of using speed data to find corners. The use of Bezier curve approximation makes it very powerful, since it can handle both linear and curved segments of the stroke. However, the accuracy (which ignores false positives) might be a point of contention.

Algorithms for reduction of the number of points required to represent a stroke

-Douglas, Peucker

COMMENTS

SUMMARY

Most digitizers record far more points than are required to represent basic strokes. This paper concerns with reducing this number. In the past some algorithms which do this concentrate on deleting points, whereas others select points. The methods proposed in this paper belong to the latter category. A point between two end points is selected if it is at a greater perpendicular distance than a threshold, otherwise the two end-points are considered to be enough to represent the stroke between them. This idea was implemented in two different ways and tested. The first point is defined as the anchor and the last point as the floater. The intervening points along the stroke are examined to find the one with greatest perpendicular distance from the line formed by the anchor and floater. If this distance is less than a threshold then only the end-points are selected and the rest of the points are discarded. Otherwise, the point having the maximum distance from the line is made the new floater. This cycle continues and once the minimum distance requirement is met, the anchor is moved to floater and the last point on the stroke is re-assigned as the floating point. The second method is different only in one way-the points that have been floaters are stacked in a vector. When the anchor moves to the floater, the new floater is then selected from the top of this stack. Thus, this avoids re-examining all points between the floater and the end point. The algorithm was tested with many data sets and was found to be suitable for simple reduction and abstractions.

DISCUSSION

The algorithm is very simple and I believe can be implemented with a simple recursive procedure. Since the algorithm uses an 'selection of points' approach instead of deletion, so its very efficient in case of strokes which are more abstract and can be represented with few number of feature points.

Saturday, September 20, 2008

Short Straw

-Aaron Wolin, Tracy Hammond

COMMENTS

SUMMARY

ShortStraw is a simple and accurate polyline corner finder. The first step is to resample the points, with the interspacing distance of the resampled points being the diagonal length divided by a constant factor. The 'straw' at point i is then computed as the euclidean distance between points (i-3) and (i+3). Since straw lengths would decrease when the stroke bends around a corner, the point where minima is reached is returned as the corner. After this some higher level processing is done to remove false positives and find missed corners. First every consecutive set of corners goes through a line test. If two points fails line test, more corners are assumed to be lying between them. Threshold is then relaxed and minima of straw length is calculated from among all the points in the middle half of the stroke segment. Thus a new corner is found. This process is repeated until all the consecutive corners pass line tests. A collinear check is then run and if any three consecutive corners are found to be collinear, the middle corner is removed. The algorithm was tested and compared with other benchmark corner finding algorithms. It was by far the most accurate, having an 'all or nothing' accuracy of 74.1 percent.

DISCUSSION

Short Straw algorithms is a perfect example of how complexity doesn't necessarily correlate with accuracy. One limitation compared to other algorithms is that, it restricts itself to polyline strokes. I am just guessing, could we do a second derivative test, to differentiate curves from corners. If we could then we could use this algorithm to work for curved figures as well.