Wednesday, December 15, 2010

Tahuti: A Geometrical Sketch Recognition System for UML Class Diagrams

Comment on
Youyou
Summary:
This paper introduces a multi-layer recognition system. Because it is based on the geometrical properties of the strokes in a multi-stroke object, it does not require users to draw that object in one specific way.
The multi-layer framework of this system includes four stages: (1) Preprocessing: each stroke is recognized as a basic shape immediately after it is drawn. (2) Selection: strokes are combined in different ways, and each combination is checked to see whether it could be a recognizable object or an editing command. (3) Recognition: combinations of strokes are tested to see whether they correspond to certain viewable objects or editing commands. (4) Identification: the system gives the final interpretation of the multi-stroke object.
The paper then introduces the algorithms for recognizing rectangles, ellipses, arrows, deletion, movement and text.
The results show that users preferred Tahuti to a paint program and to Rational Rose.
Discussion:
The system works well for editing UML diagrams. Most importantly, it does not restrict the way people draw in order to recognize multi-stroke objects. The fact that UML diagrams are largely composed of lines makes the recognition less difficult.

Reading #29: Scratch Input: Creating Large, Inexpensive, Unpowered and Mobile Finger Input Surfaces

Comments:
Wenzhe
Summary:
This paper aims to develop an acoustic-based gesture recognizer, which relies on the different features of the sound produced when drawing different shapes. Because of its relatively small size, the system is easy to build into a mobile device, and it is therefore suitable for many surfaces such as walls, tables or clothes.
The system extracts peak counts and amplitude from the sound of each gesture and uses a decision tree to make the recognition decision.
The tests show that the system reaches an accuracy of 89% when used on 6 gestures.
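The feature step described above can be sketched roughly as follows, assuming the gesture sound is already available as a list of amplitude samples. The threshold, the definition of a "peak", the function names and the hand-written rules are illustrative assumptions, not the values or the trained tree from the paper.

```python
def extract_features(samples, threshold=0.5):
    """Return (peak_count, max_amplitude) for one gesture recording.

    Assumption: a "peak" is a contiguous run of samples whose absolute
    value is at or above `threshold`.
    """
    peak_count = 0
    in_peak = False
    max_amplitude = 0.0
    for s in samples:
        magnitude = abs(s)
        max_amplitude = max(max_amplitude, magnitude)
        if magnitude >= threshold:
            if not in_peak:
                # A new loud run starts here.
                peak_count += 1
                in_peak = True
        else:
            in_peak = False
    return peak_count, max_amplitude


def classify(peak_count, max_amplitude):
    """Hand-written stand-in for a learned decision tree (hypothetical rules)."""
    if peak_count <= 1:
        return "tap" if max_amplitude >= 0.8 else "single swipe"
    if peak_count == 2:
        return "double swipe"
    return "multi-segment gesture"
```

For example, `classify(*extract_features(recording))` returns a label for one recording. A real system would learn the tree from labeled recordings rather than hard-code the splits.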
Discussion:
This paper is related to one of the topics of project 3. Using sound is a creative idea. Its difficulty is that we need to deal with the noise caused by the surrounding environment and by the action of drawing itself. Two teams in our class showed some interesting results from similar experiments.

Reading #28: iCanDraw? – Using Sketch Recognition and Corrective Feedback to Assist a User in Drawing Human Faces

Comments:
Wenzhe
Summary:
This paper introduces a system that helps novices draw people's faces. It does so by giving directions and feedback to users to help them imitate the original image. The system first uses face recognition to extract features from the face image. It then uses sketch recognition to compare the sketch with the underlying image model. Based on the development cycle and user studies, the paper proposes 9 design principles for offering useful instructions to users. User studies show that the system, with its step-by-step instructions, helps inexperienced users draw faces.
Discussion:
It is a good example of human-computer interface design, which shows the importance of the interaction between users and the system. This paper also focuses on the convenience of using the system by dealing with the questions of "when to show the information" and "what information to show." In this respect, it reminds me of the paper "Those Look Similar".

Reading #27 K-Sketch: A 'Kinetic' Sketch Pad for Novice Animators

Comment on:
JJ
Summary
K-Sketch is a general-purpose, informal, 2D animation system that helps novice users create animations easily. The authors first interview some experienced animators to find out how an informal tool could help their work. Then they interview non-animators to find out what they need when they want to create their own animations.
The authors define 18 animation operations. From these 18 operations, they choose 9 in order to find a balance between reaction time and the power of the system. Experiments show that K-Sketch outperforms PowerPoint.

Discussion:
It is a great paper that shows the full process of developing a useful system: first, researching the needs of the target users; next, developing a system that fulfills those requirements with a balance between reaction time and number of functions; finally, running experiments to measure its performance or compare it with that of other related systems.

Reading #26: Picturephone: A Game for Sketch Data Capture

Comment on
JJ
Summary:
This paper introduces Picturephone, a sketch data collection game inspired by the children's game Telephone. Three users are needed to play the game. The system defines three modes and randomly assigns one to each user: Draw mode asks a user to draw a sketch according to a description, Describe mode asks a user to describe a sketch, and Rate mode asks a user to evaluate each pair of sketches.
While the game is played, sketch data is collected for researchers.
Picturephone is an asynchronous system, which means it does not require the users to play the game simultaneously.

Discussion:
Making the work of collecting data interesting is a very clever idea. For many participants, the process of data collection is very boring. Compared with their system, the data collection process of our final project is boring, and afterwards participants normally feel tired and do not want to do it again. We need to learn from this for our next system.

Reading #25 A Descriptor for Large Scale Image Retrieval Based on Sketched Feature Lines

Comment on
Wenzhe
Summary:
This paper introduces a method to retrieve images based on a sketch that the user draws to imitate the image he or she wants to find. The system uses a tensor-based descriptor to search a large collection of images for those that are similar to the sketch. To do this, the system solves the problem of asymmetry between a binary sketch input and a full-color image. The paper compares the method with the MPEG-7 edge histogram descriptor and finds that it performs slightly better.

Discussion:
I really like this paper's idea of how to search for an image in a huge database. Searching by just drawing a sketch is a very natural approach. I have seen some online image search engines where the input is an image and the search result is a set of images similar to it. Both approaches are very interesting. However, different users have different drawing styles when describing an image. To make this system more robust, we need to think about this problem.

Reading #24 Games for Sketch Data Collection

Summary:

The paper introduces two systems for collecting sketch data for research purposes: Picturephone and Stellasketch. The difference between them is that Picturephone is an asynchronous game while Stellasketch is a synchronous one.

Picturephone collects long sentences that describe sketches. The system defines three modes and randomly assigns one to each user. Draw mode asks a user to draw a sketch according to a description. Describe mode asks a user to describe a sketch. Rate mode asks a user to evaluate each pair of sketches.

Stellasketch is synchronous since it gathers short noun phrases as they are made. Two users play together without seeing each other's work. One draws a sketch based on the nouns and the other describes the drawn sketch with noun phrases.

Discussion:
This paper makes data collection more interesting for participants, which helps in collecting more appropriate data. It is a good piece of HCI work. However, we still need to deal with the noise in the data and figure out how to use this data in practice.

Reading #23 InkSeine: In Situ Search for Active Note Taking

Summary:
InkSeine is a pen-based active note-taking system. It encourages users to engage in active note taking. It has four important design properties: it leverages preexisting ink to initiate a search; it provides tight coupling of search queries with application content; it persists search queries as first-class objects that can be commingled with ink notes; and it enables a quick and flexible workflow where the user may freely interleave inking, searching, and gathering content.
Discussion:
This system improves note taking on pen tablets. It is better than other similar systems because it supports more functions and is more user-friendly. However, the paper does not say much about accuracy. If the recognition is wrong, the system will not be very helpful even with this good idea and interface.

Tuesday, December 14, 2010

Reading #22 Plushie: An Interactive Design System for Plush Toys

Summary:
This paper introduces an interactive system that helps users design complex 3D plush toys. The user only needs to draw the silhouette of the 3D shape. Some gestures are also supported to facilitate editing. The system provides real-time feedback so that users can see the initial result of their design. The experiment shows that the system is welcomed by both professional designers and novice users.

Discussion:
This system does a good job of interacting with users and helping them create shapes that would be hard for them without it. Like Reading #21, it is a system that converts 2D shapes to 3D, but it has a different application. The algorithm used in this system is not very complex, which is why it is quick and supports real-time interaction with users.

Reading #21 Teddy: A Sketching Interface for 3D Freeform Design

Summary:
This paper introduces a sketching interface named Teddy. The system automatically converts the 2D strokes drawn by the user into a 3D polygonal surface. The process is really convenient for users: the user study shows that creating models is not hard even for first-time users. A standard polygonal mesh is used to represent the 3D model. The experiment shows that the system is robust and can finish the task in a short time.
Discussion:
It is interesting that converting 2D to 3D can be supported. It is also an efficient way to work with 3D models if we only need to draw 2D shapes, which is easier. It is very useful that users can see the result while they draw the 2D shape, which allows them to modify it quickly.

Reading #20 MathPad2

Summary:


This paper introduces a mathematics problem-solving system that is pen-based and modeless. The system can recognize the equations that users write by hand and solve them by converting them into MATLAB. The paper uses a gesture-based method to help recognize mathematical expressions. Some useful gestures are developed to support editing and commands. Nailing diagram components and grouping diagram components are used to make a diagram. Two kinds of association are included: explicit association and implicit association.

Discussion:
Implementing this system requires character recognition, which is not easy, especially when the character set is huge. If its accuracy is good enough, it will be really helpful and a natural way to use MATLAB's powerful computational ability.

Reading #19 Diagram Structure Recognition by Bayesian Conditional Random Fields

Summary:
In this paper, a recognizer is built on Bayesian conditional random fields. The recognizer does not analyze each segment individually, but jointly analyzes all elements together. The paper introduces the principles of Bayesian conditional random fields and discusses how to incorporate Automatic Relevance Determination into them. It also introduces an application to ink classification. The results show that the accuracy is good when using this paper's method.

Discussion:
The paper shows the importance of context information in sketch recognition for achieving high accuracy.
It uses Bayesian theory to incorporate this context information into the recognition result. However, it is not easy to use Bayesian conditional random fields in sketch recognition.

Reading #18 Spatial Recognition and Grouping of Text and Graphics

Summary:

This paper introduces a method to recognize and group text and graphics. It uses only spatial information to do this work. Basically, it is a search method for finding the best grouping. First, the authors construct a neighborhood graph that connects nearby vertices. They also restrict the size of each subset to be <= k. Since the size of the search space is combinatorial in the number of vertices, A* is used to prune away less likely branches. A recognizer is then built using boosted decision trees.
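To make the size of the search space concrete, here is a rough sketch of the first two steps only: building a neighborhood graph and listing every connected subset of at most k vertices as a candidate group. It assumes each stroke is reduced to a single (x, y) point and that "close" means within a fixed radius; both are simplifications of mine, and the A* search and the boosted recognizer are not shown.

```python
from itertools import combinations
from math import dist


def neighborhood_graph(points, radius):
    """Connect every pair of points that lie within `radius` of each other."""
    graph = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= radius:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_subsets(graph, k):
    """Enumerate all connected vertex subsets of size <= k (candidate groups)."""
    found = set()
    frontier = {frozenset([v]) for v in graph}
    while frontier:
        found |= frontier
        next_frontier = set()
        for subset in frontier:
            if len(subset) == k:
                continue  # size limit reached, do not grow further
            for v in subset:
                for n in graph[v]:
                    if n not in subset:
                        # Growing along an edge keeps the subset connected.
                        next_frontier.add(subset | {n})
        frontier = next_frontier - found
    return found
```

Even with the size limit, the number of candidates grows quickly on dense graphs, which is why a guided search such as A* is needed to avoid scoring all of them.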

Discussion:

The idea of using a graph is similar to the graph-based recognizer. Also, choosing the heuristic function is important for the A* search method.

Reading #17 Distinguishing Text From Graphics in On-line Handwritten Ink

Comments on:

Summary:

This paper is about distinguishing text from graphics. It proposes three methods: (1) The first treats each stroke separately. It extracts features from the stroke and uses a feed-forward neural network to learn the weight of each input. (2) The second adds temporal context to improve the performance of the first method. The intuition is that users usually draw or write multiple strokes in succession, so a stroke is more likely to be followed by a stroke of the same class. An HMM can be constructed to represent the sequence of strokes. (3) The third uses gap features to train the neural network from the first method and combines it with a bipartite HMM that builds on the second method.
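The temporal-context idea in the second method can be illustrated with a small Viterbi decoder over a two-state chain. This is only a sketch under my own assumptions: the per-stroke probabilities stand in for the neural network output, and the `stay` probability and uniform prior are made-up values, not parameters from the paper.

```python
import math


def smooth_labels(p_text, stay=0.8):
    """Viterbi decoding over a two-state (text/graphics) chain.

    p_text[i] is an isolated-stroke estimate of P(text) for stroke i.
    `stay` is the assumed probability that the next stroke has the same
    class as the current one. Expects at least one stroke.
    """
    states = ("text", "graphics")

    def emit(state, p):
        p = min(max(p, 1e-9), 1 - 1e-9)  # avoid log(0)
        return math.log(p if state == "text" else 1 - p)

    def trans(a, b):
        return math.log(stay if a == b else 1 - stay)

    # best[s] is the best log score of any labeling that ends in state s.
    best = {s: math.log(0.5) + emit(s, p_text[0]) for s in states}
    back = []
    for p in p_text[1:]:
        pointers, new_best = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[r] + trans(r, s))
            pointers[s] = prev
            new_best[s] = best[prev] + trans(prev, s) + emit(s, p)
        back.append(pointers)
        best = new_best

    # Trace back from the best final state.
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With inputs such as `[0.9, 0.8, 0.45, 0.9]`, the third stroke would be labeled graphics on its own, but the sequence model keeps it as text because its neighbors are confidently text.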


Discussion:

This way of distinguishing text from shapes is more complex than the one that uses entropy. I think entropy or curvature is more intuitive. However, using the context information is a good idea. I hope to see more ways of solving this kind of problem, since right now I am satisfied with none of them.

Reading #16 Graph-Based Symbol Recognizer

Comment on:
Summary
This paper introduces a graph-based symbol recognizer. Basically, it is still a template matching method; however, the template is not the stroke itself but its attributed relational graph (ARG).
The recognizer treats each symbol as an ARG that represents the symbol’s geometry and topology. For example, an ideal square can be represented as below

This paper uses six error metrics to generate the dissimilarity score between candidates and templates: Primitive Count Error, Primitive Type Error, Relative Length Error, Number of Intersections Error, Intersection Angle Error and Intersection Location Error. The recognizer then uses four different matching methods to calculate the similarity between graphs: Stochastic Matching, Error-Driven Matching, Greedy Matching and Sort Matching.
The results show that all four graph matching methods achieve more than 92% accuracy, with different recognition speeds.
Discussion:
The critical problem for a graph-based recognizer is how to construct an accurate ARG. But the idea of generating an ARG for a symbol is really interesting, since it can make the recognizer invariant to rotation and scaling. The comparison between different algorithms is also useful when choosing an algorithm for different situations.

Reading #15 An Image-Based Trainable Symbol Recognizer

Comments on:

Summary
In this paper, the authors aim to develop a trainable symbol recognizer. Since the recognizer needs only one template to do the template matching, it allows users to train the system easily by adding, removing or overwriting templates.
Four methods are used for template matching: Hausdorff Distance, Modified Hausdorff Distance, Tanimoto Similarity Coefficient and Yule Coefficient. The recognizer then combines these four methods with the help of parallelization and normalization.
To deal with rotation, the authors transform the symbol from Cartesian coordinates into polar coordinates. During this process, the recognizer discards the definitions that are obviously dissimilar to the candidate.
The paper conducts four tests, which differ in the number of definitions and in whether they are user-independent. The accuracy is above 90%.
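Three of the four measures named above can be sketched on plain point sets. The versions below follow the commonly used textbook definitions (the modified Hausdorff distance averages nearest-neighbor distances instead of taking the maximum). How the paper rasterizes symbols, handles outliers and normalizes the scores is not covered here, so treat this as an illustration rather than the recognizer itself.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def directed_mean(a, b):
    """Average distance from each point in `a` to its nearest point in `b`."""
    return sum(min(dist(p, q) for q in b) for p in a) / len(a)


def modified_hausdorff(a, b):
    """Mean-based variant, less sensitive to a single stray point."""
    return max(directed_mean(a, b), directed_mean(b, a))


def tanimoto(a_pixels, b_pixels):
    """Shared pixels divided by the size of the union (1.0 means identical)."""
    a, b = set(a_pixels), set(b_pixels)
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)
```

The two distances are dissimilarity measures (smaller is better) while the Tanimoto coefficient is a similarity measure (larger is better), which is one reason the scores have to be normalized before they can be combined.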
Discussion:
This paper combines four different methods to evaluate the similarity between templates and candidates. By doing so, the recognizer achieves a satisfactory accuracy rate. The method of dealing with rotation is different from that of $1: $1 rotates the candidate, while here the templates are rotated to match the candidate.

Sunday, December 12, 2010

Reading #12 Constellation Models for Sketch Recognition


Summary
This paper uses a constellation model to capture the structure of a particular class of objects. The algorithm tries to assign labels to strokes according to the computed likelihood.
The constellation model in this paper classifies each label as mandatory or optional. Individual features are computed for each part, but pairwise features are computed only for mandatory parts to reduce calculation. The likelihood of a labeling is computed, and a maximum likelihood search is used to find the best labeling for all strokes. The authors also use multi-pass thresholding and hard constraints to avoid spending too much time on the maximum likelihood search.
Discussion
As the authors say, their system has a big limitation: each stroke is required to have a label. This requirement is similar to gesture recognition that asks users to draw a gesture using only one stroke. In addition, the paper does not report its recognition accuracy and uses too little test data, which makes its conclusion not very persuasive. Even so, using constellation models in sketch recognition is a different idea that may be helpful in recognizing certain types of shapes.

Reading #13 Ink Features for Diagram Recognition


Summary
This paper aims to analyze the importance of many ink features and uses them to distinguish text from shapes. It first reviews some techniques used in sketch recognition: bottom-up approaches such as Rubine’s and template matching, top-down approaches, and combinations of the two.
Then the paper selects forty-six features and evaluates their significance based on formal statistical analysis. The authors use statistical partitioning techniques to generate a decision tree as below

The paper then uses these features to build a text/shape divider and compares it with two other dividers. The results show an improvement in classifying both shapes and text.
Discussion
Distinguishing text from shapes is very important in many situations. In addition, it is very interesting. This paper identifies the 8 most useful features for distinguishing text from shapes. However, in Dr. Hammond's paper, only the entropy feature is used and the result is also good. The divider introduced in this paper, which defines some thresholds, is similar to what I did in project two.

Reading #10 Graphical Input Through Machine Recognition of Sketches


Comments on:
Summary:
This paper introduces three experiments in computer processing of sketches and discusses how the computer can help in the design phase rather than just in computer-aided evaluation.
The first experiment is HUNCH, a set of programs dealing with freehand drawings with some inference of the user’s intent. HUNCH finds corners by assuming that speed decreases at corners. It “latches” endpoints that fall within a fixed radius of each other and turns overtraced lines into one. HUNCH also uses some other inferences, such as turning 2D networks into 3D structures.
The second and third experiments use information about context to develop context-based programs and interact with users by allowing them to modify the interpretation.
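The speed assumption behind HUNCH's corner finding can be illustrated with a few lines of code. This is a sketch of the general idea, not HUNCH's actual procedure: the (x, y, t) input format, the local-minimum rule and the `fraction` threshold are all assumptions chosen for the example.

```python
from math import dist


def slow_segments(points, fraction=0.25):
    """Return indices of segments where the pen slows down sharply.

    `points` is a list of (x, y, t) samples. Segment i joins points[i]
    and points[i + 1]. A segment is flagged when its speed is a local
    minimum and falls below `fraction` of the average speed.
    """
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        speeds.append(dist((x0, y0), (x1, y1)) / dt if dt > 0 else 0.0)
    if not speeds:
        return []
    mean_speed = sum(speeds) / len(speeds)
    corners = []
    for i in range(1, len(speeds) - 1):
        # Strict on the left so only the first segment of a slow plateau is kept.
        is_local_min = speeds[i] < speeds[i - 1] and speeds[i] <= speeds[i + 1]
        if is_local_min and speeds[i] < fraction * mean_speed:
            corners.append(i)
    return corners
```

A stroke that moves quickly to the right, hesitates, and then moves quickly upward produces one flagged segment at the hesitation, which is where the corner most likely is.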
Discussion:
The corner-finding method in this paper is similar to Sezgin’s method of using a speed graph, but it was introduced nearly 30 years ago. The paper discusses how to involve the computer in the design process and suggests ways for the computer to interact with users.
Some great ideas in this paper certainly inspired much recent work in sketch recognition, even though the paper does not give many details.

Friday, December 10, 2010

Reading #11 Ladder

Comment on:

Kim 


Summary:
By combining primitives and assigning constraints to them, Ladder allows users to describe higher-level symbols. Users only need to write a description of the domain in which the system is being used. This description includes how shapes are drawn, displayed and edited in that domain. Descriptions are limited to shapes with a fixed graphical grammar, composed using the primitive constraints already in Ladder. Ladder favors shapes with high regularity and few details.

The shape definition includes components, constraints, aliases, editing behavior and display methods. From these parts, the description can be translated into shape recognizers.


Discussion:
It is very interesting to generate recognizers for different domains just by defining the description and constraints of each domain. However, users need to familiarize themselves with this language, and the limited number of primitives may constrain them.

Reading #9 PaleoSketch: Accurate Primitive Sketch Recognition

Comment on:
Summary:
PaleoSketch is a low-level (primitive) recognizer that can handle 8 basic shapes (line, polyline, circle, ellipse, arc, curve, spiral and helix) with recognition accuracy above 98%. In addition, it does not ask users to always draw in a certain way, which leaves them more freedom.
The recognizer first performs pre-recognition to prepare for the later shape tests. This includes removing consecutive duplicate points, generating some graphs and values, computing NDDE and DCR (which are useful for distinguishing polylines from arcs) and removing tails. The stroke is also tested for being overtraced and for being closed.
Then PaleoSketch runs a test for each basic shape. These tests are based on the features of the shapes and some manually defined thresholds. A complex test is also used to deal with strokes that are better described as a combination of more than one primitive. The recognizer uses a ranking algorithm to order all the interpretations and chooses the best fit among them.
Discussion:
PaleoSketch analyzes the characteristics of each shape and computes some useful features to evaluate them, and its results are important for high-level recognizers. For each stroke, it runs all the tests and then finds the best fit among them. This could be time-consuming. If a test could itself produce absolute confidence that the stroke is that kind of shape, maybe the remaining tests could be skipped.

Thursday, December 9, 2010

Reading #8 $N: A Multistroke Recognizer


Comments:
Summary:
$N is an extension of $1 from the same author. The $N algorithm addresses many limitations of $1. Below are some new features of $N:
(1)   It can recognize multistroke gestures. The algorithm automatically generates all possible combinations of the strokes, which is very convenient for users.
(2)   $N can distinguish gestures that differ only in their orientation. This is done by using bounded rotation invariance instead of full rotation invariance.
(3)   It can distinguish 1D gestures such as lines. $N first differentiates 1D from 2D gestures by computing the ratio of the sides of their oriented bounding boxes. This ratio is lower for a 1D gesture than for a 2D one.
(4)   $N also uses start angles to decrease the number of comparisons. In addition, it speeds up the algorithm by comparing only against templates with the same number of strokes.
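Point (1) can be sketched as follows: every ordering of the strokes is combined with every choice of drawing direction, and each result is joined into a single stroke. The function name and the list-of-points representation are my own choices for illustration; resampling, scaling and the actual matching step are left out.

```python
from itertools import permutations, product


def unistroke_permutations(strokes):
    """Yield every single-stroke version of a multistroke gesture.

    `strokes` is a list of strokes, each a list of (x, y) points. Each
    ordering of the strokes is combined with each choice of drawing
    direction, giving n! * 2**n candidates for n strokes.
    """
    for order in permutations(strokes):
        for flips in product((False, True), repeat=len(order)):
            unistroke = []
            for stroke, flip in zip(order, flips):
                # A flipped stroke is traversed from its last point to its first.
                unistroke.extend(reversed(stroke) if flip else stroke)
            yield unistroke
```

Two strokes give 8 candidates and three strokes give 48, so the count grows very quickly with the number of strokes.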
Discussion:
$N is still very simple to implement. Even though it claims to handle multistroke gestures, it actually still treats them as one-stroke gestures. That is why it has difficulty distinguishing “=” from “z”. Also, as the number of strokes increases, the algorithm slows down. I am interested in combining some features of Protractor with $N.

Reading #4: Sketchpad

#4 Sketchpad: An HCI system from more than 50 years ago
Comments on:
Summary:
Sketchpad is the first pen-based sketching system. As an example, the author walks through the steps of drawing a combination of hexagons with the help of a circle and by defining the relations between them. The author also argues that constructing a drawing with Sketchpad is actually a model of the design process.
The paper then describes the ring structure of the system. After that, the author focuses on the design of some main parts of the system, such as the light pen, how drawings are displayed, the mathematical model of basic shapes and the algorithms for editing shapes.
Finally, the paper shows some interesting applications of Sketchpad, such as patterns, bridges and artistic drawings.
Discussion:
Many ideas introduced in this paper were breakthroughs at that time. Even though, compared with many devices nowadays, Sketchpad is simply a drawing pad that is not convenient for users, the ideas behind it are the foundation of today’s HCI design and even of object-oriented programming.