On-Line Handwriting Recognition

TORU WAKAHARA, MEMBER, IEEE, HIROSHI MURASE, MEMBER, IEEE, AND KAZUMI ODAKA, MEMBER, IEEE

Invited Paper

On-line handwriting recognition means that a machine recognizes each character as it is being written. For large-alphabet languages, like Japanese, handwriting input using an on-line recognition technique is essential for input accuracy and speed. It offers several advantages over off-line handwriting recognition. However, there are serious problems that prevent high recognition accuracy without imposing handwriting constraints. First, the thousands of ideographic Japanese characters of Chinese origin (called Kanji) can be written with wide variations in the number and order of strokes and with significant shape distortions. Also, writing-box-free recognition of characters is required to create a better man-machine interface. This paper describes the intense research performed by NTT over the past 15 years to answer the most pressing recognition problems. Prototype systems developed by NTT are also described. Last, the man-machine interfaces made possible with on-line handwriting recognition and anticipated advances in both hardware and software are discussed.

Keywords: user friendly input to computers, on-line handwriting recognition, Kanji character recognition, character sequence recognition, tablet digitizers.

I. INTRODUCTION

For preparing a first draft and concentrating on content creation, pencil and paper are often superior to keyboard entry. Handwriting recognition allows the creativity of handwriting to be combined with the advantages of word processors. For large-alphabet languages, like Japanese or Chinese, keyboards are particularly cumbersome and inefficient. If large-alphabet languages could be input via on-line handwritten character recognition, more effective man-machine interfaces could be created.
For example, the user can easily detect and correct misrecognized characters on the spot by verifying the recognition results as they appear. Hence, on-line handwriting recognition for transcription is being intensively investigated, mainly by Japanese and Chinese scientists.

Chinese characters, or Kanji in Japan, consist of many strokes, and many complex compounds need to be distinguished. One character in the block style has an average of 8-10 strokes, with the simplest character having one stroke and the most complicated having more than 30. Most people do not strictly follow the correct order of strokes when writing a character. Moreover, a cursive style is preferred when writing quickly. Fewer strokes are used and the radical shapes are often deformed and simplified.

Manuscript received Feb. 21, 1991; revised Oct. 30, 1991.
T. Wakahara is with NTT Human Interface Labs, Kanagawa 238-03, Japan.
H. Murase and K. Odaka are with NTT Basic Research Labs, Tokyo 180, Japan.
IEEE Log Number 9202478.

To resolve the above problems, NTT proposed a series of robust recognition algorithms based on a stroke by stroke matching strategy composed of two steps. The first step is stroke correspondence determination, which is indispensable for realizing stroke number and stroke order free recognition. The second step is the calculation of intercharacter distances between a large number of templates and the distorted input pattern to realize distortion-tolerant pattern matching. This strategy was proven to be very powerful by extensive recognition tests. Moreover, for writing-box-free recognition of character sequences, an enhanced algorithm was proposed that selected the optimal segmentation among all segmentation possibilities advanced by the recognition process.
This algorithm was also successfully applied to free-format line figure recognition. Together with research into recognition algorithms, prototype systems were developed to create a user friendly Japanese input device using on-line handwriting recognition.

Other important uses of on-line handwriting recognition are editing, annotating, and other applications that are heavily interactive or that use direct pointing and manipulation. An example of this is a gesture interface, an area of substantial recent interest. We discuss the future man-machine interfaces made possible with on-line handwriting recognition as well as anticipated advances in both hardware and software.

II. ON-LINE VERSUS OFF-LINE RECOGNITION

Off-line handwriting recognition is performed after the writing is completed. An optical scanner converts the image of the writing into a bit pattern. This technique cannot use any dynamic writing information: the number of strokes, the order of strokes, and the direction of stroke creation. This makes it difficult to extract and identify the strokes of each character. The best off-line recognition algorithms use pattern matching to extract static shape features with coordinates and so match the image as a feature array.

0018-9219/92$03.00 © 1992 IEEE
PROCEEDINGS OF THE IEEE, VOL. 80, NO. 7, JULY 1992

On-line recognition, by contrast, means that the machine recognizes each character as it is being written. The preferred input device is an electronic tablet with a stylus pen. The electronic tablet captures the x-y coordinate data of pen-tip movement, typically with a resolution of 10 points/mm, a sampling rate of 100 points/s, and an indication of pen-up and pen-down. That is, the trace of the handwriting or line drawing is captured as a sequence of coordinate points with separate stroke indications.

On-line recognition has two major technological advantages over off-line recognition. The first is high recognition accuracy.
This is because on-line recognition captures a character as a set of strokes, each represented as a series of coordinate points, while off-line recognition only has a bit pattern, and its shape feature extraction is mostly based on heuristics. This advantage becomes conspicuous when dealing with strongly distorted characters written in the cursive style.

The other advantage is interaction. In on-line recognition, it is very natural for the user to detect and correct misrecognized characters on the spot by verifying the recognition results as they appear. The user is encouraged to modify his writing style so as to improve recognition accuracy. Also, a machine can be trained to a particular user's style. Samples of his misrecognized characters are stored to aid subsequent recognition. Thus both writer adaptation and machine adaptation are possible. Moreover, editing, annotating, and other applications that use direct pointing and manipulation are well suited to on-line handwriting recognition.

III. HISTORICAL OVERVIEW

Tablet digitizers have existed for over three decades. The earliest prototype for on-line handwriting recognition was T. L. Dimond's "Stylator" [1] in 1957. The RAND tablet [2] was clearly the most popular of the early digitizers and spurred intense activity in on-line handwriting recognition in the 1960's. Although the recognition objects were limited to English alphanumerics, basic and important ideas for recognition algorithms and system components were proposed in rapid succession. One technique discriminated the shape of strokes by examining the time sequence of direction angles of the pen-tip movement. This is a variant of the pattern matching approach. Another technique used the structure analyzing approach to define a set of shape primitives and represent each character as a primitive-code sequence. Also, fundamental preprocessing techniques, including noise reduction, smoothing, filtering, and size normalization, were introduced [3].
These opening studies determined the fundamental direction of subsequent research into on-line handwriting recognition. However, only the early stage of computer simulations was reached, and recognition accuracy was not confirmed by prototype testing.

In the 1970's, the intense activity in on-line character recognition ebbed in America, although there were several attempts to realize automatic signature verification [4]. In Japan, by contrast, research into on-line recognition of handwritten characters started at the end of the 1960's and reached a high level in the 1970's as the most hopeful Japanese text input method.

The Japanese write with Hiragana, Katakana, Kanji, English alphanumerics, and symbols. Hiragana and Katakana (called Kana) are phonetic alphabets, and each has 46 full-size characters. Eight Kana characters can be written half size and, together with additional markings, indicate subtle phonetic differences. Kanji are ideographic characters of Chinese origin, and the most common Japanese Industrial Standard contains 6349 characters. Keyboard input of Kanji is very cumbersome even for professional users.

There are two serious problems in on-line Kanji recognition. One is that a large number of characters exist to be recognized. The other is the wide variation in handwritten characters, i.e., in stroke number, stroke order, and shape. Researchers began by tackling the first problem, assuming that each Kanji character was accurately written in the block style with the correct stroke number and order. Initially, most research took the structure analyzing approach, which recognized primitive strokes of radicals to identify a whole Kanji character. However, at the end of the 1970's recognition accuracy remained quite low because the primitive stroke recognition technique was not sufficiently robust. The hierarchical recognition strategy was perceived to be a failure.
Remarkable advances in software techniques for Kanji character recognition algorithms were made in the 1980's. The structure analyzing approach was enhanced by providing about 100 primitive strokes to permit constituent shape distortion and improve stroke recognition accuracy. However, tolerance of stroke number and stroke order variations was not successfully realized, for lack of an efficient systematic search technique that avoids combinatorial explosion. With regard to the pattern matching approach, considerable efforts were also made to devise a robust stroke matching measure and to permit wider variations in stroke number and stroke order. NTT proposed a series of robust algorithms based on the pattern matching approach and almost completely resolved the previously mentioned problems.

Another development was the renewed interest in America and Europe in user interfaces designed around on-line handwriting recognition. An example of this is the hand-drawn gestural interface [5], where "gesture" means erase, copy, move, insert, sum a row of numbers, and so on. A significant hardware advance was the combining of input tablets and flat displays to bring input and feedback onto the same surface.

We can now clarify the present status and future problems of on-line handwriting recognition techniques and develop a clear idea of future user friendly interfaces built around on-line handwriting recognition. For readers who want to know more about the state of the art in on-line handwriting recognition, the exhaustive survey paper [7] is recommended. There is also a survey [8] solely devoted to the on-line recognition of handwritten Japanese characters.

IV. KANJI CHARACTER RECOGNITION

Chinese characters, or Kanji in Japan, can consist of many strokes. The stroke number of a character in the block style ranges from one for the simplest character to over 30 for the most complicated.
Also, the correct stroke order in a character is not strictly observed by most people. Moreover, the cursive style is written faster and with more deformed strokes. Therefore, in on-line recognition of Kanji characters we have to overcome two serious problems: a large character set and wide variations in handwriting properties.

To resolve these problems, we proposed a series of robust recognition algorithms based on the stroke by stroke matching strategy taken from the pattern matching approach. We adopted the pattern matching approach instead of the structure analyzing approach. The first reason is that it is much easier to recognize a whole character than to recognize its parts or primitives if it is considerably deformed. The second reason is that it is much more reliable to provide a prototype or prototypes for each Kanji character than to devise ad hoc primitive strokes or radicals when flexible adjustment to an increase or decrease in the number of Kanji characters to be recognized is needed.

We provided prototypes for all Kanji characters using feature point representation. Here, "prototype" means a template generated by averaging the x-y coordinate values of feature points over learning samples for each character. Then, we divided the stroke by stroke matching strategy into the following two steps. The first step is stroke correspondence determination between a template and an input pattern. This is indispensable for realizing stroke number and stroke order free recognition. The second step is intercharacter distance calculation using all templates and the input pattern to yield distortion-tolerant pattern matching.

In the following, we describe the five kinds of recognition algorithms in the order of their proposal. The order corresponds to the progress made in relaxing handwriting constraints with regard to stroke number, stroke order, and permissible shape distortion.

A. Simple Stroke Matching Method

In this subsection, we deal with the problem of on-line recognition of Kanji written in the block style with the correct stroke number and stroke order. Therefore, only the templates that contain the same number of strokes as the input pattern are candidates for matching. Moreover, as the stroke order of the input pattern is correct, we have only to perform stroke by stroke matching in the order of handwriting between the input pattern and the template being considered. As a result, the problem resolves itself into how to define the interstroke distance.

The proposed method [9] approximates each stroke by a fixed number of feature points and represents the stroke by means of the x-y coordinate values of these feature points. Next, the interstroke distance is defined as the sum of interpoint Euclidean distances. Let $d_i$ denote the interstroke distance between the $i$th stroke of the template and the $i$th stroke of the input pattern. Then, the intercharacter distance $D$ is defined as:

$$D = \sum_{i=1}^{N} d_i \qquad (1)$$

Here $N$ denotes the number of strokes common to the template and the input pattern. Then, the category of the template that realizes the minimum intercharacter distance $D$ among all templates with $N$ strokes is the recognition result of the input pattern.

Fig. 1. Feature point representation of Kanji characters.

Figure 1 shows examples of Kanji characters in the block style, where the black circles denote the loci of the feature points. From Figure 1, it is easy to see that most Kanji strokes are straight lines. In fact, even if a stroke has bending points, there is usually only one. Only three feature points per stroke (two end points and a middle point), as shown in Figure 1, were found to be necessary for Kanji stroke discrimination. In contrast, six feature points were needed to represent each stroke of Hiragana because Hiragana mainly uses curved strokes.
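
As a rough illustration of the simple stroke matching method, the Python sketch below reduces each stroke to a fixed number of feature points, sums interpoint Euclidean distances stroke by stroke as in (1), and returns the template with the minimum intercharacter distance among templates with the same stroke number. It is a minimal sketch under stated assumptions, not the implementation of [9]: the function names, the `TEMPLATES` dictionary, and the choice of placing feature points at equal arc-length intervals (so that the "middle point" is the arc-length midpoint) are all hypothetical, and size normalization is omitted.

```python
import math


def feature_points(stroke, k=3):
    """Reduce a stroke (list of (x, y) samples) to k >= 2 feature points.

    Assumption: points are spaced equally along the arc length, so k=3
    gives the two end points and the arc-length midpoint.
    """
    cum = [0.0]  # cumulative arc length at each sample
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:  # a dot: repeat the single position
        return [stroke[0]] * k
    points, seg = [], 0
    for n in range(k):
        target = total * n / (k - 1)
        while seg < len(stroke) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = stroke[seg], stroke[seg + 1]
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points


def interstroke_distance(a, b):
    """Sum of interpoint Euclidean distances between two feature-point strokes."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def simple_stroke_match(input_strokes, templates, k=3):
    """Return (label, D) of the closest template with the same stroke number."""
    pattern = [feature_points(s, k) for s in input_strokes]
    best_label, best_d = None, math.inf
    for label, template in templates.items():
        if len(template) != len(pattern):
            continue  # only templates with N strokes are candidates
        d = sum(interstroke_distance(t, s) for t, s in zip(template, pattern))
        if d < best_d:
            best_label, best_d = label, d
    return best_label, best_d


# Hypothetical templates, already stored as three feature points per stroke.
TEMPLATES = {
    "horizontal": [[(0, 5), (5, 5), (10, 5)]],
    "vertical": [[(5, 0), (5, 5), (5, 10)]],
    "plus": [[(0, 5), (5, 5), (10, 5)], [(5, 0), (5, 5), (5, 10)]],
}

if __name__ == "__main__":
    wobbly_line = [[(0, 6), (3, 6), (6, 5), (10, 5)]]
    print(simple_stroke_match(wobbly_line, TEMPLATES))
```

Because the comparison is restricted to templates with the same stroke number and runs in writing order, the cost per template is linear in the number of feature points, which is what makes the later relaxations (stroke order, stroke number) the harder part of the problem.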
In general, a large number of feature points can represent a curved stroke with high fidelity. However, excessive attention to handwriting distortion on curved portions is counterproductive. Obviously, there exists an optimal number of feature points for each character set considered that maximizes recognition ability.

Recognition tests were made using a mixed set of 881 Kanji characters, 46 Hiragana characters, and English alphanumerics. We obtained the high recognition rates of 99.8% for learning samples and 99.7% for test samples. The character data used in the tests, however, consisted of samples carefully handwritten in the block style with the correct stroke number and stroke order. The strict observance of correct stroke number and stroke order was an excessive burden on users in actual applications. Therefore, the algorithm described in this subsection must be considered as the starting point of subsequent research which attempted to relax the handwriting constraints.

B. Interstroke Distance Matrix Method

In this subsection, we describe a stroke-order free on-line Kanji recognition algorithm, which was the first step in relaxing handwriting constraints. The algorithm ignored information about the stroke order of the input pattern. Namely, a character is considered not as an ordered set of strokes but as a nonordered set of strokes. Therefore, the essential problem is to determine appropriate stroke correspondences between the templates and the input pattern without stroke order information. The key idea to resolve the above problem is that the most appropriate stroke correspondence should produce the minimum sum of interstroke distances. Based on this key idea, we proposed the interstroke distance matrix method [10].

Fig. 2. Interstroke distance matrix (ISDM).

The proposed method uses the stroke by stroke matching strategy and is a simple but essential extension of the method described in Section IV-A.
First, with respect to each stroke in a template, the interstroke distance to all strokes in the input pattern is calculated. Here, let $d_{ij}$ denote the interstroke distance between the $i$th stroke of the template and the $j$th stroke of the input pattern, where the stroke numbers of the template and the input pattern are both equal to $N$. Collectively, the $N \times N$ interstroke distance matrix (ISDM) whose $(i, j)$ element equals $d_{ij}$ can be defined. Next, we search for the minimum value in each row of the ISDM. That is, the template stroke to input stroke correspondence that produces the minimum interstroke distance is chosen. This allows complete freedom in stroke order. Last, the intercharacter distance $D$ between the template and the input pattern is calculated as:

$$D = \sum_{i=1}^{N} \min_{j} d_{ij} \qquad (2)$$

where it is to be noted that the stroke correspondence determined by (2) is not necessarily 1-to-1. However, this causes no serious problem in recognition accuracy when the correct stroke number condition is valid. Figure 2 shows an example of the ISDM between two patterns of 木 (means "tree") written with different stroke orders.

Tests were made using two kinds of character data sets. The first set was the samples used in Section IV-A, written with the correct stroke number and correct stroke order. For this character data set the ISDM method achieved much the same recognition rate as that obtained by the simple stroke matching method. The second character data set consisted of new samples of 1851 Kanji characters and 76 Hiragana characters in the block style with the correct stroke number and arbitrary stroke order. The ISDM method achieved the high recognition rate of 99.5% for the second set.

Test results indicate that there are two reasons for the high recognition ability of the ISDM method. The first is that the stroke by stroke matching strategy is so powerful that the stroke order information can be ignored.
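
The row-minimum rule of (2) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, strokes are assumed to be already reduced to equal-length lists of feature points, and the interstroke distance is assumed to be the sum of interpoint Euclidean distances from Section IV-A.

```python
import math


def interstroke_distance(a, b):
    """Sum of interpoint Euclidean distances between two feature-point strokes."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def isdm(template, pattern):
    """Interstroke distance matrix: element (i, j) is d_ij between
    template stroke i and input stroke j."""
    return [[interstroke_distance(t, s) for s in pattern] for t in template]


def isdm_distance(template, pattern):
    """Intercharacter distance of (2): sum over rows of the row minimum.

    The implied correspondence is not forced to be 1-to-1; two template
    strokes may pick the same input stroke.
    """
    return sum(min(row) for row in isdm(template, pattern))


if __name__ == "__main__":
    horizontal = [(0, 5), (5, 5), (10, 5)]
    vertical = [(5, 0), (5, 5), (5, 10)]
    template = [horizontal, vertical]
    swapped_input = [vertical, horizontal]  # same shape, other stroke order
    print(isdm(template, swapped_input))
    print(isdm_distance(template, swapped_input))
```

With the two strokes written in the opposite order, an ordered comparison as in (1) would pair the horizontal stroke with the vertical one, whereas the row minima pick the matching strokes regardless of order.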
The second reason is that the correct stroke number condition plays a major role in excluding most similarly shaped, but different, characters from the recognition candidates. On the other hand, misrecognition mostly results from shape distortion for an input pattern composed of a small number of strokes. A complicated Kanji character composed of a large number of strokes is rather easy to recognize because of the considerable redundancy in shape. At any rate, if the correct stroke number condition is violated, the stroke by stroke matching strategy is in danger of losing its high recognition ability. Relaxing the stroke number restriction allows significantly more handwriting distortion. In subsequent subsections, we propose and improve stroke number free recognition algorithms based on the pattern matching approach by incorporating structural information.

C. Selective Stroke Linkage Method

In this subsection, we describe an on-line character recognition algorithm that handles the stroke number and stroke order variations common in natural Japanese handwriting formed in the block style.

In order to enhance the stroke by stroke matching strategy by permitting both splitting of a single stroke and concatenation of successive strokes, it is necessary to resolve the following problems:

1) whole-part and distortion-tolerant stroke matching;
2) robust stroke correspondence determination.

To resolve the first problem, we changed the feature point representation of a stroke so that a stroke was approximated by feature points at regular intervals rather than by a fixed number of points. Thus the number of feature points is nearly proportional to stroke length. This allows us to introduce two kinds of interstroke distances. One, for whole-part matching, calculates the sum of interpoint Euclidean distances in order for the whole of the shorter stroke to be matched with the front part of the longer stroke.
The other, for distortion-tolerant matching, calculates the sum of interpoint Euclidean distances by using dynamic programming (DP) matching.

Next, to resolve the second problem, we proposed the selective stroke linkage method [11]. Let M be the larger stroke number of the input pattern and the template and let N be the smaller stroke number, i.e., M ≥ N. Also, let PM be a name for the pattern with M strokes and let PN be a name for the pattern with N strokes. If PM is the input pattern, PN is the template, and vice versa. The procedure for determining stroke correspondences is divided into two steps. The first step determines N stroke pairs, i.e., a 1-to-1 correspondence between N strokes in PM and N strokes in PN, that minimize the sum of interstroke distances using whole-part stroke matching. This step allows for stroke order variations. Figure 3 shows an example of stroke correspondence determination for two patterns of 字 (means "character") that have different stroke numbers and
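
The two interstroke distances introduced for the selective stroke linkage method can be sketched as follows. This is a hedged illustration, not the formulation of [11]: the function names are hypothetical, strokes are assumed to be non-empty lists of feature points sampled at regular intervals, and the DP matching shown is the common dynamic-time-warping recurrence, chosen here as an assumption because the text does not spell out the exact recurrence.

```python
import math


def whole_part_distance(a, b):
    """Match the whole of the shorter stroke against the front part of the
    longer stroke: pair points in order and stop when the shorter one ends."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def dp_distance(a, b):
    """Distortion-tolerant distance via DP matching (assumed DTW recurrence).

    cost[i][j] is the minimum accumulated interpoint distance for aligning
    the first i points of a with the first j points of b, where every point
    of each stroke must be matched and the order of points is preserved.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    short = [(0, 0), (1, 0), (2, 0)]
    long_ = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
    print(whole_part_distance(short, long_))  # front part matches exactly
    print(dp_distance(short, long_))          # tail of long_ must still be matched
```

The contrast between the two is the point of the design: whole-part matching lets a short stroke match the beginning of a longer one at no cost, which is what allows a split stroke to be paired with part of an unsplit one, while DP matching absorbs local stretching between strokes of comparable extent.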
