Keyword Spotting in Poorly Printed Documents using Pseudo 2-D Hidden Markov Models

DOI: 10.1109/34.308482

Citations: 155
An algorithm for robust machine recognition of keywords embedded in a poorly printed document is presented. For each keyword, two statistical models, called pseudo 2-D hidden Markov models, are created: one represents the actual keyword and the other represents all extraneous words. Dynamic programming is then used to match an unknown input word against the two models and to make a maximum likelihood decision. Although the models are pseudo 2-D in the sense that they are not fully connected 2-D networks, they are shown to be general enough to characterize printed words efficiently. These models provide an “elastic matching” property in both the horizontal and vertical directions, which makes the recognizer not only independent of size and slant but also tolerant of highly deformed and noisy words. The system is evaluated on a synthetically created database that contains about 26,000 words. Currently, the authors achieve a recognition accuracy of 99% when words in the testing and training sets are of the same font size, and 96% when they are of different sizes. In the latter case, the conventional 1-D HMM achieves only a 70% accuracy rate.
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 16, no. 8, pp. 842-848, 1994
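
The two-model decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes binary pixel columns, Bernoulli pixel emissions, left-to-right topologies, and hand-picked toy parameters, none of which come from the paper, and every name in it (`word_log_score`, `is_keyword`, `KEYWORD_MODEL`, `FILLER_MODEL`) is hypothetical. Each horizontal "superstate" owns a small vertical HMM that scores one pixel column. An outer Viterbi pass over the columns then gives a word-level score, and the keyword is accepted when its model outscores the filler model.

```python
import math

NEG_INF = float("-inf")

def log0(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def left_to_right(n_states, stay=0.6):
    """Log initial/transition tables for a left-to-right chain (assumed topology)."""
    log_init = [log0(1.0 if s == 0 else 0.0) for s in range(n_states)]
    log_trans = []
    for s in range(n_states):
        last = s == n_states - 1
        row = [0.0] * n_states
        row[s] = 1.0 if last else stay       # self-loop
        if not last:
            row[s + 1] = 1.0 - stay          # advance to the next state
        log_trans.append([log0(p) for p in row])
    return log_init, log_trans

def viterbi_log_score(obs_logprob, log_init, log_trans):
    """Best-path log score; obs_logprob[t][s] is log P(observation t | state s)."""
    n = len(log_init)
    score = [log_init[s] + obs_logprob[0][s] for s in range(n)]
    for t in range(1, len(obs_logprob)):
        score = [
            max(score[p] + log_trans[p][s] for p in range(n)) + obs_logprob[t][s]
            for s in range(n)
        ]
    return max(score)  # simplification: the path may end in any state

def make_vertical(p_black):
    """Vertical HMM for one superstate; p_black[s] = P(pixel is black | state s)."""
    log_init, log_trans = left_to_right(len(p_black))
    return {"log_init": log_init, "log_trans": log_trans, "p_black": p_black}

def make_model(vertical_hmms):
    """Pseudo 2-D model: a horizontal chain of superstates, one vertical HMM each."""
    log_init, log_trans = left_to_right(len(vertical_hmms))
    return {"log_init": log_init, "log_trans": log_trans, "superstates": vertical_hmms}

def column_log_score(column, vertical_hmm):
    """Inner (vertical) Viterbi: score one pixel column, top to bottom."""
    obs = [[log0(p if pixel else 1.0 - p) for p in vertical_hmm["p_black"]]
           for pixel in column]
    return viterbi_log_score(obs, vertical_hmm["log_init"], vertical_hmm["log_trans"])

def word_log_score(columns, model):
    """Outer (horizontal) Viterbi: column scores act as superstate emissions."""
    obs = [[column_log_score(col, v) for v in model["superstates"]] for col in columns]
    return viterbi_log_score(obs, model["log_init"], model["log_trans"])

def is_keyword(columns, keyword_model, filler_model):
    """Accept the keyword when its model explains the image better than the filler."""
    return word_log_score(columns, keyword_model) > word_log_score(columns, filler_model)

# Toy parameters (illustrative only): the "keyword" is dark-on-top columns
# followed by dark-on-bottom columns; the filler treats every pixel as a coin flip.
KEYWORD_MODEL = make_model([make_vertical([0.9, 0.1]), make_vertical([0.1, 0.9])])
FILLER_MODEL = make_model([make_vertical([0.5])])

TOP, BOTTOM = [1, 1, 0, 0], [0, 0, 1, 1]
keyword_image = [TOP, TOP, BOTTOM, BOTTOM]
stretched_image = [TOP] * 3 + [BOTTOM] * 3   # same pattern, wider
other_image = [BOTTOM, BOTTOM, TOP, TOP]     # reversed pattern
```

With these toy parameters, `is_keyword(keyword_image, KEYWORD_MODEL, FILLER_MODEL)` and the same call on `stretched_image` both accept, while `other_image` is rejected. The stretched case shows the elastic matching idea in the horizontal direction: self-loops on the superstates absorb the extra columns. In practice the model parameters would be estimated from training data rather than set by hand.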