Incorporating linguistic post-processing into whole-book recognition

Authors: Pingping Xiu, Henry S. Baird
DOI: 10.1117/12.839099

Citations: 1
We describe a technique for linguistic post-processing of whole-book recognition results. Whole-book recognition is a technique that improves recognition of book images using fully automatic cross-entropy-based model adaptation. In previously published work, recognition was performed on each word separately, without awareness of passage-level information such as word-occurrence frequencies. As a result, words that are rare in real text may appear much more often in recognition results, and vice versa. Differences between word frequencies in recognition results and those given by prior knowledge may therefore indicate recognition errors over a long passage. In this paper, we propose a post-processing technique that enhances whole-book recognition results by minimizing the differences between word frequencies in the recognition results and prior word frequencies. The technique works better on longer passages, and in a 90-page experiment it reduces the character error rate by roughly 20% in relative terms, from 1.24% to 0.98%.
Conference: Document Recognition and Retrieval - DRR , vol. 7534, pp. 1-10, 2010
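
The abstract's central idea, comparing word frequencies in the recognition output against prior word frequencies and preferring outputs where the two agree, can be illustrated with a small sketch. This is not the authors' algorithm: the KL-divergence measure, the greedy search, and the names `word_frequencies`, `frequency_divergence`, and `pick_candidates` are all assumptions made for illustration, and the sketch ignores the image evidence that a real recognizer would weigh against the linguistic prior.

```python
from collections import Counter
import math


def word_frequencies(words):
    """Relative frequency of each word in a passage."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def frequency_divergence(recognized_words, prior_freq, floor=1e-6):
    """KL divergence from the passage's word frequencies to the prior.

    Words missing from the prior get a small floor probability (an
    assumption of this sketch) so that the logarithm stays finite.
    """
    observed = word_frequencies(recognized_words)
    return sum(
        p * math.log(p / max(prior_freq.get(w, 0.0), floor))
        for w, p in observed.items()
    )


def pick_candidates(candidates_per_word, prior_freq, sweeps=3):
    """Greedy search over per-word candidate lists (hypothetical setup).

    candidates_per_word[i] lists alternative readings for word i, with the
    recognizer's top choice first. Each sweep re-picks every word so that
    the passage-level divergence from the prior is as small as possible.
    """
    chosen = [cands[0] for cands in candidates_per_word]
    for _ in range(sweeps):
        changed = False
        for i, cands in enumerate(candidates_per_word):
            best = min(
                cands,
                key=lambda w: frequency_divergence(
                    chosen[:i] + [w] + chosen[i + 1:], prior_freq
                ),
            )
            if best != chosen[i]:
                chosen[i] = best
                changed = True
        if not changed:
            break
    return chosen


# Toy prior and candidate lists, invented for this example.
prior = {"the": 0.5, "cat": 0.25, "sat": 0.25}
candidates = [["the"], ["cat", "cot"], ["tho", "the"], ["sat"]]
print(pick_candidates(candidates, prior))  # ['the', 'cat', 'the', 'sat']
```

In the toy passage, the misreading "tho" is replaced by "the" because that choice brings the passage's word frequencies into line with the prior. With only four words the frequency estimate is very coarse, which is consistent with the abstract's remark that the technique works better on longer passages.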