Markov Random Field Based Text Identification from Annotated Machine Printed Documents
In this paper, we describe an approach to segment handwritten text, machine printed text and noise from annotated machine printed documents. Three categories of word-level features are extracted. We use a modified K-Means clustering algorithm for classification, followed by a relabeling procedure using a Markov Random Field (MRF) based on a concept of neighboring patches and Belief Propagation (BP) rules. Experimental results on an imbalanced data set show that our approach achieves an overall recall of 96.33%.
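To make the relabeling step concrete, the following is a minimal sketch of how an MRF over neighboring patches could revise initial labels with min-sum belief propagation. It is an illustration under stated assumptions, not the method of this paper: the Potts pairwise cost, the chain-shaped neighborhood, the cost values, and the names `relabel_with_bp`, `unary`, `edges` and `smooth` are all hypothetical, and the per-label costs stand in for whatever confidence the clustering stage would provide.

```python
from collections import defaultdict

def relabel_with_bp(unary, edges, smooth=1.0, iters=10):
    """Min-sum belief propagation with a Potts pairwise cost (illustrative).

    unary[i][k]: assumed cost of giving patch i label k, e.g. derived
                 from the distance to cluster k in the clustering stage.
    edges:       list of (i, j) pairs of neighboring patches (assumed).
    smooth:      penalty paid when two neighbors take different labels.
    """
    n_labels = len(unary[0])
    nbrs = defaultdict(list)
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    # msg[(i, j)][k] is what patch i tells patch j about j taking label k.
    msg = {(i, j): [0.0] * n_labels for i in list(nbrs) for j in nbrs[i]}

    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            # Cost of each label at i, ignoring the message that came from j.
            h = [unary[i][k] + sum(msg[(m, i)][k] for m in nbrs[i] if m != j)
                 for k in range(n_labels)]
            best = min(h)
            # Potts update: keep the same label, or switch and pay `smooth`.
            out = [min(h[k], best + smooth) for k in range(n_labels)]
            low = min(out)
            new[(i, j)] = [v - low for v in out]  # normalize for stability
        msg = new

    labels = []
    for i in range(len(unary)):
        belief = [unary[i][k] + sum(msg[(m, i)][k] for m in nbrs[i])
                  for k in range(n_labels)]
        labels.append(belief.index(min(belief)))
    return labels

# Hypothetical example: five patches in a row; labels are
# 0 = machine printed, 1 = handwritten, 2 = noise.
unary = [[0.0, 2.0, 2.0],
         [0.0, 2.0, 2.0],
         [0.5, 0.0, 2.0],   # weakly classified as handwritten
         [0.0, 2.0, 2.0],
         [0.0, 2.0, 2.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

independent = relabel_with_bp(unary, edges, smooth=0.0)
smoothed = relabel_with_bp(unary, edges, smooth=1.0)
```

With `smooth=0.0` each patch keeps its individually cheapest label, so the middle patch stays handwritten. With `smooth=1.0` the disagreement penalty with both neighbors outweighs its weak preference, and it is relabeled as machine printed.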