This paper presents a line segmentation technique for extracting handwritten lines from boxes or full pages. It makes no assumptions about character size, the number of lines, or the gap between lines. The methodology is based on a three-stage pipeline architecture. Each stage consists of a coarse line segmenter and a fine line follower that correctly separates characters intruding into adjacent lines. The coarse line segmenter divides each line into several columns and uses local horizontal projection histograms. The fine line follower uses character contour information. The three stages employ slightly different methods to improve overall quality. The lack of prior knowledge of the writing size is overcome by iterating the segmentation and detecting error conditions. The technique has been applied to the Constitution boxes of NIST SDB 1 and to the multi-line boxes of NIST SDB 11-13. The line segmentation success rate is 99.72%; all of the remaining 0.28% of images, which are processed incorrectly, are detected and may be signaled to the outer document processing system.
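The idea behind the coarse line segmenter can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`column_profiles`, `text_bands`, `coarse_segment`), the 0/1 pixel encoding, and the `threshold` parameter are all assumptions made for illustration, and the fine line follower, the three-stage pipeline, and the error detection are not covered. The sketch only shows why horizontal projection histograms computed per column can separate lines that a single page-wide histogram would merge, for example when the writing is slanted.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels; 1 = ink, 0 = background (assumed encoding)


def column_profiles(image: Image, n_columns: int) -> List[List[int]]:
    """Horizontal projection histogram (ink count per row) for each vertical strip."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Strip boundaries, spread as evenly as integer division allows.
    bounds = [width * i // n_columns for i in range(n_columns + 1)]
    return [
        [sum(row[bounds[c]:bounds[c + 1]]) for row in image]
        for c in range(n_columns)
    ]


def text_bands(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (start_row, end_row_exclusive) runs where the profile exceeds threshold."""
    bands: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y  # a band of ink begins
        elif count <= threshold and start is not None:
            bands.append((start, y))  # the band ends at a gap row
            start = None
    if start is not None:
        bands.append((start, len(profile)))  # band runs to the bottom edge
    return bands


def coarse_segment(image: Image, n_columns: int,
                   threshold: int = 0) -> List[List[Tuple[int, int]]]:
    """Candidate line bands for each vertical strip (hypothetical helper)."""
    return [text_bands(p, threshold) for p in column_profiles(image, n_columns)]


# Two slanted "lines": the right half sits one row lower than the left half.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]

global_bands = coarse_segment(image, n_columns=1)  # one page-wide histogram
local_bands = coarse_segment(image, n_columns=2)   # one histogram per strip
```

In this toy image the page-wide histogram has no empty row between the two lines, so `global_bands` contains a single merged band, while each of the two strips in `local_bands` shows two bands separated by a gap. Linking the per-strip bands into complete lines, and resolving characters that cross a gap, is the part the abstract assigns to the later processing and is not attempted here.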