A closed-form expression is derived for the recognition error vs. rejection rate of optical character or word recognition systems. This expression explains, for the first time, the qualitative relationship between the raw error rate at zero reject and the behavior of the error vs. rejection curves of several isolated character classifiers. It also yields a lower bound on the error rate of any recognition system whose rejection process is based on a confidence threshold. The relation has further proved useful for a quantitative comparison between two confidence computation methods implemented in a system for reading USA Census '90 handwritten forms. The newly proposed method is based on a confidence model that integrates single-character confidence levels, digram statistics, and other information from the dictionary matching phase. Two implementations of this methodology have been investigated: the first is a linear model, the second a multi-layer perceptron. Experimental results on NIST Special Databases 12 and 13 have not shown a clear advantage of the connectionist model over the simple linear model. At a $50\%$ rejection rate, the field error rate calculated using the new confidence computation algorithm decreased from $47.7\%$ to $44.6\%$, which represents a considerable improvement given a theoretical lower bound of $40.8\%$ on the error rate.
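To make the quantities above concrete, the following Python sketch shows how an empirical error vs. rejection curve can be obtained by thresholding confidence values, and how the reported improvement compares with the stated lower bound. The abstract does not give the closed-form expression itself, so the function `ideal_error_rate` is only an assumption: it encodes the bound that would hold if an ideal confidence measure rejected every error before any correct result, and it may differ from the paper's derivation. All function names here are hypothetical.

```python
from typing import List, Sequence, Tuple


def error_reject_curve(
    confidences: Sequence[float], correct: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (rejection_rate, error_rate) points.

    Items are rejected in order of increasing confidence, which is what
    sweeping a confidence threshold upward does. Ties are not treated
    specially in this sketch.
    """
    n = len(confidences)
    order = sorted(range(n), key=lambda i: confidences[i])
    errors_left = sum(1 for c in correct if not c)
    points = [(0.0, errors_left / n)]  # raw error rate at zero reject
    for k, i in enumerate(order, start=1):
        if not correct[i]:
            errors_left -= 1
        accepted = n - k
        if accepted == 0:
            break  # error rate is undefined once everything is rejected
        points.append((k / n, errors_left / accepted))
    return points


def ideal_error_rate(raw_error: float, reject_rate: float) -> float:
    """ASSUMPTION: bound for an ideal confidence measure that rejects all
    errors first. Not necessarily the paper's closed-form expression."""
    if not 0.0 <= reject_rate < 1.0:
        raise ValueError("reject_rate must be in [0, 1)")
    return max(0.0, (raw_error - reject_rate) / (1.0 - reject_rate))


def fraction_of_gap_closed(old: float, new: float, bound: float) -> float:
    """Share of the distance between the old error rate and the lower
    bound that the new method recovers."""
    return (old - new) / (old - bound)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    curve = error_reject_curve([0.9, 0.2, 0.8, 0.1], [True, False, True, True])
    print(curve)
    # Figures quoted in the abstract, at a 50% rejection rate.
    print(fraction_of_gap_closed(47.7, 44.6, 40.8))
```

With the figures quoted in the abstract, the new method closes (47.7 − 44.6) / (47.7 − 40.8), roughly 45%, of the gap between the previous error rate and the theoretical lower bound.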