Title :
Information theoretic analysis of postal address fields for automatic address interpretation
Author :
Srihari, Sargur N. ; Yang, Wen-jann ; Govindaraju, Venugopal
Author_Institution :
CEDAR, State Univ. of New York, Buffalo, NY, USA
Abstract :
This paper studies the information content of postal address fields for automatic address interpretation. The information provided by a combination of address components, and the information interaction among components, is characterized in terms of Shannon's entropy. The efficiency of assignment strategies for determining a delivery point code can be compared by the propagation of uncertainty in address components. The amount of redundancy between components can be computed from the information provided by these components. This information is useful in developing a strategy for selecting a useful component for recovering the value of an uncertain component. The uncertainty of a component given another known component can be measured by conditional entropy. By ranking these uncertainty quantities, an effective processing flow for determining the value of a candidate component can be constructed.
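
The quantities named in the abstract (entropy, conditional entropy, and redundancy between components) can be illustrated with a minimal sketch. This is not the authors' method or data: the toy records, the field names (zip, street, state), and the helper functions are all hypothetical, and redundancy is taken here to be mutual information, which is an assumption about the paper's definition.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def conditional_entropy(target, known):
    """H(target | known) = H(target, known) - H(known), in bits."""
    return entropy(list(zip(target, known))) - entropy(known)


def redundancy(a, b):
    """Mutual information I(a; b) = H(a) + H(b) - H(a, b), in bits."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


# Hypothetical toy records: (zip_code, street_name, state). Made-up data.
records = [
    ("14260", "Main St", "NY"),
    ("14260", "Elm St", "NY"),
    ("14228", "Main St", "NY"),
    ("14228", "Oak Ave", "NY"),
]
zips = [r[0] for r in records]
streets = [r[1] for r in records]
states = [r[2] for r in records]

h_zip = entropy(zips)
h_street_given_zip = conditional_entropy(streets, zips)
shared = redundancy(zips, streets)

# Rank the known components by how much uncertainty about the street
# remains once each is known (lower conditional entropy ranks first).
known_components = {"zip": zips, "state": states}
ranked = sorted(
    known_components,
    key=lambda name: conditional_entropy(streets, known_components[name]),
)
```

In this toy data the ZIP code leaves less uncertainty about the street than the state does, so under the ranking idea described in the abstract it would be consulted first when recovering an uncertain street name.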
Keywords :
document image processing; uncertainty handling; visual databases; Shannon's entropy; automatic address interpretation; conditional entropy; delivery point code; information content; information theoretic analysis; postal address fields; uncertainty; uncertainty quantity; Artificial intelligence; Costs; Electronic switching systems; Entropy; Handwriting recognition; Information analysis; Postal services; Read only memory; Statistical analysis; Uncertainty
Conference_Title :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
DOI :
10.1109/ICDAR.1999.791786