  • DocumentCode
    2030840
  • Title
    Verifying the UNIPEN devset
  • Author
    Vuurpijl, Louis ; Niels, Ralph ; Van Erp, Merijn ; Schomaker, Lambert ; Ratzlaff, Eugene
  • Author_Institution
    Nijmegen Inst. for Cognition & Information, Netherlands
  • fYear
    2004
  • fDate
    26-29 Oct. 2004
  • Firstpage
    586
  • Lastpage
    591
  • Abstract
    This paper describes a semi-automated procedure for the verification of a large human-labeled data set containing online handwriting. A number of classifiers trained on the UNIPEN "trainset" are employed for detecting anomalies in the labels of the UNIPEN "devset". Multiple classifiers with different feature sets are used to increase the robustness of the automated procedure and to ensure that the number of false accepts is kept to a minimum. The rejected samples are manually categorized into four classes: (i) recoverable segmentation errors, (ii) incorrect (recoverable) labels, (iii) well-segmented but ambiguous cases and (iv) unrecoverable segments that should be removed. As a result of the verification procedure, a well-labeled data set is currently being generated, which will be made available to the handwriting recognition community.
  • Keywords
    handwriting recognition; image segmentation; UNIPEN devset; incorrect labels; large human-labeled data set verification; multiple classifiers; online handwriting; recoverable segmentation errors; semi-automated procedure; unrecoverable segments; well-labeled data set; Cognition; Collaboration; Conferences; Databases; Enterprise resource planning; Handwriting recognition; Labeling; NIST; Robustness; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition, 2004. IWFHR-9 2004. Ninth International Workshop on
  • ISSN
    1550-5235
  • Print_ISBN
    0-7695-2187-8
  • Type
    conf
  • DOI
    10.1109/IWFHR.2004.109
  • Filename
    1363975