• DocumentCode
    1815135
  • Title

    Information loss in digital documents: Gurmukhi fonts perspective

  • Author

    Mahi, Gurjot Singh ; Bajwa, Kanwalpreet Singh ; Verma, Amandeep ; Singh, Gagandeep

  • Author_Institution
    Regional Center for Inf. Technol. & Manage., Punjabi Univ., Mohali, India
  • fYear
    2015
  • fDate
    6-8 Jan. 2015
  • Firstpage
    111
  • Lastpage
    115
  • Abstract
    The digital representation of Indic scripts in digital documents remains a prominent problem. Fonts affect the longevity of digital documents, and their importance cannot be overemphasized when it comes to preserving Indic script documents for Indian digital libraries. Gurmukhi is one of the most prominent scripts of India and is used to write the Punjabi language. Punjabi is one of the most widely spoken languages in the world, with more than 2 million users in India alone. With such a large number of users, digitization of the language is clearly useful. Earlier Gurmukhi fonts were developed using the ASCII character encoding scheme, which was designed for the English language and was not intended to directly support other languages. Moreover, Gurmukhi fonts were developed without following any mapping standard, which led to numerous fonts with different mapping tables. Because different fonts use different mapping schemes, substituting one Gurmukhi font for another leads to information loss in documents. This paper attempts to demonstrate and understand the information loss caused by substituting one Gurmukhi font for another, using an experimental setup. The outputs of the setup were analysed in terms of percentage of information loss, overall accuracy, and weighted kappa statistics.
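    The substitution problem the abstract describes can be illustrated with a minimal sketch. The two mapping tables below are hypothetical and do not reproduce any real Gurmukhi font or the paper's experimental setup; the names `FONT_A`, `FONT_B`, `render`, and `information_loss` are assumptions made for illustration only. The sketch assumes a document stores ASCII codes and that each font decides which Gurmukhi character a code displays as.

```python
# Hypothetical mapping tables (ASCII key -> Gurmukhi character displayed).
# These are illustrative assumptions, not the tables of any real font.
FONT_A = {"a": "ੳ", "b": "ਬ", "k": "ਕ", "s": "ਸ"}
FONT_B = {"a": "ਅ", "b": "ਬ", "k": "ਕ", "s": "ਮ"}


def render(stored, font):
    """Return the characters a font displays for the stored ASCII codes."""
    return [font.get(ch, "?") for ch in stored]


def information_loss(stored, source_font, target_font):
    """Percentage of positions whose displayed character changes
    when target_font is substituted for source_font."""
    intended = render(stored, source_font)
    displayed = render(stored, target_font)
    if not intended:
        return 0.0
    mismatches = sum(1 for i, d in zip(intended, displayed) if i != d)
    return 100.0 * mismatches / len(intended)


stored = "kabs"                      # text typed with FONT_A
loss = information_loss(stored, FONT_A, FONT_B)
accuracy = 100.0 - loss              # share of characters that survive
print(f"loss={loss:.1f}%  accuracy={accuracy:.1f}%")
```

    In this toy case two of the four stored codes map to different characters in the two fonts, so half of the text is displayed incorrectly after substitution. A weighted kappa statistic, as named in the abstract, would additionally require an agreement matrix between intended and displayed characters and is not sketched here.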
  • Keywords
    digital libraries; document handling; statistical analysis; ASCII character; Gurmukhi font substitution; Gurmukhi fonts perspective; Gurmukhi script; India digital libraries; Punjabi language; digital Indic script representation; digital documents; information loss; mapping scheme; weighted kappa statistics; Accuracy; Libraries; Loss measurement; Market research; Matrices; Reliability; Standards; Digital; Fonts; Gurmukhi; Preservation; Substitution;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Emerging Trends and Technologies in Libraries and Information Services (ETTLIS), 2015 4th International Symposium on
  • Conference_Location
    Noida
  • Print_ISBN
    978-1-4799-7999-8
  • Type

    conf

  • DOI
    10.1109/ETTLIS.2015.7048182
  • Filename
    7048182