• DocumentCode
    555145
  • Title

    Quantitative evaluation of writing styles based on text analysis: Methods and case study

  • Author

    Jingmei Zhang ; Guangzhou Zeng ; Jingxiang Zhang

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
  • Volume
    1
  • fYear
    2011
  • fDate
    20-22 Aug. 2011
  • Firstpage
    181
  • Lastpage
    185
  • Abstract
    This paper introduces mathematical metrics that index the writing style of literary works, based on a series of text classification techniques. Four Chinese translations of Maupassant's classic Boule de Suif (Ball of Fat) serve as a case study to illustrate the inherent popularity, conformity, and unique stylistic choices of each translator's language. Character frequency entropy (CFE), developed from a modified Zipf-Mandelbrot principle, is used to evaluate inherent popularity. The diction of phrasal materials and their clustering indices are then scrutinized with a critical Chinese Word Segmentation (CWS) parser to evaluate each writer's conformity to conventional language. Sentence length and dispersion are calculated to reveal a habit of loose or compact syntax. The full analysis of the sample texts, from zi (character) and ci (word) to ju (sentence), gives a panorama of the linguistic style of the translators involved.
  • Keywords
    natural language processing; pattern classification; text analysis; word processing; Ball of Fat; Character frequency entropy; Chinese translation versions; Chinese word segmentation; Maupassant's Boule de Suif; character analysis; conventional language; linguistic style; mathematical metrics; modified Zipf-Mandelbrot principle; phrasal materials; quantitative evaluation; sentence analysis; text classification techniques; translation language; word analysis; writer's conformity evaluation; writing styles; Educational institutions; Entropy; Materials; Testing; Text categorization; Writing; Chinese Word Segmentation; Sentence indexing; Text analysis; Word frequency entropy; Writing style
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2011 6th IEEE Joint International Information Technology and Artificial Intelligence Conference (ITAIC)
  • Conference_Location
    Chongqing
  • Print_ISBN
    978-1-4244-8622-9
  • Type

    conf

  • DOI
    10.1109/ITAIC.2011.6030181
  • Filename
    6030181
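
The abstract describes a character-level entropy measure and sentence length and dispersion statistics. As a rough illustration only, the sketch below computes plain Shannon entropy over character frequencies, plus the mean and population standard deviation of sentence lengths. This is not the paper's CFE, which is stated to derive from a modified Zipf-Mandelbrot principle that this record does not spell out. The function names, the set of sentence terminators, and the choice of standard deviation as the dispersion measure are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def char_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the character distribution.

    Assumption: a plain Shannon entropy stands in for the paper's CFE.
    Every non-whitespace character is counted, punctuation included.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    counts = Counter(chars)
    total = len(chars)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def sentence_lengths(text: str) -> list[int]:
    """Length of each sentence in non-whitespace characters.

    Assumption: sentences end at common Chinese or Western terminators.
    """
    parts = re.split(r"[。！？!?.]+", text)
    return [len(re.sub(r"\s", "", p)) for p in parts if p.strip()]


def length_stats(text: str) -> tuple[float, float]:
    """Mean and population standard deviation of sentence lengths."""
    lengths = sentence_lengths(text)
    if not lengths:
        return 0.0, 0.0
    return mean(lengths), pstdev(lengths)


if __name__ == "__main__":
    sample = "今天下雨。我们不出门！好吗？"
    print(char_entropy(sample))
    print(sentence_lengths(sample))
    print(length_stats(sample))
```

Under these assumptions, a larger standard deviation would indicate more varied sentence lengths, and a higher entropy would indicate a flatter character distribution. The word-level (ci) analysis would additionally need a Chinese word segmenter, which is left out here.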