• DocumentCode
    3717315
  • Title
    The coding of literary form: Data mining and the information structure of historical texts
  • Author
    Dallas Liddle
  • Author_Institution
    Department of English, Augsburg College, Minneapolis, Minnesota
  • fYear
    2015
  • Firstpage
    1661
  • Lastpage
    1666
  • Abstract
    This working paper argues that many data-mining projects in the humanities limit themselves by choosing words as their default unit of analysis. Some authors, problems, and forms are better illuminated by analysis of individual textual symbols, others by examination of multiword constructions. Insights about the nature of code from mathematical information theory, long but perhaps prematurely rejected by humanists on theoretical grounds, may give researchers less subjective and more powerful tools by which to measure the information characteristics of texts and the innovations of specific historical writers.
  • Keywords
    "Computers","Volume measurement","Databases","Scholarships","Big data","Data mining","Information theory"
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/BigData.2015.7363936
  • Filename
    7363936