• Title of article

    Computational methods in authorship attribution

  • Author/Authors

    Moshe Koppel, Jonathan Schler, Shlomo Argamon

  • Issue Information
    Monthly, serial-numbered issue, year 2009
  • Pages
    18
  • From page
    9
  • To page
    26
  • Abstract
    Statistical authorship attribution has a long history, culminating in the use of modern machine learning classification methods. Nevertheless, most of this work suffers from the limitation of assuming a small closed set of candidate authors and essentially unlimited training text for each. Real-life authorship attribution problems, however, typically fall short of this ideal. Thus, following detailed discussion of previous work, three scenarios are considered here for which solutions to the basic attribution problem are inadequate. In the first variant, the profiling problem, there is no candidate set at all; in this case, the challenge is to provide as much demographic or psychological information as possible about the author. In the second variant, the needle-in-a-haystack problem, there are many thousands of candidates for each of whom we might have a very limited writing sample. In the third variant, the verification problem, there is no closed candidate set but there is one suspect; in this case, the challenge is to determine if the suspect is or is not the author. For each variant, it is shown how machine learning methods can be adapted to handle the special challenges of that variant.
  • Journal title
    Journal of the American Society for Information Science and Technology
  • Serial Year
    2009
  • Record number

    993885