• DocumentCode
    3740157
  • Title
    Extraction and Analysis of Web Interviews
  • Author
    Philipp Berger;Patrick Hennig;Johannes Eschrig;Daniel Roeder;Christoph Meinel
  • Author_Institution
    Hasso-Plattner-Inst., Univ. of Potsdam, Potsdam, Germany
  • Volume
    1
  • fYear
    2015
  • Firstpage
    499
  • Lastpage
    504
  • Abstract
    The number of newspaper and blog articles keeps growing, and the analysis of this unstructured data is gaining importance both in research and in business. We focus on a special kind of article: interviews. In contrast to regular articles, interviews involve two or more speakers with different viewpoints. We propose a semi-supervised approach to detect web pages containing interviews. Our experiments show a high F-measure of 77.3% on a manually annotated test set. To apply text and author analysis approaches, one needs to separate the excerpts of the different authors and recognize their names. We present a method for extracting speakers and their corresponding text excerpts, e.g., questions and answers. Based on the extracted text structure, we introduce first measures to understand the interviewer and interviewee.
  • Keywords
    "Interviews","Feature extraction","Atmospheric measurements","Particle measurements","HTML","Blogs"
  • Publisher
    ieee
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 IEEE / WIC / ACM International Conference on
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2015.101
  • Filename
    7396854