• DocumentCode
    2330290
  • Title
    Collection of user judgments on spoken dialog system with crowdsourcing
  • Author
    Yang, Zhaojun ; Li, Baichuan ; Zhu, Yi ; King, Irwin ; Levow, Gina ; Meng, Helen
  • Author_Institution
    Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China
  • fYear
    2010
  • fDate
    12-15 Dec. 2010
  • Firstpage
    277
  • Lastpage
    282
  • Abstract
    This paper presents an initial attempt at using crowdsourcing to collect user judgments on spoken dialog systems (SDSs). This is implemented on Amazon Mechanical Turk (MTurk), where a Requester can design a human intelligence task (HIT) to be performed by a large number of Workers efficiently and cost-effectively. We describe a design methodology for two types of HITs - the first targets fast rating of a large number of dialogs regarding selected dimensions of the SDS's performance, and the second aims to assess the reliability of Workers on MTurk through the variability in ratings across different Workers. A set of approval rules is also designed to control the quality of ratings from MTurk. In total, user judgments for about 8,000 dialogs, rated by around 700 Workers, were collected over 45 days. We observe reasonable consistency between the manual MTurk ratings and an automatic categorization of dialogs in terms of task completion, which partially verifies the reliability of the approved ratings from MTurk. From the second type of HITs, we also observe moderate inter-rater agreement for task-completion ratings, which supports the use of MTurk as a judgment collection platform. Further research on SDS evaluation models could be developed based on the collected corpus.
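    As an illustrative aside (not the paper's procedure; the abstract does not specify which agreement statistic was used), the inter-rater agreement mentioned above could be measured with Fleiss' kappa over the Workers' categorical ratings. The sketch below assumes each dialog is rated by the same number of Workers on a fixed label set; all function names and data are hypothetical.

    # Illustrative sketch: Fleiss' kappa over crowd-sourced task-completion
    # ratings. Assumes every dialog receives ratings from the same number
    # of Workers; labels and data below are made up for demonstration.
    from collections import Counter

    def fleiss_kappa(ratings, categories):
        """ratings: list of per-dialog rating lists (all of equal length);
        categories: all possible rating labels."""
        n = len(ratings[0])              # Workers per dialog
        N = len(ratings)                 # number of dialogs
        cats = list(categories)

        # n_ij: how many Workers assigned dialog i to category j
        counts = [[Counter(r)[c] for c in cats] for r in ratings]

        # Per-dialog observed agreement P_i and its mean P_bar
        P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
        P_bar = sum(P_i) / N

        # Chance agreement P_e from marginal category proportions
        p_j = [sum(row[j] for row in counts) / (N * n) for j in range(len(cats))]
        P_e = sum(p * p for p in p_j)

        return (P_bar - P_e) / (1 - P_e)

    # Example: four dialogs, each judged by three Workers for task completion
    ratings = [["done", "done", "done"],
               ["done", "not_done", "done"],
               ["not_done", "not_done", "not_done"],
               ["done", "not_done", "not_done"]]
    print(fleiss_kappa(ratings, ["done", "not_done"]))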
  • Keywords
    interactive systems; pattern classification; user interfaces; Amazon Mechanical Turk; MTurk judgment collection platform; crowdsourcing; human intelligence task; spoken dialog system; user judgment; Let's Go
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2010 IEEE
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-7904-7
  • Electronic_ISBN
    978-1-4244-7902-3
  • Type
    conf
  • DOI
    10.1109/SLT.2010.5700864
  • Filename
    5700864