• DocumentCode
    259566
  • Title

    Reducing the Cost of Breaking Audio CAPTCHAs by Active and Semi-supervised Learning

  • Author

    Darnstadt, Malte ; Meutzner, Hendrik ; Kolossa, Dorothea

  • Author_Institution
    Fac. of Math., Ruhr-Univ. Bochum, Bochum, Germany
  • fYear
    2014
  • fDate
    3-6 Dec. 2014
  • Firstpage
    67
  • Lastpage
    73
  • Abstract
    CAPTCHAs are challenge-response tests that are widely used on the Internet to distinguish human users from machines. In addition to the well-known visual CAPTCHAs, most Internet services also provide an audio-based scheme, e.g., to enable access for visually impaired users. Recent research has shown that most CAPTCHAs are vulnerable as they can be broken by machine learning techniques. However, such automated attacks come at a relatively high cost as they require human experts to create labels for the unlabeled CAPTCHA samples collected from a website in order to train an attacking system. In this work we utilize active and semi-supervised learning methods for breaking audio CAPTCHAs. We show that these methods can reduce the labeling costs considerably, resulting in an increased vulnerability of audio CAPTCHAs as automated attacks are rendered even more worthwhile. In addition, our findings give insight into improvements to the design of CAPTCHAs, helping to harden prospective audio CAPTCHA schemes against active learning attacks in the future.
  • Keywords
    authorisation; cost reduction; learning (artificial intelligence); speech recognition; Internet services; Web sites; active learning attacks; attacking system; audio CAPTCHA breaking; audio-based scheme; automated attacks; challenge-response tests; label creation; labeling cost reduction; machine learning techniques; prospective audio CAPTCHA schemes; semisupervised learning; unlabeled CAPTCHA; visually impaired users; CAPTCHAs; Error analysis; Hidden Markov models; Noise; Semisupervised learning; Speech; Viterbi algorithm; active learning; audio CAPTCHA; automatic speech recognition; semi-supervised learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2014 13th International Conference on
  • Conference_Location
    Detroit, MI
  • Type

    conf

  • DOI
    10.1109/ICMLA.2014.16
  • Filename
    7033093