• DocumentCode
    2119189
  • Title

    Malicious URL Detection Based on Kolmogorov Complexity Estimation

  • Author

    Hsing-Kuo Pao ; Yan-Lin Chou ; Yuh-Jye Lee

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
  • Volume
    1
  • fYear
    2012
  • fDate
    4-7 Dec. 2012
  • Firstpage
    380
  • Lastpage
    387
  • Abstract
    Malicious URL detection has drawn significant research attention in recent years. It is helpful if we can use the URL string alone to make a preliminary judgment about how dangerous a Web site is. Doing so saves the effort of Web site content analysis and the bandwidth needed for content retrieval. We propose a detection method based on an estimate of the conditional Kolmogorov complexity of URL strings. To overcome the incomputability of Kolmogorov complexity, we approximate it with a compression method and call the result the conditional Kolmogorov measure. As a single feature for detection, it achieves a decent performance that cannot be achieved by any other single feature that we know of. Moreover, the proposed Kolmogorov measure can work together with other features for successful detection. The experiments were conducted on a private dataset from a commercial company that collects more than one million unclassified URLs in a typical hour. On average, the proposed measure can process such hourly data within a few minutes.
  • Keywords
    Web sites; computational complexity; data compression; information retrieval; invasive software; Kolmogorov complexity incomputability; Web site content analysis; commercial company; compression method; conditional Kolmogorov complexity estimation; conditional Kolmogorov measure; content retrieval bandwidth; dangerous Web sites; malicious URL detection; private dataset; unclassified URL string; Kolmogorov complexity; blacklist; compression; entropy; malicious URL;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
  • Conference_Location
    Macau
  • Print_ISBN
    978-1-4673-6057-9
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2012.258
  • Filename
    6511912
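
The abstract describes approximating the conditional Kolmogorov complexity of a URL with a compressor. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes zlib as the compressor and the common approximation K(x | D) ≈ C(D + x) - C(D), where C is the compressed size in bytes. The function names, the reference lists, and the scoring rule (benign measure minus malicious measure) are all hypothetical choices made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the zlib-compressed size of data in bytes (assumed compressor)."""
    return len(zlib.compress(data, 9))


def conditional_measure(url: str, reference: list[str]) -> int:
    """Approximate K(url | reference) as C(reference + url) - C(reference).

    A small value means the URL adds little new information once the
    reference URLs are known, i.e. it resembles them.
    """
    base = "\n".join(reference).encode("utf-8") + b"\n"
    return compressed_size(base + url.encode("utf-8")) - compressed_size(base)


def malicious_score(url: str, benign: list[str], malicious: list[str]) -> int:
    """Hypothetical score: positive when the URL compresses better against
    the malicious reference set than against the benign one."""
    return conditional_measure(url, benign) - conditional_measure(url, malicious)
```

A caller would pass a candidate URL together with two lists of known benign and known malicious URLs, then threshold the score or feed it to a classifier alongside other features. Note that zlib only looks back 32 KB, so a sketch like this only makes sense for small reference sets; a large-scale deployment would need a different compressor or a sampling strategy.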