• DocumentCode
    743527
  • Title
    AnDarwin: Scalable Detection of Android Application Clones Based on Semantics
  • Author
    Crussell, Jonathan ; Gibler, Clint ; Chen, Hao
  • Author_Institution
    Comput. Sci., Univ. of California, Davis, Davis, CA, USA
  • Volume
    14
  • Issue
    10
  • fYear
    2015
  • Firstpage
    2007
  • Lastpage
    2019
  • Abstract
    Smartphones rely on their vibrant application markets; however, plagiarism threatens the long-term health of these markets. We present a scalable approach to detecting similar Android apps based on their semantic information. We implement our approach in a tool called AnDarwin and evaluate it on 265,359 apps collected from 17 markets including Google Play and numerous third-party markets. In contrast to earlier approaches, AnDarwin has four advantages: it avoids comparing apps pairwise, thus greatly improving its scalability; it analyzes only the app code and does not rely on other information (such as the app's market, signature, or description), thus greatly increasing its reliability; it can detect both full and partial app similarity; and it can automatically detect library code and remove it from the similarity analysis. We present two use cases for AnDarwin: finding similar apps by different developers (“clones”) and similar apps from the same developer (“rebranded”). In 10 hours, AnDarwin detected at least 4,295 apps that are the victims of cloning and 36,106 rebranded apps. Additionally, AnDarwin detects similar code that is injected into many apps, which may indicate the spread of malware. Our evaluation demonstrates AnDarwin's ability to accurately detect similar apps on a large scale.
  • Keywords
    Android (operating system); mobile computing; program diagnostics; programming language semantics; AnDarwin; Android application clones detection; Google Play; app code; library code detection; malware; semantic information; third-party markets; Cloning; Feature extraction; Libraries; Malware; Semantics; Smart phones; Vectors; Program analysis; clustering; mobile applications; plagiarism detection
  • fLanguage
    English
  • Journal_Title
    Mobile Computing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1536-1233
  • Type
    jour
  • DOI
    10.1109/TMC.2014.2381212
  • Filename
    6985631
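
Note on the abstract: two of the listed advantages (avoiding pairwise comparison and removing library code) can be illustrated with a small sketch. This is not AnDarwin's algorithm, which this record does not describe; it is a minimal illustration that assumes each app has already been reduced to a set of feature identifiers, and it assumes a feature shared by many apps is library code. The function name, the `library_threshold` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict


def group_by_shared_features(apps, library_threshold):
    """Group apps by shared features without comparing apps pairwise.

    apps: dict mapping an app name to a set of feature ids (hypothetical input).
    library_threshold: a feature found in at least this many apps is
        treated as library code (an assumption, not taken from the paper).
    """
    # Inverted index: feature id -> set of apps containing that feature.
    index = defaultdict(set)
    for app, features in apps.items():
        for feature in features:
            index[feature].add(app)

    # Assumed heuristic: very common features are library code.
    library = {f for f, owners in index.items()
               if len(owners) >= library_threshold}

    # Candidate groups are read from the index buckets, so the work is
    # proportional to the number of features, not the number of app pairs.
    groups = {f: owners for f, owners in index.items()
              if f not in library and len(owners) > 1}
    return library, groups


if __name__ == "__main__":
    sample = {"a": {1, 2, 9}, "b": {1, 2, 9}, "c": {3, 9}, "d": {4, 9}}
    library, groups = group_by_shared_features(sample, library_threshold=4)
    print(library, groups)
```

In this sketch, apps that share all of their non-library features would correspond to full similarity, and apps that share only some of them to partial similarity.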