• DocumentCode
    3222624
  • Title
    Balance sheet outlier detection using a graph similarity algorithm
  • Author
    Yang, Songping ; Cogill, Randy
  • Author_Institution
    Financial Eng. Program in Sch. of Syst. & Enterprises, Stevens Inst. of Technol., Hoboken, NJ, USA
  • fYear
    2013
  • fDate
    16-19 April 2013
  • Firstpage
    135
  • Lastpage
    142
  • Abstract
    Graph similarity measurement has been used in many applications, such as computational biology, text mining, pattern recognition, and computer vision. In this paper, we apply graph similarity measurement to quantify structural differences between financial statements. An unconventional financial statement structure may reveal a deceptive intent to hide certain information while keeping the statement technically “correct”. Unconventional financial statements may also point to investment opportunities when their legitimacy is not in question. We construct an algorithm that uses string edit distance as an approximation of graph similarity, and we compute that distance with the Levenshtein algorithm using modified string edit costs. We demonstrate that the algorithm captures subtle changes in balance sheet structure in two experiments. The first experiment shows that the algorithm is sensitive to all three basic edits (deletion, insertion, and substitution) on a particular balance sheet, and the second experiment shows more than 90% clustering accuracy on real balance sheets.
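    The string edit distance mentioned in the abstract can be illustrated with a minimal sketch of a Levenshtein distance that accepts custom edit costs. This is not the authors' implementation: the record does not state the modified costs or how a balance sheet is turned into a string, so the function name `weighted_edit_distance`, the cost functions, the `TOP_LEVEL` weighting, and the flattened line-item lists below are all hypothetical.

```python
from typing import Callable, Hashable, Sequence


def weighted_edit_distance(
    source: Sequence[Hashable],
    target: Sequence[Hashable],
    insert_cost: Callable[[Hashable], float] = lambda item: 1.0,
    delete_cost: Callable[[Hashable], float] = lambda item: 1.0,
    substitute_cost: Callable[[Hashable, Hashable], float] = (
        lambda a, b: 0.0 if a == b else 1.0
    ),
) -> float:
    """Levenshtein distance with caller-supplied edit costs (illustrative only)."""
    # prev[j] is the cheapest cost of turning the source prefix processed
    # so far into target[:j]; start with the empty source prefix.
    prev = [0.0]
    for item in target:
        prev.append(prev[-1] + insert_cost(item))

    for s in source:
        # Turning a non-empty source prefix into an empty target: delete all.
        curr = [prev[0] + delete_cost(s)]
        for j, t in enumerate(target, start=1):
            curr.append(
                min(
                    prev[j] + delete_cost(s),             # drop s
                    curr[j - 1] + insert_cost(t),         # add t
                    prev[j - 1] + substitute_cost(s, t),  # replace s with t
                )
            )
        prev = curr
    return prev[-1]


# Hypothetical flattened balance sheets: one token per line item.
reference = ["Assets", "CurrentAssets", "Cash", "Liabilities", "Equity"]
candidate = ["Assets", "CurrentAssets", "Liabilities", "Equity"]

# Assumed weighting: removing a top-level section costs more than a leaf item.
TOP_LEVEL = {"Assets", "Liabilities", "Equity"}


def section_aware_delete(item: Hashable) -> float:
    return 2.0 if item in TOP_LEVEL else 1.0


print(weighted_edit_distance(reference, candidate))
print(weighted_edit_distance(reference, candidate, delete_cost=section_aware_delete))
```

    With unit costs the function reduces to the classic Levenshtein distance. A pairwise distance matrix built this way could then feed a clustering step, such as the hierarchical clustering named in the keywords.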
  • Keywords
    financial data processing; graph theory; pattern clustering; text analysis; Levenshtein algorithm; balance sheet outlier detection; balance sheet structure; clustering accuracy; financial statement structure; graph similarity algorithm; graph similarity measurement; information hiding; string edit cost; string edit distance; structural difference measurement; technically correct financial statement; Approximation algorithms; Approximation methods; Companies; Heuristic algorithms; Industries; Measurement; Power systems; Balance sheet; Graph similarity metric; Hierarchical clustering; Outliers detection; String edit distance; XBRL;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence for Financial Engineering & Economics (CIFEr), 2013 IEEE Conference on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/CIFEr.2013.6611709
  • Filename
    6611709