• DocumentCode
    22580
  • Title
    Partitioning Biological Networks into Highly Connected Clusters with Maximum Edge Coverage
  • Author
    Hüffner, Falk ; Komusiewicz, Christian ; Liebtrau, Adrian ; Niedermeier, Rolf
  • Author_Institution
    Inst. für Softwaretech. und Theor. Inf., Tech. Univ. Berlin, Berlin, Germany
  • Volume
    11
  • Issue
    3
  • fYear
    2014
  • fDate
    May-June 2014
  • Firstpage
    455
  • Lastpage
    467
  • Abstract
    A popular clustering algorithm for biological networks which was proposed by Hartuv and Shamir identifies nonoverlapping highly connected components. We extend the approach taken by this algorithm by introducing the combinatorial optimization problem Highly Connected Deletion, which asks for removing as few edges as possible from a graph such that the resulting graph consists of highly connected components. We show that Highly Connected Deletion is NP-hard and provide a fixed-parameter algorithm and a kernelization. We propose exact and heuristic solution strategies, based on polynomial-time data reduction rules and integer linear programming with column generation. The data reduction typically identifies 75 percent of the edges that are deleted for an optimal solution; the column generation method can then optimally solve protein interaction networks with up to 6,000 vertices and 13,500 edges within five hours. Additionally, we present a new heuristic that finds more clusters than the method by Hartuv and Shamir.
  • Keywords
    biochemistry; biology computing; data reduction; graphs; heuristic programming; integer programming; linear programming; molecular clusters; optimisation; pattern clustering; polynomials; Hartuv method; Shamir method; clustering algorithm; column generation; column generation method; combinatorial optimization problem HIGHLY CONNECTED DELETION; fixed-parameter algorithm; graph; heuristic solution strategies; highly connected clusters; integer linear programming; kernelization; maximum edge coverage; nonoverlapping highly connected components; partitioning biological networks; polynomial-time data reduction rules; protein interaction networks; time 5 h; vertices; Algorithm design and analysis; Clustering algorithms; Kernel; Partitioning algorithms; Proteins; Transforms; Cluster analysis; PPI networks; data reduction; fixed-parameter tractability; heuristics; integer linear programming;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2013.177
  • Filename
    6682910