Graph partitioning has been one of fundamental problems in graph mining, and especially enforcing the partition to have balanced sizes has been challenging.
We propose a novel approach for graph partitioning.
First, we finds a large subgraph with high quality partitions, in terms of conductance. Despite the difficulty of the task for the whole graph, we observe that there is a large connected subgraph whose partition quality is much better than the original. Our proposed method MTP (Minus Top-k Partition) finds such a subgraph by removing hub nodes with large degrees, and taking the remaining giant connected component.
Second, we discover a global balanced partition by extending MTP to gbMTP (Global Balanced MTP). gbMTP assigns the excluded nodes in MTP to the partition found by MTP in a greedy way.
Our results are summarized as follows:
Source | Nodes | Edges | Description | Source | Nodes | Edges | Description |
---|---|---|---|---|---|---|---|
Advogato | 5,054 | 49,821 | Trust network | Slashdot | 51,083 | 116,573 | Reply network |
Oregon | 11,461 | 32,730 | Router connection | Epinions | 75,877 | 405,739 | Trust network |
CondMat | 21,363 | 91,286 | Collaboration network | Wordnet | 142,505 | 642,207 | Word association network |
Donations | 23,033 | 877,625 | Who donated whom | Gowalla | 196,591 | 950,327 | Online social network |
Enron | 33,695 | 91,286 | Enron email data | Amazon | 334,863 | 925,872 | Co-purchasing network |
Cit-HepPh | 34,401 | 420,784 | Citation network | Flickr | 404,733 | 2,111,078 | Social network in Flickr |