Given a time-evolving graph, how can we track similarity between nodes in a fast and accurate way, with theoretical guarantees on the convergence and the error? Random Walk with Restart (RWR) is a popular measure to estimate the similarity between nodes and has been exploited in numerous applications. Many real-world graphs are dynamic with frequent insertion/deletion of edges; thus, tracking RWR scores on dynamic graphs in an efficient way has aroused much interest among data mining researchers. Recently, dynamic RWR models based on the propagation of scores across a given graph have been proposed, and have succeeded in outperforming previous other approaches to compute RWR dynamically. However, those models fail to guarantee exactness and convergence time for updating RWR in a generalized form.
In this paper, we propose OSP, a fast and accurate algorithm for computing dynamic RWR with insertion/deletion of nodes/edges in a directed/undirected graph. When the graph is updated, OSP first calculates o set scores around the modi ed edges, propagates the offset scores across the updated graph, and then merges them with the current RWR scores to get updated RWR scores. We prove the exactness of OSP and introduce OSP-T, a version of OSP which regulates a trade-o between accuracy and computation time by using error tolerance ε. Given restart probability c, OSP-T guarantees to return RWR scores with O(ε/c) error in O(log(ε/2)/log(1−c))iterations. Through extensive experiments, we show that OSP tracks RWR exactly up to 4605× faster than existing static RWR method on dynamic graphs, and OSP-T requires up to 15× less time with 730× lower L1 norm error and 3.3× lower rank error than other state-of-the-art dynamic RWR methods.
OSP is described in the following paper:
The code of the methods used in the paper is available here.
|WikiLink||12,150,976||378,142,420||Wiki-links of the English Wikipedia||Wikimedia||Link|
|Orkut||3,072,441||117,184,899||Social network of Orkut users and their connections||The Max Plank Institute for Software Systems||Link|
|LiveJournal||4,847,571||68,475,391||LiveJournal friendship network||Stanford Large Network Dataset Collection||Link|
|Berkstan||685,230||7,600,595||Web graph of Berkeley and Stanford||SNAP||Link|
|DBLP||317,080||1,049,866||Co-authorship network of the DBLP computer science bibliography||SNAP||Link|
|Slashdot||82,144||549,202||Social network of users in the technology news site Slashdot||DAI-Labor||Link|