BePI: Fast and Memory-Efficient Method for Billion-Scale Random Walk with Restart

Overview

How can we measure similarity between nodes quickly and accurately on large graphs? Random walk with restart (RWR) provides a good measure, and has been used in various data mining applications including ranking, recommendation, link prediction and community detection. However, existing methods for computing RWR do not scale to large graphs containing billions of edges; iterative methods are slow in query time, and preprocessing methods require too much memory.

In this paper, we propose BePI, a scalable, accurate, and fast method for computing RWR on billion-scale graphs. BePI exploits the best properties from both preprocessing methods and iterative methods. BePI uses a block elimination approach, which is a preprocessing method, to enable fast query time. Also, BePI uses a preconditioned iterative method to decrease memory requirement. The performance of BePI is further improved by decreasing non-zeros of the matrix for the iterative method. Through extensive experiments, we show that BePI processes 100× larger graphs, and requires up to 130× less memory space than other preprocessing methods. In the query phase, BePI computes RWR scores up to 9× faster than existing methods.


Paper

BePI is described in the following paper:


Code

The code of the methods used in the paper is available here.


Datasets

Name#Nodes#EdgesDescriptionSourceDownload
Slashdot79K516K Social network of users in the technology news site Slashdot DAI-Labor Link
Wikipedia100K1.6M Hyperlink network between articles of the English Wikipedia KONECT (The Koblenz Network Collection) Link
Baidu416K3.3M Hyperlink network between articles of the Chinese online encyclopedia Baidu Zhishi.me Link
Flickr2.3M33.1M Flickr friendship network Max Planck Institute for Software Systems Link
LiveJournal4.8M68.5M LiveJournal friendship network Stanford Large Network Dataset Collection Link
WikiLink11.2M340.2M Wiki-links of the English Wikipedia Wikimedia Link
Twitter41.7M1.5B Twitter follower network Advanced Networking Lab at Kaist Link
Friendster68.3M2.6B Friendster friendship network Internet Archive Link