FAQ

Using PegasusN

Q. How do I prepare a graph?

A. PegasusN works on graphs stored as TAB-separated plain text. Each line represents one edge and contains the indices of its two endpoint nodes, separated by a single TAB character. Node indices start from 0. In the example below, the first line is an edge between node 0 and node 1, and the second line is an edge between node 0 and node 2.

0   1
0   2
0   3
0   4
1   4
2   3
2   4
4   5
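
If your edges are held in memory rather than in a file, the format described above can be produced with a few lines of Python. This is only a minimal sketch: the function names and the output file name graph.tsv are illustrative assumptions and not part of PegasusN.

```python
def format_edge_list(edges):
    """Return the edges as text: one 'src<TAB>dst' pair per line."""
    lines = []
    for src, dst in edges:
        # Node indices start from 0, so negative values are rejected.
        if src < 0 or dst < 0:
            raise ValueError(f"negative node index in edge ({src}, {dst})")
        lines.append(f"{src}\t{dst}\n")
    return "".join(lines)


def write_edge_list(edges, path):
    """Write the formatted edge list to a plain text file."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(format_edge_list(edges))


if __name__ == "__main__":
    # The same eight edges as the example above.
    edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 4), (2, 3), (2, 4), (4, 5)]
    write_edge_list(edges, "graph.tsv")  # hypothetical output file name
```

Writing the separator explicitly as "\t" avoids the common mistake of an editor silently replacing TABs with spaces.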

Q. How do I set up a Hadoop or Spark cluster?

A. In most cases you will use a Hadoop or Spark cluster that someone else has already set up, so you do not need to set one up yourself. If you do need your own cluster, the following documents are useful:

Hadoop Cluster Setup
Spark Documentation
Cloudera Hadoop Distribution



About PegasusN

Q. What are the main advantages of PegasusN?

A. The most important advantage of PegasusN is scalability. PegasusN provides graph mining algorithms for peta-scale graphs, which are several orders of magnitude larger than previous work can handle. Other advantages include the ability to analyze graph data on both Hadoop and Spark clusters.