Wednesday, August 7, 2013

[Research] Installing PEGASUS 2.0 (CentOS 6.4 x64)

Official website
http://www.cs.cmu.edu/~pegasus/

Requirements
- Hadoop 0.20.1 or greater from http://hadoop.apache.org/
- Apache Ant 1.7.0 or greater from http://ant.apache.org/
- Java 1.6.x or greater, preferably from Sun
- Python 2.4.x or greater
- Gnuplot 4.2.x or greater
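
Before installing anything, it can help to see which of the tools listed above are already on the PATH. The following is a minimal sketch, not part of PEGASUS: the function name check_tools is hypothetical, and the command names (hadoop, ant, java, python, gnuplot) are assumed to be the usual binary names. It only checks presence, not version numbers.

```shell
#!/bin/sh
# Report which of the given command-line tools are missing from PATH.
# check_tools is a hypothetical helper; it does not verify versions.
check_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Print the missing tools (empty line if none), without the leading space.
    echo "${missing# }"
}

# Assumed binary names for the prerequisites listed above.
missing_tools=$(check_tools hadoop ant java python gnuplot)
if [ -n "$missing_tools" ]; then
    echo "Missing: $missing_tools"
else
    echo "All required tools found."
fi
```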


Installation reference
http://www.cs.cmu.edu/~pegasus/getting%20started.htm

Installation

# Install Gnuplot 4.2.6, Python 2.6.6, and Apache Ant 1.7.1
yum -y install gnuplot python ant

# Install Oracle Java JDK 1.7.x
rpm -ivh jdk-7u25-linux-x64.rpm

# Install Apache Hadoop; see this post:
# [Research] Installing Hadoop 1.2.1 (CentOS 6.4 x64)
# http://shaurong.blogspot.tw/2013/07/hadoop-112-centos-64-x64.html

# Make the hadoop command runnable from any directory
export PATH=$PATH:/home/hadoop/hadoop-1.2.1/bin

# Install PEGASUS 2.0

wget http://www.cs.cmu.edu/~pegasus/PEGASUSH-2.0.tar.gz
tar zxvf PEGASUSH-2.0.tar.gz -C /usr/local/
export PATH=$PATH:/usr/local/PEGASUS
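
The two export lines above only affect the current shell session. One way to keep the settings, sketched here under the assumption that the login shell is bash and that the lines go into ~/.bashrc, is a small helper that appends a directory only when it is not already on the PATH. The name path_append is hypothetical and is not part of Hadoop or PEGASUS.

```shell
# Append a directory to PATH only if it is not already present.
# path_append is a hypothetical helper name.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

# Same directories as in the export lines above.
path_append /home/hadoop/hadoop-1.2.1/bin
path_append /usr/local/PEGASUS
export PATH
```

Because the helper skips directories that are already listed, sourcing the file repeatedly does not make PATH grow.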

Test run (you must first cd into /usr/local/PEGASUS)
[root@localhost ~]# cd /usr/local/PEGASUS
[root@localhost PEGASUS]# ./pegasus.sh

        PEGASUS: Peta-Scale Graph Mining System
        Version 2.0
        Last modified September 5th 2010

        Authors: U Kang, Duen Horng Chau, and Christos Faloutsos
                 School of Computer Science, Carnegie Mellon University
        Distributed under APL 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

        Type `help` for available commands.
        The PEGASUS user manual is available at http://www.cs.cmu.edu/~pegasus
        Send comments and help requests to <ukang@cs.cmu.edu>.

PEGASUS> help

        add [file or directory] [graph_name]
            upload a local graph file or directory to HDFS
        del [graph_name]
            delete a graph
        list
            list graphs
        compute ['deg' or 'pagerank' or 'rwr' or 'radius' or 'cc'] [graph_name]
            run an algorithm on a graph
        plot ['deg' or 'pagerank' or 'rwr' or 'radius' or 'cc' or 'corr'] [graph_name]
            generate plots
        exit
            exit PEGASUS
        help
            show this screen


PEGASUS> demo
Creating pegasus in HDFS
Creating pegasus/graphs in HDFS
Creating pegasus/graphs/catstar in HDFS
Creating pegasus/graphs/catstar/edge in HDFS
13/07/25 15:50:18 INFO util.NativeCodeLoader: Loaded the native-hadoop library
Graph catstar added.
rmr: cannot remove dd_node_deg: No such file or directory.
rmr: cannot remove dd_deg_count: No such file or directory.

-----===[PEGASUS: A Peta-Scale Graph Mining System]===-----

[PEGASUS] Computing degree distribution. Degree type = InOut

13/07/25 15:50:21 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/07/25 15:50:21 WARN snappy.LoadSnappy: Snappy native library not loaded
13/07/25 15:50:21 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/25 15:50:22 INFO mapred.JobClient: Running job: job_local1964987312_0001
13/07/25 15:50:22 INFO mapred.LocalJobRunner: Waiting for map tasks
13/07/25 15:50:22 INFO mapred.LocalJobRunner: Starting task: attempt_local1964987312_0001_m_000000_0
13/07/25 15:50:22 INFO util.ProcessTree: setsid exited with exit code 0
13/07/25 15:50:22 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@7498cfd6
13/07/25 15:50:22 INFO mapred.MapTask: Processing split: file:/usr/local/PEGASUS/pegasus/graphs/catstar/edge/catepillar_star.edge:0+66
13/07/25 15:50:22 INFO mapred.MapTask: numReduceTasks: 1
13/07/25 15:50:22 INFO mapred.MapTask: io.sort.mb = 100
13/07/25 15:50:23 INFO mapred.MapTask: data buffer = 79691776/99614720
13/07/25 15:50:23 INFO mapred.MapTask: record buffer = 262144/327680
MapPass1 : configure is called. degtype = 3
13/07/25 15:50:23 INFO mapred.MapTask: Starting flush of map output
13/07/25 15:50:23 INFO mapred.MapTask: Finished spill 0
13/07/25 15:50:23 INFO mapred.Task: Task:attempt_local1964987312_0001_m_000000_0 is done. And is in the process of commiting
13/07/25 15:50:23 INFO mapred.LocalJobRunner: file:/usr/local/PEGASUS/pegasus/graphs/catstar/edge/catepillar_star.edge:0+66
13/07/25 15:50:23 INFO mapred.Task: Task 'attempt_local1964987312_0001_m_000000_0' done.
13/07/25 15:50:23 INFO mapred.LocalJobRunner: Finishing task: attempt_local1964987312_0001_m_000000_0
13/07/25 15:50:23 INFO mapred.LocalJobRunner: Map task executor complete.
13/07/25 15:50:23 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@583c9dd8
13/07/25 15:50:23 INFO mapred.LocalJobRunner:
13/07/25 15:50:23 INFO mapred.Merger: Merging 1 sorted segments
13/07/25 15:50:23 INFO mapred.JobClient:  map 100% reduce 0%
13/07/25 15:50:23 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 282 bytes
13/07/25 15:50:23 INFO mapred.LocalJobRunner:
RedPass1 : configure is called. degtype = 3
13/07/25 15:50:23 INFO mapred.Task: Task:attempt_local1964987312_0001_r_000000_0 is done. And is in the process of commiting
13/07/25 15:50:23 INFO mapred.LocalJobRunner:
13/07/25 15:50:23 INFO mapred.Task: Task attempt_local1964987312_0001_r_000000_0 is allowed to commit now
13/07/25 15:50:23 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local1964987312_0001_r_000000_0' to file:/usr/local/PEGASUS/dd_node_deg
13/07/25 15:50:23 INFO mapred.LocalJobRunner: reduce > reduce
13/07/25 15:50:23 INFO mapred.Task: Task 'attempt_local1964987312_0001_r_000000_0' done.
13/07/25 15:50:24 INFO mapred.JobClient:  map 100% reduce 100%
13/07/25 15:50:24 INFO mapred.JobClient: Job complete: job_local1964987312_0001
13/07/25 15:50:24 INFO mapred.JobClient: Counters: 21
13/07/25 15:50:24 INFO mapred.JobClient:   File Input Format Counters
13/07/25 15:50:24 INFO mapred.JobClient:     Bytes Read=78
13/07/25 15:50:24 INFO mapred.JobClient:   File Output Format Counters
13/07/25 15:50:24 INFO mapred.JobClient:     Bytes Written=82
13/07/25 15:50:24 INFO mapred.JobClient:   FileSystemCounters
13/07/25 15:50:24 INFO mapred.JobClient:     FILE_BYTES_READ=523964
13/07/25 15:50:24 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=629870
13/07/25 15:50:24 INFO mapred.JobClient:   Map-Reduce Framework
13/07/25 15:50:24 INFO mapred.JobClient:     Map output materialized bytes=286
13/07/25 15:50:24 INFO mapred.JobClient:     Map input records=14
13/07/25 15:50:24 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/07/25 15:50:24 INFO mapred.JobClient:     Spilled Records=56
13/07/25 15:50:24 INFO mapred.JobClient:     Map output bytes=224
13/07/25 15:50:24 INFO mapred.JobClient:     Total committed heap usage (bytes)=320610304
13/07/25 15:50:24 INFO mapred.JobClient:     CPU time spent (ms)=0
13/07/25 15:50:24 INFO mapred.JobClient:     Map input bytes=66
13/07/25 15:50:24 INFO mapred.JobClient:     SPLIT_RAW_BYTES=125
13/07/25 15:50:24 INFO mapred.JobClient:     Combine input records=0
13/07/25 15:50:24 INFO mapred.JobClient:     Reduce input records=28
13/07/25 15:50:24 INFO mapred.JobClient:     Reduce input groups=16
13/07/25 15:50:24 INFO mapred.JobClient:     Combine output records=0
13/07/25 15:50:24 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/07/25 15:50:24 INFO mapred.JobClient:     Reduce output records=16
13/07/25 15:50:24 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/07/25 15:50:24 INFO mapred.JobClient:     Map output records=28
13/07/25 15:50:24 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/25 15:50:24 INFO mapred.JobClient: Running job: job_local1536447597_0002
13/07/25 15:50:24 INFO mapred.LocalJobRunner: Waiting for map tasks
13/07/25 15:50:24 INFO mapred.LocalJobRunner: Starting task: attempt_local1536447597_0002_m_000000_0
13/07/25 15:50:24 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1b64dd8e
13/07/25 15:50:24 INFO mapred.MapTask: Processing split: file:/usr/local/PEGASUS/dd_node_deg/part-00000:0+70
13/07/25 15:50:24 INFO mapred.MapTask: numReduceTasks: 1
13/07/25 15:50:24 INFO mapred.MapTask: io.sort.mb = 100
13/07/25 15:50:25 INFO mapred.MapTask: data buffer = 79691776/99614720
13/07/25 15:50:25 INFO mapred.MapTask: record buffer = 262144/327680
13/07/25 15:50:25 INFO mapred.MapTask: Starting flush of map output
13/07/25 15:50:25 INFO mapred.MapTask: Finished spill 0
13/07/25 15:50:25 INFO mapred.Task: Task:attempt_local1536447597_0002_m_000000_0 is done. And is in the process of commiting
13/07/25 15:50:25 INFO mapred.LocalJobRunner: file:/usr/local/PEGASUS/dd_node_deg/part-00000:0+70
13/07/25 15:50:25 INFO mapred.Task: Task 'attempt_local1536447597_0002_m_000000_0' done.
13/07/25 15:50:25 INFO mapred.LocalJobRunner: Finishing task: attempt_local1536447597_0002_m_000000_0
13/07/25 15:50:25 INFO mapred.LocalJobRunner: Map task executor complete.
13/07/25 15:50:25 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@604f2d14
13/07/25 15:50:25 INFO mapred.LocalJobRunner:
13/07/25 15:50:25 INFO mapred.Merger: Merging 1 sorted segments
13/07/25 15:50:25 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 32 bytes
13/07/25 15:50:25 INFO mapred.LocalJobRunner:
13/07/25 15:50:25 INFO mapred.Task: Task:attempt_local1536447597_0002_r_000000_0 is done. And is in the process of commiting
13/07/25 15:50:25 INFO mapred.LocalJobRunner:
13/07/25 15:50:25 INFO mapred.Task: Task attempt_local1536447597_0002_r_000000_0 is allowed to commit now
13/07/25 15:50:25 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local1536447597_0002_r_000000_0' to file:/usr/local/PEGASUS/dd_deg_count
13/07/25 15:50:25 INFO mapred.LocalJobRunner: reduce > reduce
13/07/25 15:50:25 INFO mapred.Task: Task 'attempt_local1536447597_0002_r_000000_0' done.
13/07/25 15:50:25 INFO mapred.JobClient:  map 100% reduce 100%
13/07/25 15:50:25 INFO mapred.JobClient: Job complete: job_local1536447597_0002
13/07/25 15:50:25 INFO mapred.JobClient: Counters: 21
13/07/25 15:50:25 INFO mapred.JobClient:   File Input Format Counters
13/07/25 15:50:25 INFO mapred.JobClient:     Bytes Read=82
13/07/25 15:50:25 INFO mapred.JobClient:   File Output Format Counters
13/07/25 15:50:25 INFO mapred.JobClient:     Bytes Written=25
13/07/25 15:50:25 INFO mapred.JobClient:   FileSystemCounters
13/07/25 15:50:25 INFO mapred.JobClient:     FILE_BYTES_READ=1047920
13/07/25 15:50:25 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1259313
13/07/25 15:50:25 INFO mapred.JobClient:   Map-Reduce Framework
13/07/25 15:50:25 INFO mapred.JobClient:     Map output materialized bytes=36
13/07/25 15:50:25 INFO mapred.JobClient:     Map input records=16
13/07/25 15:50:25 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/07/25 15:50:25 INFO mapred.JobClient:     Spilled Records=6
13/07/25 15:50:25 INFO mapred.JobClient:     Map output bytes=128
13/07/25 15:50:25 INFO mapred.JobClient:     Total committed heap usage (bytes)=320610304
13/07/25 15:50:25 INFO mapred.JobClient:     CPU time spent (ms)=0
13/07/25 15:50:25 INFO mapred.JobClient:     Map input bytes=70
13/07/25 15:50:25 INFO mapred.JobClient:     SPLIT_RAW_BYTES=99
13/07/25 15:50:25 INFO mapred.JobClient:     Combine input records=16
13/07/25 15:50:25 INFO mapred.JobClient:     Reduce input records=3
13/07/25 15:50:25 INFO mapred.JobClient:     Reduce input groups=3
13/07/25 15:50:25 INFO mapred.JobClient:     Combine output records=3
13/07/25 15:50:25 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/07/25 15:50:25 INFO mapred.JobClient:     Reduce output records=3
13/07/25 15:50:25 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/07/25 15:50:25 INFO mapred.JobClient:     Map output records=16

[PEGASUS] Degree distribution computed.
[PEGASUS] (NodeId, Degree) is saved in HDFS dd_node_deg, (Degree, Count) is saved in HDFS dd_deg_count

Creating pegasus/graphs/catstar/results in HDFS
Creating pegasus/graphs/catstar/results/deg in HDFS
Creating pegasus/graphs/catstar/results/deg/inout in HDFS
inout-degree distribution plotted in "catstar_deg_inout.eps".
PEGASUS> exit
[root@localhost PEGASUS]#
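
The demo log above shows two MapReduce jobs: the first writes (NodeId, Degree) to dd_node_deg, and the second writes (Degree, Count) to dd_deg_count. In the first job, 14 input records produce 28 map output records, which is consistent with every edge being counted once for each endpoint. The same idea can be sketched locally with awk on a tiny, hypothetical edge list. This is only an illustration and not the PEGASUS implementation; the tab-separated "source, destination" line format is an assumption.

```shell
# A tiny hypothetical star graph: node 0 linked to nodes 1, 2 and 3.
# Assumption: one edge per line, "source<TAB>destination".
edges=$(printf '0\t1\n0\t2\n0\t3\n')

# Pass 1: (node, in+out degree). Each edge counts once for each endpoint.
node_deg=$(printf '%s\n' "$edges" |
    awk -F'\t' '{ deg[$1]++; deg[$2]++ }
                END { for (n in deg) print n "\t" deg[n] }' | sort -n)

# Pass 2: (degree, number of nodes that have that degree).
deg_count=$(printf '%s\n' "$node_deg" |
    awk -F'\t' '{ cnt[$2]++ }
                END { for (d in cnt) print d "\t" cnt[d] }' | sort -n)

printf '%s\n' "$node_deg"
printf '%s\n' "$deg_count"
```

For this star graph the second pass reports three nodes of degree 1 and one node of degree 3, which is the kind of (Degree, Count) table that the plot step turns into a degree distribution chart.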

If you do not cd into /usr/local/PEGASUS first, the run fails as follows:

PEGASUS> demo
Creating pegasus in HDFS
Creating pegasus/graphs in HDFS
Creating pegasus/graphs/catstar in HDFS
Creating pegasus/graphs/catstar/edge in HDFS
Error: catepillar_star.edge is not a regular file or directory.
/usr/local/PEGASUS/pegasus.sh: line 164: ./run_dd.sh: No such file or directory
Creating pegasus/graphs/catstar/results in HDFS
Creating pegasus/graphs/catstar/results/deg in HDFS
Creating pegasus/graphs/catstar/results/deg/inout in HDFS
mv: File dd_node_deg does not exist.
mv: File dd_deg_count does not exist.
cp: cannot stat `pegasus_deg_template.plt': No such file or directory
Error: can't mine inout degree of the graph catstar. Check whether the inout degree is computed, or gnuplot is installed correctly.
PEGASUS>
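
The error output above suggests that pegasus.sh calls its helpers through relative paths (for example ./run_dd.sh), which is why the working directory matters. A small wrapper can take care of the directory change. This is a minimal sketch: PEGASUS_HOME and run_pegasus are hypothetical names, and the default path is the install location used in this post.

```shell
#!/bin/sh
# Launch pegasus.sh from its own directory so that relative paths such as
# ./run_dd.sh resolve. PEGASUS_HOME and run_pegasus are hypothetical names.
PEGASUS_HOME="${PEGASUS_HOME:-/usr/local/PEGASUS}"

run_pegasus() {
    if [ ! -f "$PEGASUS_HOME/pegasus.sh" ]; then
        echo "pegasus.sh not found in $PEGASUS_HOME" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$PEGASUS_HOME" && ./pegasus.sh "$@" )
}

# Usage: call run_pegasus from any directory to start the PEGASUS shell.
```

Using a subshell for the cd means the caller stays in the original directory after PEGASUS exits.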

(End)

[Research] Installing PEGASUS 2.0 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/08/pegasus-20-centos-64-x64.html
http://forum.icst.org.tw/phpbb/viewtopic.php?t=80071
