Saturday, October 19, 2013

[Study] Installing Hadoop 1.2.1 (rpm) (CentOS 6.4 x64)

2013-10-19

Hadoop is a framework for building cloud computing systems. Modeled on the Google File System, it is written in Java and provides HDFS and the MapReduce API.

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.

Official site
http://hadoop.apache.org/

Installation references
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleNodeSetup.html

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

Download
http://apache.cdpa.nsysu.edu.tw/hadoop/common/hadoop-1.2.1/

File: hadoop-1.2.1-1.x86_64.rpm
http://apache.cdpa.nsysu.edu.tw/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-1.x86_64.rpm

Oracle Java Runtime : jre-7u40-linux-x64.rpm
http://www.oracle.com/technetwork/java/javase/downloads/index.html

Please download hadoop-1.2.1-1.x86_64.rpm and jre-7u40-linux-x64.rpm in advance.

I. Preparation

1. Install the basic packages

[root@localhost ~]# yum -y install openssh rsync

Install the Java runtime

[root@localhost ~]# rpm -ivh  jre-7u40-linux-x64.rpm

Check the installation state of the Java-related packages

[root@localhost ~]# find / -name java
/usr/java
/usr/java/jre1.7.0_40/bin/java
/usr/lib/java
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9.x86_64/jre/bin/java
/usr/share/java
/usr/lib64/libreoffice/ure/share/java
/usr/lib64/libreoffice/basis3.4/share/Scripts/java
/usr/bin/java
/etc/java
/etc/alternatives/java
/etc/pki/java
/var/lib/alternatives/java
[root@localhost ~]#

Install Hadoop

[root@localhost ~]# rpm -ivh  hadoop-1.2.1-1.x86_64.rpm

Check which directories Hadoop was installed into

[root@localhost ~]# find / -name hadoop
/usr/include/hadoop
/usr/etc/hadoop
/usr/share/hadoop
/usr/share/doc/hadoop
/usr/bin/hadoop
/etc/hadoop
/var/log/hadoop
/var/run/hadoop
[root@localhost ~]#

Test that the hadoop command runs

[root@localhost ~]# hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  oiv                  apply the offline fsimage viewer to an fsimage
  fetchdt              fetch a delegation token from the NameNode
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  historyserver        run job history servers as a standalone daemon
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  distcp2 <srcurl> <desturl> DistCp version 2
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
[root@localhost ~]#

Show the Hadoop version

[root@localhost ~]# hadoop version
Hadoop 1.2.1
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152
Compiled by mattf on Mon Jul 22 15:27:42 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /usr/share/hadoop/hadoop-core-1.2.1.jar
[root@localhost ~]#

Check the accounts the Hadoop rpm created

[root@localhost ~]# cat /etc/passwd | grep Hadoop
mapred:x:202:123:Hadoop MapReduce:/tmp:/bin/bash
hdfs:x:201:123:Hadoop HDFS:/tmp:/bin/bash
[root@localhost ~]#
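As the transcript shows, the rpm creates two service accounts, mapred and hdfs, both with /tmp as their home directory. As a quick illustration of the /etc/passwd field layout, the two entries can be parsed with awk (the input below is hard-coded from the transcript rather than read from a live /etc/passwd):

```shell
# Parse the two passwd entries shown above: field 1 is the user name,
# field 3 the UID, and field 6 the home directory.
passwd_lines='mapred:x:202:123:Hadoop MapReduce:/tmp:/bin/bash
hdfs:x:201:123:Hadoop HDFS:/tmp:/bin/bash'
echo "$passwd_lines" | awk -F: '{print $1, $3, $6}'
# mapred 202 /tmp
# hdfs 201 /tmp
```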

Find where the Hadoop examples jar is

[root@localhost ~]# find / -name hadoop-examples-1.2.1.jar
/usr/share/hadoop/hadoop-examples-1.2.1.jar

The various Hadoop configuration files

[root@localhost ~]# ls /etc/hadoop
capacity-scheduler.xml      hadoop-policy.xml      slaves
configuration.xsl           hdfs-site.xml          ssl-client.xml.example
core-site.xml               log4j.properties       ssl-server.xml.example
fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg
hadoop-env.sh               mapred-site.xml        task-log4j.properties
hadoop-metrics2.properties  masters
[root@localhost ~]#

Modify three configuration files: core-site.xml, hdfs-site.xml, and mapred-site.xml

[root@localhost ~]# cat /etc/hadoop/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>

[root@localhost ~]# cat /etc/hadoop/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>
[root@localhost ~]# cat /etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>
[root@localhost ~]#


Change them as follows:

[root@localhost ~]# vim /etc/hadoop/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost/</value>
</property>
</configuration>

[root@localhost ~]# vim /etc/hadoop/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

[root@localhost ~]# vim /etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
</configuration>

(PS: alternatively, change them as follows (untested):

[root@localhost ~]# vim /etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

[root@localhost ~]# vim /etc/hadoop/hdfs-site.xml

Change to:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>


[root@localhost ~]# vim /etc/hadoop/mapred-site.xml

Change to:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
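The three edits above can also be scripted rather than done in vim. A minimal sketch, writing the same property values with heredocs; CONF_DIR is a scratch directory here for safety, but on the real host it would be /etc/hadoop:

```shell
# Write the three site files non-interactively with the same values
# used above. CONF_DIR is a throwaway directory for demonstration;
# point it at /etc/hadoop on the actual machine.
CONF_DIR=$(mktemp -d)

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>
EOF

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
EOF

# Sanity check: which file carries fs.default.name
grep -l 'fs.default.name' "$CONF_DIR"/*.xml
```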

Set up passwordless ssh login

[root@localhost ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
ca:04:30:8d:be:bd:91:a2:c3:c4:94:cf:18:c3:43:cb root@localhost.localdomain
The key's randomart image is:
+--[ DSA 1024]----+
|  oo             |
| ..o.            |
|+.o .            |
| E.  .           |
|o Bo .. S        |
| +oo+o .         |
|o. . oo          |
|o.  .            |
| .               |
+-----------------+
[root@localhost ~]#


[root@localhost ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[root@localhost ~]# chmod 600 ~/.ssh/authorized_keys
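sshd is strict about key-file permissions: if ~/.ssh or authorized_keys is group- or world-writable, the key is silently ignored and you still get a password prompt. A small sketch of the required modes, demonstrated on a throwaway directory rather than the real ~/.ssh:

```shell
# Demonstrate the permission modes sshd expects, using a scratch
# directory instead of the real ~/.ssh.
demo=$(mktemp -d)
mkdir -p "$demo/.ssh"
chmod 700 "$demo/.ssh"                   # directory: rwx for owner only
touch "$demo/.ssh/authorized_keys"
chmod 600 "$demo/.ssh/authorized_keys"   # file: rw for owner only
stat -c '%a' "$demo/.ssh" "$demo/.ssh/authorized_keys"
```

Once the real ~/.ssh is set up this way, `ssh localhost` should log in without asking for a password.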

Format the distributed filesystem

[root@localhost ~]# hadoop namenode -format
13/10/19 19:57:08 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:27:42 PDT 2013
STARTUP_MSG:   java = 1.7.0_40
************************************************************/
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
13/10/19 19:57:13 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost ~]#
Note that the confirmation prompt is case-sensitive: answering a lowercase y aborts the re-format, which is exactly what the "Format aborted" line in the transcript above shows. Answer an uppercase Y (or remove /tmp/hadoop-root/dfs/name first, in which case no prompt appears) so the filesystem is actually formatted; an unformatted NameNode will refuse to start.

Start the daemons

[root@localhost ~]# start-all.sh
starting namenode, logging to /var/log/hadoop/root/hadoop-root-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /var/log/hadoop/root/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /var/log/hadoop/root/hadoop-root-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost ~]#
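After start-all.sh you can confirm the five daemons are up with `jps`. (`jps` ships with the JDK, not the JRE rpm installed above, so this assumes a JDK is also present.) The sketch below filters a sample jps listing; the PIDs are made up for illustration, and on a live node you would pipe the real `jps` output instead:

```shell
# Filter a (hard-coded, illustrative) jps listing down to the Hadoop
# daemons started by start-all.sh. SecondaryNameNode also matches the
# NameNode pattern, so five lines are counted.
sample_jps='12001 NameNode
12150 DataNode
12301 SecondaryNameNode
12455 JobTracker
12600 TaskTracker
12777 Jps'
echo "$sample_jps" | grep -c -E 'NameNode|DataNode|JobTracker|TaskTracker'
# prints 5
```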



Browse the NameNode and JobTracker web interfaces; by default they are at:

NameNode
http://localhost:50070/

JobTracker
http://localhost:50030/

Stop the daemons

[root@localhost ~]# stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
[root@localhost ~]#


(Someone asked about the rpm installation method, so this post was written in a hurry; many details remain to be worked out. To be continued...)


Related

[Study] Compiling Hadoop 2.2.0 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-centos-64-x64.html

[Study] Installing Hadoop 2.2.0 Single Cluster, Part 2 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64_7.html

[Study] Installing Hadoop 2.2.0 Single Cluster, Part 1 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64.html

[Study] Installing Hadoop 1.2.1 (rpm) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/10/hadoop-121-rpm-centos-64-x64.html

[Study] Installing Hadoop 1.2.1 (bin) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/07/hadoop-112-centos-64-x64.html

[Study] Installing Hadoop 1.2.1 (CentOS 6.4 x64)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=80035

[Study] Installing the cloud software Hadoop 1.0.0 (CentOS 6.2 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=21166

[Study] Installing the cloud software Hadoop 0.20.2 (CentOS 5.5 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=18513

[Study] Installing the cloud software Hadoop 0.20.2 (CentOS 5.4 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=17974
