Saturday, October 19, 2013

[Study] Installing Hadoop 1.2.1 (rpm) (CentOS 6.4 x64)

2013-10-19

Hadoop is a framework for building cloud computing systems. Modeled on the Google File System, it is written in Java and provides HDFS and the MapReduce API.

The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.

Official site
http://hadoop.apache.org/

Installation references
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleNodeSetup.html

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html

Download
http://apache.cdpa.nsysu.edu.tw/hadoop/common/hadoop-1.2.1/

File: hadoop-1.2.1-1.x86_64.rpm
http://apache.cdpa.nsysu.edu.tw/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-1.x86_64.rpm

Oracle Java Runtime : jre-7u40-linux-x64.rpm
http://www.oracle.com/technetwork/java/javase/downloads/index.html

Please download hadoop-1.2.1-1.x86_64.rpm and jre-7u40-linux-x64.rpm in advance.

I. Preparation

1. Install the basic packages

[root@localhost ~]# yum -y install openssh rsync

Install the Java runtime

[root@localhost ~]# rpm -ivh  jre-7u40-linux-x64.rpm

Check the installation state of the Java-related packages

[root@localhost ~]# find / -name java
/usr/java
/usr/java/jre1.7.0_40/bin/java
/usr/lib/java
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9.x86_64/jre/bin/java
/usr/share/java
/usr/lib64/libreoffice/ure/share/java
/usr/lib64/libreoffice/basis3.4/share/Scripts/java
/usr/bin/java
/etc/java
/etc/alternatives/java
/etc/pki/java
/var/lib/alternatives/java
[root@localhost ~]#

Install Hadoop

[root@localhost ~]# rpm -ivh  hadoop-1.2.1-1.x86_64.rpm

Check which directories Hadoop was installed into

[root@localhost ~]# find / -name hadoop
/usr/include/hadoop
/usr/etc/hadoop
/usr/share/hadoop
/usr/share/doc/hadoop
/usr/bin/hadoop
/etc/hadoop
/var/log/hadoop
/var/run/hadoop
[root@localhost ~]#

Test that the hadoop command runs

[root@localhost ~]# hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  oiv                  apply the offline fsimage viewer to an fsimage
  fetchdt              fetch a delegation token from the NameNode
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  historyserver        run job history servers as a standalone daemon
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  distcp2 <srcurl> <desturl> DistCp version 2
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
[root@localhost ~]#

Show the Hadoop version

[root@localhost ~]# hadoop version
Hadoop 1.2.1
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152
Compiled by mattf on Mon Jul 22 15:27:42 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /usr/share/hadoop/hadoop-core-1.2.1.jar
[root@localhost ~]#

Check the accounts the Hadoop rpm created

[root@localhost ~]# cat /etc/passwd | grep Hadoop
mapred:x:202:123:Hadoop MapReduce:/tmp:/bin/bash
hdfs:x:201:123:Hadoop HDFS:/tmp:/bin/bash
[root@localhost ~]#
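As the transcript shows, the rpm creates two service accounts, mapred and hdfs, both with /tmp as their home directory. As a quick illustration of the /etc/passwd field layout, the two entries can be parsed with awk (the input below is hard-coded from the transcript rather than read from a live /etc/passwd):

```shell
# Parse the two passwd entries shown above: field 1 is the user name,
# field 3 the UID, and field 6 the home directory.
passwd_lines='mapred:x:202:123:Hadoop MapReduce:/tmp:/bin/bash
hdfs:x:201:123:Hadoop HDFS:/tmp:/bin/bash'
echo "$passwd_lines" | awk -F: '{print $1, $3, $6}'
# mapred 202 /tmp
# hdfs 201 /tmp
```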

Find where the Hadoop examples jar is

[root@localhost ~]# find / -name hadoop-examples-1.2.1.jar
/usr/share/hadoop/hadoop-examples-1.2.1.jar

The various Hadoop configuration files

[root@localhost ~]# ls /etc/hadoop
capacity-scheduler.xml      hadoop-policy.xml      slaves
configuration.xsl           hdfs-site.xml          ssl-client.xml.example
core-site.xml               log4j.properties       ssl-server.xml.example
fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg
hadoop-env.sh               mapred-site.xml        task-log4j.properties
hadoop-metrics2.properties  masters
[root@localhost ~]#

Modify three configuration files: core-site.xml, hdfs-site.xml, and mapred-site.xml

[root@localhost ~]# cat /etc/hadoop/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>

[root@localhost ~]# cat /etc/hadoop/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>
[root@localhost ~]# cat /etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

</configuration>
[root@localhost ~]#


Change them as follows:

[root@localhost ~]# vim /etc/hadoop/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost/</value>
</property>
</configuration>

[root@localhost ~]# vim /etc/hadoop/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

[root@localhost ~]# vim /etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
</configuration>

(PS: alternatively, change them as follows (untested):

[root@localhost ~]# vim /etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

[root@localhost ~]# vim /etc/hadoop/hdfs-site.xml

Change to:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>


[root@localhost ~]# vim /etc/hadoop/mapred-site.xml

Change to:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
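The three edits above can also be scripted rather than done in vim. A minimal sketch, writing the same property values with heredocs; CONF_DIR is a scratch directory here for safety, but on the real host it would be /etc/hadoop:

```shell
# Write the three site files non-interactively with the same values
# used above. CONF_DIR is a throwaway directory for demonstration;
# point it at /etc/hadoop on the actual machine.
CONF_DIR=$(mktemp -d)

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>
EOF

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
EOF

# Sanity check: which file carries fs.default.name
grep -l 'fs.default.name' "$CONF_DIR"/*.xml
```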

Set up passwordless ssh login

[root@localhost ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
ca:04:30:8d:be:bd:91:a2:c3:c4:94:cf:18:c3:43:cb root@localhost.localdomain
The key's randomart image is:
+--[ DSA 1024]----+
|  oo             |
| ..o.            |
|+.o .            |
| E.  .           |
|o Bo .. S        |
| +oo+o .         |
|o. . oo          |
|o.  .            |
| .               |
+-----------------+
[root@localhost ~]#


[root@localhost ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[root@localhost ~]# chmod 600 ~/.ssh/authorized_keys
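sshd is strict about key-file permissions: if ~/.ssh or authorized_keys is group- or world-writable, the key is silently ignored and you still get a password prompt. A small sketch of the required modes, demonstrated on a throwaway directory rather than the real ~/.ssh:

```shell
# Demonstrate the permission modes sshd expects, using a scratch
# directory instead of the real ~/.ssh.
demo=$(mktemp -d)
mkdir -p "$demo/.ssh"
chmod 700 "$demo/.ssh"                   # directory: rwx for owner only
touch "$demo/.ssh/authorized_keys"
chmod 600 "$demo/.ssh/authorized_keys"   # file: rw for owner only
stat -c '%a' "$demo/.ssh" "$demo/.ssh/authorized_keys"
```

Once the real ~/.ssh is set up this way, `ssh localhost` should log in without asking for a password.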

Format the distributed filesystem

[root@localhost ~]# hadoop namenode -format
13/10/19 19:57:08 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:27:42 PDT 2013
STARTUP_MSG:   java = 1.7.0_40
************************************************************/
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) y
Format aborted in /tmp/hadoop-root/dfs/name
13/10/19 19:57:13 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost ~]#
Note that the confirmation prompt is case-sensitive: answering a lowercase y aborts the re-format, which is exactly what the "Format aborted" line in the transcript above shows. Answer an uppercase Y (or remove /tmp/hadoop-root/dfs/name first, in which case no prompt appears) so the filesystem is actually formatted; an unformatted NameNode will refuse to start.

Start the daemons

[root@localhost ~]# start-all.sh
starting namenode, logging to /var/log/hadoop/root/hadoop-root-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /var/log/hadoop/root/hadoop-root-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /var/log/hadoop/root/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /var/log/hadoop/root/hadoop-root-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /var/log/hadoop/root/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost ~]#
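After start-all.sh you can confirm the five daemons are up with `jps`. (`jps` ships with the JDK, not the JRE rpm installed above, so this assumes a JDK is also present.) The sketch below filters a sample jps listing; the PIDs are made up for illustration, and on a live node you would pipe the real `jps` output instead:

```shell
# Filter a (hard-coded, illustrative) jps listing down to the Hadoop
# daemons started by start-all.sh. SecondaryNameNode also matches the
# NameNode pattern, so five lines are counted.
sample_jps='12001 NameNode
12150 DataNode
12301 SecondaryNameNode
12455 JobTracker
12600 TaskTracker
12777 Jps'
echo "$sample_jps" | grep -c -E 'NameNode|DataNode|JobTracker|TaskTracker'
# prints 5
```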



Browse the NameNode and JobTracker web interfaces; by default they are at:

NameNode
http://localhost:50070/

JobTracker
http://localhost:50030/

Stop the daemons

[root@localhost ~]# stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
[root@localhost ~]#


(Someone asked about the rpm installation method, so this post was written in a hurry; many details remain to be worked out. To be continued...)


Related

[Study] Compiling Hadoop 2.2.0 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-centos-64-x64.html

[Study] Installing Hadoop 2.2.0 Single Cluster, Part 2 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64_7.html

[Study] Installing Hadoop 2.2.0 Single Cluster, Part 1 (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/11/hadoop-220-single-cluster-centos-64-x64.html

[Study] Installing Hadoop 1.2.1 (rpm) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/10/hadoop-121-rpm-centos-64-x64.html

[Study] Installing Hadoop 1.2.1 (bin) (CentOS 6.4 x64)
http://shaurong.blogspot.tw/2013/07/hadoop-112-centos-64-x64.html

[Study] Installing Hadoop 1.2.1 (CentOS 6.4 x64)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=80035

[Study] Installing the cloud software Hadoop 1.0.0 (CentOS 6.2 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=21166

[Study] Installing the cloud software Hadoop 0.20.2 (CentOS 5.5 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=18513

[Study] Installing the cloud software Hadoop 0.20.2 (CentOS 5.4 x86)
http://forum.icst.org.tw/phpbb/viewtopic.php?t=17974
