Hadoop 2.3 Installation Process and Troubleshooting

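The cluster has three servers: yiprod01, yiprod02, and yiprod03. yiprod01 is the NameNode, yiprod02 is the SecondaryNameNode, and all three are DataNodes.

The configuration described here must be identical on all three servers.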

0. Prerequisites

0.1 Make sure Java is installed

After installing Java, JAVA_HOME must be set in .bash_profile:

export JAVA_HOME=/home/yimr/local/jdk
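
A quick sanity check for this setting can be sketched as follows. The function name check_java_home is hypothetical, and the check only assumes that a JDK home contains an executable bin/java:

```shell
# Return 0 if the given directory looks like a usable JDK home,
# i.e. it contains an executable bin/java. Errors go to stderr.
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    return 0
}

# Usage on each server:
#   check_java_home "$JAVA_HOME" && echo "JAVA_HOME looks fine"
```

Running the check on all three machines before starting Hadoop catches a missing or mistyped path early.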


0.2 Make sure the three machines trust each other (passwordless SSH); see the separate article for details


1. core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/sdc/tmp/hadoop-${user.name}</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://yiprod01:9000</value>
    </property>
</configuration>


2. hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
         <name>dfs.namenode.secondary.http-address</name>
         <value>yiprod02:9001</value>
    </property>
    <property>
         <name>dfs.namenode.name.dir</name>
         <value>file:/home/yimr/dfs/name</value>
    </property>
    <property>
         <name>dfs.datanode.data.dir</name>
         <value>file:/home/yimr/dfs/data</value>
    </property>
    <property>
         <name>dfs.webhdfs.enabled</name>
         <value>true</value>
    </property>
</configuration>
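
A property name that appears twice in one file is easy to miss in a long config. The following is a minimal sketch of a check for that; the function name find_duplicate_properties is hypothetical, and it assumes each <name>...</name> element sits on a single line, as in the files shown here:

```shell
# Print every property name that occurs more than once in a Hadoop XML config file.
# Assumption: each <name>...</name> element is on its own line.
find_duplicate_properties() {
    sed -n 's/.*<name>\(.*\)<\/name>.*/\1/p' "$1" | sort | uniq -d
}

# Usage:
#   find_duplicate_properties etc/hadoop/hdfs-site.xml
```

An empty result means every property name in the file is unique.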

3. hadoop-env.sh

export JAVA_HOME=/usr/local/jdk1.6.0_27

4. mapred-site.xml

<configuration>
    <property>
        <!-- Use YARN as the resource allocation and task management framework -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <!-- JobHistory Server address -->
        <name>mapreduce.jobhistory.address</name>
        <value>yiprod01:10020</value>
    </property>
    <property>
        <!-- JobHistory web UI address -->
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>yiprod01:19888</value>
    </property>
    <property>
        <!-- Maximum number of streams merged at once when sorting files -->
        <name>mapreduce.task.io.sort.factor</name>
        <value>100</value>
    </property>
    <property>
        <name>mapreduce.reduce.shuffle.parallelcopies</name>
        <value>50</value>
    </property>
    <property>
        <name>mapred.system.dir</name>
        <value>file:/home/yimr/dfs/mr/system</value>
    </property>
    <property>
        <name>mapred.local.dir</name>
        <value>file:/home/sdc/dfs/mr/local</value>
    </property>
    <property>
        <!-- Memory (MB) each map task requests from the ResourceManager -->
        <name>mapreduce.map.memory.mb</name>
        <value>1536</value>
    </property>
    <property>
        <!-- JVM options for the containers requested in the map phase -->
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx1024M</value>
    </property>
    <property>
        <!-- Memory (MB) each reduce task requests from the ResourceManager -->
        <name>mapreduce.reduce.memory.mb</name>
        <value>2048</value>
    </property>
    <property>
        <!-- JVM options for the containers requested in the reduce phase -->
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx1536M</value>
    </property>
    <property>
        <!-- Memory limit (MB) for the sort buffer -->
        <name>mapreduce.task.io.sort.mb</name>
        <value>512</value>
    </property>
</configuration>

5. yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>yiprod01:8080</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>yiprod01:8081</value>
    </property>
    <property>        
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>yiprod01:8082</value>
    </property>
    <property>
        <!-- Total memory (MB) each NodeManager can allocate to containers -->
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>${hadoop.tmp.dir}/nodemanager/remote</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>${hadoop.tmp.dir}/nodemanager/logs</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>yiprod01:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>yiprod01:8088</value>
    </property>
</configuration>
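
The memory settings in mapred-site.xml and yarn-site.xml interact: a container can only be placed on a node whose NodeManager still has that much memory free. As a rough sketch (the function name containers_per_node is hypothetical, and the calculation ignores the scheduler's rounding of requests and the memory the ApplicationMaster itself needs), the values used in this article work out as follows:

```shell
# Upper bound on how many containers of a given size fit into one NodeManager.
# Args: NodeManager memory (MB), container request (MB).
containers_per_node() {
    echo $(( $1 / $2 ))
}

nm_mb=2048       # yarn.nodemanager.resource.memory-mb
map_mb=1536      # mapreduce.map.memory.mb
reduce_mb=2048   # mapreduce.reduce.memory.mb

echo "map containers per node:    $(containers_per_node "$nm_mb" "$map_mb")"
echo "reduce containers per node: $(containers_per_node "$nm_mb" "$reduce_mb")"
```

With these numbers each node can host at most one map or one reduce container at a time, so raising yarn.nodemanager.resource.memory-mb is worth considering if the machines have more memory available.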

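6. Format the NameNode

If the NameNode fails to start with the following error, it has not been formatted yet:

java.io.IOException: NameNode is not formatted.

Format it once on yiprod01 before the first start (in Hadoop 2.x this form prints a deprecation notice; hdfs namenode -format is the preferred equivalent):

hadoop namenode -format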
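
Formatting wipes existing HDFS metadata, so it is worth checking first whether the directory has already been formatted. The sketch below assumes that a formatted NameNode directory contains a current/VERSION file; the function name namenode_is_formatted is hypothetical, and the script only prints a recommendation rather than formatting anything:

```shell
# Return 0 if the NameNode directory already contains metadata (current/VERSION).
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# dfs.namenode.name.dir from hdfs-site.xml, without the file: prefix
name_dir=/home/yimr/dfs/name

if namenode_is_formatted "$name_dir"; then
    echo "NameNode directory $name_dir is already formatted; do not format again"
else
    echo "NameNode directory $name_dir is not formatted; run: hadoop namenode -format"
fi
```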


7. Troubleshooting

7.1 32-bit native library problem

Symptoms:

This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
14/08/01 11:59:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/yimr/local/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
yiprod01]
sed: -e expression #1, char 6: unknown option to `s'
-c: Unknown cipher type 'cd'
The authenticity of host 'yiprod01 (192.168.1.131)' can't be established.
RSA key fingerprint is ac:9e:e0:db:d8:7a:29:5c:a1:d4:7f:4c:38:c0:72:30.
Are you sure you want to continue connecting (yes/no)? 64-Bit: ssh: Could not resolve hostname 64-Bit: Name or service not known
You: ssh: Could not resolve hostname You: Name or service not known
VM: ssh: Could not resolve hostname VM: Name or service not known
loaded: ssh: Could not resolve hostname loaded: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
HotSpot(TM): ssh: Could not resolve hostname HotSpot(TM): Name or service not known
Server: ssh: Could not resolve hostname Server: Name or service not known
guard.: ssh: Could not resolve hostname guard.: Name or service not known
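The cause is that the downloaded Hadoop release ships native libraries that are compiled for 32-bit by default, while the JVM here is 64-bit: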

file libhadoop.so.1.0.0 

libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
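
The same check can be scripted so it is easy to repeat on every node. This is a minimal sketch that only parses the text printed by the file command; the function name elf_bits is hypothetical:

```shell
# Print 32 or 64 depending on the ELF class named in a line of `file` output,
# or "unknown" if neither is mentioned.
elf_bits() {
    case "$1" in
        *"ELF 32-bit"*) echo 32 ;;
        *"ELF 64-bit"*) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Example with the output shown above.
# On a real host use: elf_bits "$(file libhadoop.so.1.0.0)"
elf_bits "libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped"
```

A result of 32 on a machine running a 64-bit JVM indicates the mismatch described here.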

Temporary workaround:

Edit hadoop-env.sh under the etc directory (etc/hadoop/hadoop-env.sh).

Append the following two lines at the end:

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.library.path=$HADOOP_PREFIX/lib"

The following warning still appears, however:

14/08/01 11:46:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


At this point Hadoop starts normally. A separate article describes how to fix this problem completely.