Hadoop 1.2.1 and Hadoop 2.4 Cluster Installation Guide


Contents

1 Environment preparation
1.1 Hardware
1.2 Software
1.3 Network topology
1.4 Disk partitions
1.5 Virtual machine configuration
1.6 Passwordless SSH login
1.7 JDK installation
2 Hadoop 1.2.1 installation and configuration
2.1 Download the Hadoop distribution
2.2 Unpack the Hadoop distribution
2.3 Hadoop configuration files
2.4 Copy the Hadoop distribution to the other nodes
2.5 Configure HADOOP_HOME
2.6 Format the NameNode
2.7 Start Hadoop
3 Hadoop 1.2.1 verification
3.1 Hadoop web consoles
3.2 Running the Hadoop wordcount example
4 Hadoop 2.4 environment preparation
4.1 Virtual machine preparation
4.2 Network topology
4.3 Disk partitions
4.4 Virtual machine configuration
4.5 Passwordless SSH login
4.6 JDK installation
5 Hadoop 2.4 installation and configuration
5.1 Download the Hadoop distribution
5.2 Unpack the Hadoop distribution
5.3 Build the Hadoop native library
5.3.1 Build environment preparation
5.3.2 Build Hadoop
5.3.3 Build summary
5.4 Hadoop configuration files
5.5 Copy the Hadoop distribution to the other nodes
5.6 Format the NameNode
5.7 Start Hadoop
5.8 Check that Hadoop started
5.8.1 Check that the NameNode is running
5.8.2 Check that the DataNodes are running
6 Hadoop 2.4 verification
6.1 Hadoop web consoles
6.2 Running the Hadoop wordcount example
6.2.1 Create directories and copy local files into HDFS
6.2.2 Run the wordcount program
7 Hadoop installation errors
7.1 Hadoop 2.4 cannot connect to the ResourceManager


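1 Environment preparation

1.1 Hardware

Dell 960, CPU: Intel Core 2 Quad Q8300 @ 2.50GHz

Memory: 4GB

Disk: 320GB

The host runs Windows 7, with the virtual machines installed on top of it.

1.2 Software

- CentOS 6.5

- VMware Workstation 10.0.2

- SecureCRT 7.0

- WinSCP 5.5.3

- JDK 1.6.0_43

- Hadoop 1.2.1

- Eclipse 3.6

- Windows 7

1.3 Network topology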

192.168.1.53 namenode53

192.168.1.54 datanode54

192.168.1.55 datanode55

192.168.1.56 datanode56

1.4 Disk partitions

Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda6    4.0G  380M  3.4G   10%   /
tmpfs        495M  72K   495M   1%    /dev/shm
/dev/sda2    7.9G  419M  7.1G   6%    /app
/dev/sda3    7.9G  146M  7.4G   2%    /applog
/dev/sda1    194M  30M   155M   16%   /boot
/dev/sda5    5.8G  140M  5.3G   3%    /data
/dev/sda8    2.0G  129M  1.8G   7%    /home
/dev/sda9    2.0G  68M   1.9G   4%    /opt
/dev/sda12   2.0G  36M   1.9G   2%    /tmp
/dev/sda7    4.0G  3.3G  509M   87%   /usr
/dev/sda10   2.0G  397M  1.5G   21%   /var

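1.5 Virtual machine configuration

Four virtual machines, each with 1 CPU, 1GB of memory and a 40GB disk, running 64-bit CentOS 6.5.

Figure 1.5-1

Each machine has two users: root and hadoop.

root is the administrator account.

hadoop is the account that runs Hadoop.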

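1.6 Passwordless SSH login

Step 1: Create the .ssh directory in the user's home directory

mkdir .ssh

Names that begin with a dot are hidden; list them with:

ls -a

Step 2: Generate the key pair

Keys can be of type dsa or rsa.

ssh-keygen -t rsa -P '' -b 1024

ssh-keygen: the key generation command

-t: key type (rsa)

-P: passphrase ('' means an empty passphrase)

-b: key size in bits (1024)

The command generates a private key file and a public key file:

ll

-rwx------ 1 apch apache 883 May 20 15:13 id_rsa
-rwx------ 1 apch apache 224 May 20 15:13 id_rsa.pub

Step 3: Append the public key to authorized_keys (run in the .ssh directory)

cat  id_rsa.pub  >>  authorized_keys

Step 4: Set file and directory permissions

Set the permissions of authorized_keys:
$ chmod 600 authorized_keys
Set the permissions of the .ssh directory:
$ chmod 700 -R .ssh

Step 5: Edit /etc/ssh/sshd_config (as root)

vi  /etc/ssh/sshd_config

Protocol 2 (use SSH protocol 2 only)
PermitRootLogin yes (allow root to log in over SSH; set this according to the login account)

ServerKeyBits 1024 (set the server key strength to 1024 bits)

PasswordAuthentication no (disable password login)

PermitEmptyPasswords no   (disallow login with an empty password)

RSAAuthentication yes  (enable RSA authentication)

PubkeyAuthentication yes (enable public key authentication)

AuthorizedKeysFile   .ssh/authorized_keys

Step 6: Restart the sshd service (as root)

service sshd restart

Step 7: Test locally

ssh -v  localhost (-v enables verbose debug output)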
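The steps above set up key login on a single machine. For the NameNode to reach the DataNodes without a password, its public key also has to end up in authorized_keys on each of them. A minimal sketch, assuming the hadoop user and the DataNode addresses from section 1.3; the helper name key_copy_commands is hypothetical, and the script only prints the commands (a dry run) rather than running them. ssh-copy-id needs password login to still be enabled on the target, so it would have to run before PasswordAuthentication is set to no there.

```shell
#!/bin/sh
# Dry run: print the command that would install the local public key on each
# host. The user name and host list come from this guide; the helper name is
# illustrative only.
key_copy_commands() {
    user=$1
    shift
    for host in "$@"; do
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub ${user}@${host}"
    done
}

key_copy_commands hadoop 192.168.1.54 192.168.1.55 192.168.1.56
```

After the keys are in place, ssh hadoop@192.168.1.54 from the NameNode should log in without prompting.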

1.7 JDK installation

Download jdk-6u43-linux-x64.bin from the Oracle website.

Install it under /usr/java.

Configure JAVA_HOME in /etc/profile:

export JAVA_HOME=/usr/java/jdk1.6.0_43

export JAVA_BIN=/usr/java/jdk1.6.0_43/bin

export PATH=$PATH:$JAVA_HOME/bin

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export JAVA_HOME JAVA_BIN PATH CLASSPATH

Apply the configuration:

source /etc/profile

Verify the installation: java -version
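A quick way to confirm that JAVA_HOME points at a usable JDK is to test for an executable bin/java under it. This is only a sketch; the function name check_java_home is hypothetical and not part of any tool used in this guide.

```shell
#!/bin/sh
# Succeed only if the given directory contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks valid: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java"
fi
```

If the second message appears after source /etc/profile, the export lines above most likely contain a typo in the JDK path.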

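2 Hadoop 1.2.1 installation and configuration

2.1 Download the Hadoop distribution

Download from: http://mirrors.hust.edu.cn/apache/hadoop/common/

Upload the archive to the Linux system with rz (the lrzsz package, which provides rz and sz, must be installed first).

2.2 Unpack the Hadoop distribution

Install Hadoop under /app/hadoop/:

tar -zxvf /home/hadoop/hadoop-1.2.1.tar.gz -C /app/hadoop/

/app/hadoop/ must be owned by hadoop:hadoop (user:group).

2.3 Hadoop configuration files

The following configuration files need to be edited:

- hadoop-env.sh

- core-site.xml

- hdfs-site.xml

- mapred-site.xml

- masters

- slaves

vi hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.6.0_43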

vi core-site.xml
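The contents of core-site.xml are not shown at this step. As a rough sketch of what the file might contain for this cluster: fs.default.name is the Hadoop 1.x property naming the default file system, while port 9000 and the helper name core_site_xml are assumptions made here for illustration, not values taken from this guide.

```shell
#!/bin/sh
# Print a minimal core-site.xml for Hadoop 1.x. The NameNode URI is passed in;
# the port used in the call below (9000) is an assumed value.
core_site_xml() {
    cat <<EOF
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>$1</value>
  </property>
</configuration>
EOF
}

core_site_xml "hdfs://192.168.1.53:9000"
```

The output could be redirected into conf/core-site.xml once the URI has been checked against the actual cluster.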

vi hdfs-site.xml

Add the following:

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<property>

<name>dfs.permissions</name>

<value>false</value>

</property>

vi mapred-site.xml

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>192.168.1.53:9001</value>

</property>

</configuration>

vi masters

192.168.1.53

vi slaves

192.168.1.54

192.168.1.55

192.168.1.56

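2.4 Copy the Hadoop distribution to the other nodes

After finishing the configuration in section 2.3, package the configured hadoop directory on 192.168.1.53:

tar -zcvf hadoop.tar.gz ./hadoop/

Copy hadoop.tar.gz to each of 192.168.1.54-56:

scp hadoop.tar.gz hadoop@192.168.1.54:/app/hadoop/

Then log in to each of 192.168.1.54-56 and unpack it:

cd /app/hadoop/

tar -zxvf hadoop.tar.gz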
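Copying and unpacking by hand on three machines is repetitive, so the same steps can be generated in a loop. A sketch under the assumptions of this section (hadoop user, /app/hadoop/ as the target, passwordless SSH already working); the function name deploy_commands is hypothetical, and the script only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Dry run: print the copy and unpack commands for each DataNode.
deploy_commands() {
    archive=$1
    dest=$2
    shift 2
    for host in "$@"; do
        echo "scp $archive hadoop@$host:$dest"
        echo "ssh hadoop@$host 'cd $dest && tar -zxvf $archive'"
    done
}

deploy_commands hadoop.tar.gz /app/hadoop/ 192.168.1.54 192.168.1.55 192.168.1.56
```

Piping the output to sh would execute the commands one after another.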

2.5 Configure HADOOP_HOME

Configure HADOOP_HOME on 192.168.1.53-56:

vi /etc/profile

# set hadoop path

export HADOOP_HOME=/app/hadoop/hadoop

export HADOOP_HOME_WARN_SUPPRESS=1

export PATH=$PATH:$HADOOP_HOME/bin

Apply the configuration:

source /etc/profile

Setting HADOOP_HOME_WARN_SUPPRESS=1 suppresses a warning that Hadoop otherwise prints at startup.

Verify the configuration:

hadoop version

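2.6 Format the NameNode

Format the HDFS file system:

hadoop namenode -format

2.7 Start Hadoop

Log in to 192.168.1.53 with SecureCRT 7.0 and run:

cd /app/hadoop/hadoop/bin/

./start-all.sh

Check the Hadoop processes on the NameNode (192.168.1.53), for example with jps.

The output should show that NameNode, JobTracker and SecondaryNameNode are running.

Check the Hadoop processes on a DataNode (192.168.1.54).

The output should show that DataNode and TaskTracker are running.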
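The process check can be scripted by comparing jps output against the daemons expected on each node. A minimal sketch; the function name missing_daemons is hypothetical, and it assumes jps prints one "pid name" pair per line.

```shell
#!/bin/sh
# Read `jps` output on stdin and print every expected daemon that is absent.
# grep -w matches whole words, so "NameNode" does not match "SecondaryNameNode".
missing_daemons() {
    running=$(cat)
    for daemon in "$@"; do
        echo "$running" | grep -qw "$daemon" || echo "$daemon"
    done
}

# On the NameNode (192.168.1.53):
#   jps | missing_daemons NameNode SecondaryNameNode JobTracker
# On a DataNode (for example 192.168.1.54):
#   jps | missing_daemons DataNode TaskTracker
```

Empty output means every expected daemon was found; any name printed points to a daemon whose log is worth checking.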

3 Hadoop 1.2.1 verification

3.1 Hadoop web consoles

http://192.168.1.53:50070/dfshealth.jsp

http://192.168.1.53:50030/jobtracker.jsp

3.2 Running the Hadoop wordcount example

hadoop jar hadoop-examples-1.2.1.jar wordcount /tmp/input /tmp/output

The job's progress and result can be monitored in the JobTracker console.

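4 Hadoop 2.4 environment preparation

4.1 Virtual machine preparation

Clone CentOS-57 through CentOS-60 from CentOS-53.

The VM names, IP addresses and host names are:

CentOS-57 192.168.1.57 namenode57

CentOS-58 192.168.1.58 datanode58

CentOS-59 192.168.1.59 datanode59

CentOS-60 192.168.1.60 datanode60

(VMware's clone feature quickly copies an already installed system. After cloning, however, the eth0 interface is missing. Switch to the root user to fix this; otherwise you will not have sufficient permissions.)

Configure the network interface:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

Update the MAC address in the interface configuration:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

Change the host name:

vi /etc/sysconfig/network

Reboot the system:

reboot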

4.2 Network topology

The layout follows section 1.3. On each of 192.168.1.57-60, switch to root and add the hosts to /etc/hosts:

su -

vi /etc/hosts

192.168.1.57 namenode57

192.168.1.58 datanode58

192.168.1.59 datanode59

192.168.1.60 datanode60

4.3 Disk partitions

See section 1.4.

4.4 Virtual machine configuration

See section 1.5.

4.5 Passwordless SSH login

See section 1.6.

4.6 JDK installation

Download page: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Package: jdk-7u55-linux-x64.rpm

Install the JDK on 192.168.1.57-60.

Upload the JDK 7 package jdk-7u55-linux-x64.rpm to 192.168.1.57 with rz, then install it:

rz

rpm -i jdk-7u55-linux-x64.rpm

Check that the installation succeeded:

cd /usr/java

Installation directory: /usr/java/jdk1.7.0_55

Configure JAVA_HOME:

vi /etc/profile

Add the following:

export JAVA_HOME=/usr/java/jdk1.7.0_55

export JAVA_BIN=/usr/java/jdk1.7.0_55/bin

export PATH=$PATH:$JAVA_HOME/bin

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export JAVA_HOME JAVA_BIN PATH CLASSPATH

Apply the configuration:

source /etc/profile

Check the installed version:

java -version

If the JDK version is printed, the installation succeeded.

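5 Hadoop 2.4 installation and configuration

5.1 Download the Hadoop distribution

http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.4.0/

Upload the archive to the Linux system with rz (the lrzsz package, which provides rz and sz, must be installed first).

Log in to 192.168.1.57.

5.2 Unpack the Hadoop distribution

Installation directory: /app/hadoop/hadoop2.4 (the archive unpacks to /app/hadoop/hadoop-2.4.0, which is the path used in the rest of this guide)

On each of 192.168.1.57-60, log in as the hadoop user and create the installation directory:

cd /app/hadoop/

tar -zxvf /home/hadoop/hadoop-2.4.0.tar.gz

Note: Hadoop 2.4.0 differs substantially from 1.x; even the configuration directory has moved. The bundled native library can be inspected as follows: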

cd /app/hadoop/hadoop-2.4.0/lib/native

file libhadoop.so.1.0.0

libhadoop.so.1.0.0: ELF 32-bit LSB shared object,
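The file output above shows that the bundled native library is 32-bit, which matters because the cluster runs 64-bit CentOS. A small sketch for pulling the bit width out of file output so it can be compared with the OS architecture; the function name elf_bits is hypothetical.

```shell
#!/bin/sh
# Extract "32" or "64" from a line of `file` output such as
# "libhadoop.so.1.0.0: ELF 32-bit LSB shared object, ...".
elf_bits() {
    sed -n 's/.*ELF \([0-9][0-9]*\)-bit.*/\1/p'
}

# Usage on the cluster (path from this guide):
#   file /app/hadoop/hadoop-2.4.0/lib/native/libhadoop.so.1.0.0 | elf_bits
#   uname -m    # x86_64 means the OS is 64-bit
```

If elf_bits prints 32 while uname -m reports x86_64, the JVM cannot load the library and Hadoop falls back to its built-in Java classes.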

Configure the environment variables:

su - (switch to root)

vi /etc/profile

Add the following:

export HADOOP_HOME=/app/hadoop/hadoop-2.4.0

export HADOOP_HOME_WARN_SUPPRESS=1

export PATH=$PATH:$HADOOP_HOME/bin

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

Apply the configuration:

source /etc/profile

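5.3 Build the Hadoop native library

5.3.1 Build environment preparation

The build is done on 64-bit CentOS 6.5.

The main tools involved are: hadoop-2.4.0-src.tar.gz, Ant, Maven, JDK, GCC, CMake and openssl.

First, install or upgrade the system packages required for the build (to their latest versions):

yum install lzo-devel zlib-devel gcc autoconf automake libtool ncurses-devel openssl-devel

Download the Hadoop source release:

wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.4.0/hadoop-2.4.0-src.tar.gz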

tar -zxvf hadoop-2.4.0-src.tar.gz

wget http://apache.fayea.com/apache-mirror//ant/binaries/apache-ant-1.9.4-bin.tar.gz

tar -xvf apache-ant-1.9.4-bin.tar.gz

wget http://apache.fayea.com/apache-mirror/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz

tar -xvf apache-maven-3.0.5-bin.tar.gz

vi /etc/profile

export JAVA_HOME=/usr/java/jdk1.7.0_55

export JAVA_BIN=/usr/java/jdk1.7.0_55/bin

export ANT_HOME=/home/hadoop/ant

export MVN_HOME=/home/hadoop/maven

export FINDBUGS_HOME=/home/hadoop/findbugs-2.0.3

export PATH=$PATH:$JAVA_HOME/bin:$ANT_HOME/bin:$MVN_HOME/bin:$FINDBUGS_HOME/bin

Apply the configuration:

source /etc/profile

Verify the configuration:

ant -version

mvn -version

findbugs -version

Each command should print its version.

Install protobuf (as root):

wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz

tar zxf protobuf-2.5.0.tar.gz

cd protobuf-2.5.0

./configure

make

make install

protoc --version

Install cmake (as root), building it from the source archive:

wget http://www.cmake.org/files/v2.8/cmake-2.8.12.2.tar.gz

tar zxf cmake-2.8.12.2.tar.gz

cd cmake-2.8.12.2

./bootstrap

make

make install

cmake --version
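Several of the build errors described later come down to a tool that is missing from PATH, so it can save time to check for all of them before starting the long Maven build. A sketch; the function name missing_tools is hypothetical, and the tool list is simply the set of commands used in this section.

```shell
#!/bin/sh
# Print each named tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing_tools gcc make cmake protoc ant mvn findbugs java
```

No output means every tool was found. This does not check versions or development headers such as openssl-devel.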

To speed up the build, point the Maven mirror at the OSChina repository:

cd maven/conf

vi settings.xml

Add the following:

<mirror>

<id>nexus-osc</id>

<mirrorOf>*</mirrorOf>

<name>Nexus osc</name>

<url>http://maven.oschina.net/content/groups/public/</url>

</mirror>

<mirror>

<id>nexus-osc-thirdparty</id>

<mirrorOf>thirdparty</mirrorOf>

<name>Nexus osc thirdparty</name>

<url>http://maven.oschina.net/content/repositories/thirdparty/</url>

</mirror>

<profile>

<id>jdk-1.4</id>

<activation>

<jdk>1.4</jdk>

</activation>

<repositories>

<repository>

<id>nexus</id>

<name>local private nexus</name>

<url>http://maven.oschina.net/content/groups/public/</url>

<releases>

<enabled>true</enabled>

</releases>

<snapshots>

<enabled>false</enabled>

</snapshots>

</repository>

</repositories>

<pluginRepositories>

<pluginRepository>

<id>nexus</id>

<name>local private nexus</name>

<url>http://maven.oschina.net/content/groups/public/</url>

<releases>

<enabled>true</enabled>

</releases>

<snapshots>

<enabled>false</enabled>

</snapshots>

</pluginRepository>

</pluginRepositories>

</profile>

For details, see:

http://maven.oschina.net/help.html

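5.3.2 Build Hadoop

Run the build from the hadoop-2.4.0-src directory:

mvn package -DskipTests -Pdist,native -Dtar

Maven now downloads all required dependencies and plugins.

This takes a long time. After about six hours of waiting, the first build error finally appeared; the errors encountered are listed below.

Once the build succeeds, check that the native library was built:

cd hadoop-dist/target/hadoop-2.4.0/lib/native

file libhadoop.so.1.0.0

libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped

A 64-bit ELF object means the native build succeeded.

Error 1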

[INFO] ------------------------------------------------------------------------

[INFO] BUILD FAILURE

[INFO] ------------------------------------------------------------------------

[INFO] Total time: 46.796s

[INFO] Finished at: Wed Jun 04 13:28:37 CST 2014

[INFO] Final Memory: 36M/88M

[INFO] ------------------------------------------------------------------------

[ERROR] Failed to execute goal on project hadoop-common: Could not resolve dependencies for project org.apache.hadoop:hadoop-common:jar:2.4.0: Failure to find org.apache.commons:commons-compress:jar:1.4.1 in https://repository.apache.org/content/repositories/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots.https has elapsed or updates are forced -> [Help 1]

[ERROR]

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

[ERROR]

[ERROR] For more information about the errors and possible solutions, please read the following articles:

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException

[ERROR]

[ERROR] After correcting the problems, you can resume the build with the command

[ERROR] mvn <goals> -rf :hadoop-common

Solution:

The log above says that org.apache.commons:commons-compress:jar:1.4.1 cannot be found.

Copying the jar from a local (Windows) Maven repository to the Linux system solved the problem.

Error 2

[INFO] ------------------------------------------------------------------------

[INFO] BUILD FAILURE

[INFO] ------------------------------------------------------------------------

[INFO] Total time: 2:16.693s

[INFO] Finished at: Wed Jun 04 13:56:31 CST 2014

[INFO] Final Memory: 48M/239M

[INFO] ------------------------------------------------------------------------

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-common: An Ant BuildException has occured: Execute failed: java.io.IOException: Cannot run program "cmake" (in directory "/home/hadoop/hadoop-2.4.0-src/hadoop-common-project/hadoop-common/target/native"): error=2, No such file or directory

[ERROR] around Ant part ...<exec dir="/home/hadoop/hadoop-2.4.0-src/hadoop-common-project/hadoop-common/target/native" executable="cmake" failonerror="true">... @ 4:133 in /home/hadoop/hadoop-2.4.0-src/hadoop-common-project/hadoop-common/target/antrun/build-main.xml

[ERROR] -> [Help 1]

[ERROR]

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

[ERROR]

[ERROR] For more information about the errors and possible solutions, please read the following articles:

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

[ERROR]

[ERROR] After correcting the problems, you can resume the build with the command

[ERROR] mvn <goals> -rf :hadoop-common

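Solution:

cmake was not installed. Install cmake as described in section 5.3.1 (Build environment preparation).

Error 3

The error said that files could not be found and directories could not be created, and no matching report turned up online. Based on experience, changing the directory permissions to 775, so that files and directories can be created, fixed it. It is also best to make sure the Hadoop build directory has 2.5GB to 4GB of free space.

chmod -Rf 775 ./hadoop-2.4.0-src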

main:

[mkdir] Created dir: /data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/test-dir

[INFO] Executed tasks

[INFO]

[INFO] --- maven-antrun-plugin:1.7:run (make) @ hadoop-pipes ---

[INFO] Executing tasks

Error 4

main:

[mkdir] Created dir: /data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/native

[exec] -- The C compiler identification is GNU 4.4.7

[exec] -- The CXX compiler identification is GNU 4.4.7

[exec] -- Check for working C compiler: /usr/bin/cc

[exec] -- Check for working C compiler: /usr/bin/cc -- works

[exec] -- Detecting C compiler ABI info

[exec] -- Detecting C compiler ABI info - done

[exec] -- Check for working CXX compiler: /usr/bin/c++

[exec] -- Check for working CXX compiler: /usr/bin/c++ -- works

[exec] -- Detecting CXX compiler ABI info

[exec] -- Detecting CXX compiler ABI info - done

[exec] CMake Error at /usr/local/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:108 (message):

[exec] Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the

[exec] system variable OPENSSL_ROOT_DIR (missing: OPENSSL_LIBRARIES

[exec] OPENSSL_INCLUDE_DIR)

[exec] Call Stack (most recent call first):

[exec] /usr/local/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:315 (_FPHSA_FAILURE_MESSAGE)

[exec] /usr/local/share/cmake-2.8/Modules/FindOpenSSL.cmake:313 (find_package_handle_standard_args)

[exec] CMakeLists.txt:20 (find_package)

[exec]

[exec]

[exec] -- Configuring incomplete, errors occurred!

[exec] See also "/data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/native/CMakeFiles/CMakeOutput.log".

[INFO] ------------------------------------------------------------------------

[INFO] Reactor Summary:

[INFO]

[INFO] Apache Hadoop Main ................................ SUCCESS [13.745s]

[INFO] Apache Hadoop Project POM ......................... SUCCESS [5.538s]

[INFO] Apache Hadoop Annotations ......................... SUCCESS [7.296s]

[INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.568s]

[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [5.858s]

[INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [8.541s]

[INFO] Apache Hadoop MiniKDC ............................. SUCCESS [8.337s]

[INFO] Apache Hadoop Auth ................................ SUCCESS [7.348s]

[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [4.926s]

[INFO] Apache Hadoop Common .............................. SUCCESS [2:35.956s]

[INFO] Apache Hadoop NFS ................................. SUCCESS [18.680s]

[INFO] Apache Hadoop Common Project ...................... SUCCESS [0.059s]

[INFO] Apache Hadoop HDFS ................................ SUCCESS [5:03.525s]

[INFO] Apache Hadoop HttpFS .............................. SUCCESS [38.335s]

[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [23.780s]

[INFO] Apache Hadoop HDFS-NFS ............................ SUCCESS [8.769s]

[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.159s]

[INFO] hadoop-yarn ....................................... SUCCESS [0.134s]

[INFO] hadoop-yarn-api ................................... SUCCESS [2:07.657s]

[INFO] hadoop-yarn-common ................................ SUCCESS [1:10.680s]

[INFO] hadoop-yarn-server ................................ SUCCESS [0.165s]

[INFO] hadoop-yarn-server-common ......................... SUCCESS [24.174s]

[INFO] hadoop-yarn-server-nodemanager .................... SUCCESS [27.293s]

[INFO] hadoop-yarn-server-web-proxy ...................... SUCCESS [5.177s]

[INFO] hadoop-yarn-server-applicationhistoryservice ...... SUCCESS [11.399s]

[INFO] hadoop-yarn-server-resourcemanager ................ SUCCESS [28.384s]

[INFO] hadoop-yarn-server-tests .......................... SUCCESS [1.346s]

[INFO] hadoop-yarn-client ................................ SUCCESS [12.937s]

[INFO] hadoop-yarn-applications .......................... SUCCESS [0.108s]

[INFO] hadoop-yarn-applications-distributedshell ......... SUCCESS [5.303s]

[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS [3.212s]

[INFO] hadoop-yarn-site .................................. SUCCESS [0.050s]

[INFO] hadoop-yarn-project ............................... SUCCESS [8.638s]

[INFO] hadoop-mapreduce-client ........................... SUCCESS [0.135s]

[INFO] hadoop-mapreduce-client-core ...................... SUCCESS [43.622s]

[INFO] hadoop-mapreduce-client-common .................... SUCCESS [36.329s]

[INFO] hadoop-mapreduce-client-shuffle ................... SUCCESS [6.058s]

[INFO] hadoop-mapreduce-client-app ....................... SUCCESS [20.058s]

[INFO] hadoop-mapreduce-client-hs ........................ SUCCESS [16.493s]

[INFO] hadoop-mapreduce-client-jobclient ................. SUCCESS [11.685s]

[INFO] hadoop-mapreduce-client-hs-plugins ................ SUCCESS [3.222s]

[INFO] Apache Hadoop MapReduce Examples .................. SUCCESS [12.656s]

[INFO] hadoop-mapreduce .................................. SUCCESS [8.060s]

[INFO] Apache Hadoop MapReduce Streaming ................. SUCCESS [8.994s]

[INFO] Apache Hadoop Distributed Copy .................... SUCCESS [15.886s]

[INFO] Apache Hadoop Archives ............................ SUCCESS [6.659s]

[INFO] Apache Hadoop Rumen ............................... SUCCESS [15.722s]

[INFO] Apache Hadoop Gridmix ............................. SUCCESS [11.778s]

[INFO] Apache Hadoop Data Join ........................... SUCCESS [5.953s]

[INFO] Apache Hadoop Extras .............................. SUCCESS [6.414s]

[INFO] Apache Hadoop Pipes ............................... FAILURE [3.746s]

[INFO] Apache Hadoop OpenStack support ................... SKIPPED

[INFO] Apache Hadoop Client .............................. SKIPPED

[INFO] Apache Hadoop Mini-Cluster ........................ SKIPPED

[INFO] Apache Hadoop Scheduler Load Simulator ............ SKIPPED

[INFO] Apache Hadoop Tools Dist .......................... SKIPPED

[INFO] Apache Hadoop Tools ............................... SKIPPED

[INFO] Apache Hadoop Distribution ........................ SKIPPED

[INFO] ------------------------------------------------------------------------

[INFO] BUILD FAILURE

[INFO] ------------------------------------------------------------------------

[INFO] Total time: 19:43.155s

[INFO] Finished at: Wed Jun 04 17:40:17 CST 2014

[INFO] Final Memory: 79M/239M

[INFO] ------------------------------------------------------------------------

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1

[ERROR] around Ant part ...<exec dir="/data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/native" executable="cmake" failonerror="true">... @ 5:123 in /data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/antrun/build-main.xml

[ERROR] -> [Help 1]

[ERROR]

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

[ERROR]

[ERROR] For more information about the errors and possible solutions, please read the following articles:

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

[ERROR]

[ERROR] After correcting the problems, you can resume the build with the command

According to a post found online, openssl-devel must also be installed (yum install openssl-devel); if this step is skipped, the following error is reported:

[exec] CMake Error at /usr/share/cmake/Modules/FindOpenSSL.cmake:66 (MESSAGE):
[exec]   Could NOT find OpenSSL
[exec] Call Stack (most recent call first):
[exec]   CMakeLists.txt:20 (find_package)
[exec]
[exec]
[exec] -- Configuring incomplete, errors occurred!
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluen ... oExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-pipes

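Source of the quoted report: http://f.dataguru.cn/thread-189176-1-1.html

Cause: the package name was mistyped when installing openssl-devel (one "l" was missing), so the package was never installed.

Solution: install openssl-devel again: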

yum install openssl-devel

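5.3.3 Build summary

1. The system packages must be installed (yum install lzo-devel zlib-devel gcc autoconf automake libtool ncurses-devel openssl-devel).

2. The build tools protobuf and CMake must be installed.

3. Ant, Maven and FindBugs must be configured.

4. Pointing Maven at the OSChina repository speeds up the build, because the dependency jars download faster.

5. When the build fails, read the error log carefully, work out the cause from it, and then search Baidu or Google for a fix.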
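Each failed build above ends with Maven's hint that the build can be resumed with mvn <goals> -rf :<module>, which avoids rebuilding the modules that already succeeded. A sketch that reads a build log and prints the matching resume command; the function name resume_command is hypothetical, and the goals and flags are the ones used in section 5.3.2.

```shell
#!/bin/sh
# Read a Maven build log on stdin and print the command that resumes the
# build at the first failed module ("-rf" is Maven's --resume-from option).
resume_command() {
    module=$(sed -n 's/.*\[ERROR\] Failed to execute goal.* on project \([A-Za-z0-9_.-]*\):.*/\1/p' | head -n 1)
    if [ -n "$module" ]; then
        echo "mvn package -DskipTests -Pdist,native -Dtar -rf :$module"
    fi
}

# Usage, assuming the build output was saved to build.log:
#   resume_command < build.log
```

Nothing is printed when the log contains no "Failed to execute goal" line.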

5.4 Hadoop configuration files

cd hadoop-2.4.0

cd etc/hadoop

The following files need to be edited:

core-site.xml

yarn-site.xml

hdfs-site.xml

mapred-site.xml

hadoop-env.sh

vi core-site.xml

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://namenode57:9000</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/app/hadoop/current/tmp</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>/app/hadoop/current/data</value>

</property>

</configuration>

vi yarn-site.xml

<property>

<name>yarn.resourcemanager.address</name>

<value>namenode57:18040</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>namenode57:18030</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>namenode57:18088</value>

</property>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>namenode57:18025</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>namenode57:18141</value>

</property>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

vi hdfs-site.xml

<configuration>

<property>

<name>dfs.namenode.rpc-address</name>

<value>namenode57:9001</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>/app/hadoop/current/dfs</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>/app/hadoop/current/data</value>

</property>

</configuration>

vi mapred-site.xml

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>

vi hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_55

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

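5.5 Copy the Hadoop distribution to the other nodes

scp hadoop-2.4.0.tar.gz hadoop@192.168.1.58:/app/hadoop/

Repeat the copy for each of 192.168.1.58-60.

5.6 Format the NameNode

Run from /app/hadoop/hadoop-2.4.0/bin:

./hdfs namenode -format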

5.7 Start Hadoop

cd /app/hadoop/hadoop-2.4.0/sbin/

./start-all.sh

Check the Hadoop processes:

jps

ps -ef|grep java

14/06/04 07:48:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on

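This warning was printed on every start, so the 64-bit native library was built (see section 5.3, Build the Hadoop native library) and used to replace the bundled 32-bit one.

Delete the original 32-bit native library:

cd /app/hadoop/hadoop-2.4.0/lib

rm -rf native/

Copy the 64-bit native library built in section 5.3 to /app/hadoop/hadoop-2.4.0/lib:

cd /data/hadoop/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0/lib

cp -r ./native /app/hadoop/hadoop-2.4.0/lib/

Error 1: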

2014-06-04 18:30:57,450 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager

java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:98)

at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)

at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:220)

at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)

at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)

at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:186)

at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)

at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:357)

at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)

2014-06-04 18:30:57,458 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:

Solution:

vi /app/hadoop/hadoop-2.4.0/etc/hadoop/yarn-site.xml

<property> 

    <name>yarn.nodemanager.aux-services</name> 

    <value>mapreduce.shuffle</value> 

  </property>

to

<property> 

    <name>yarn.nodemanager.aux-services</name> 

    <value>mapreduce_shuffle</value> 

  </property>
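The rule in the error message (only a-zA-Z0-9_, not starting with a number) can be checked before restarting the NodeManager. A small sketch with a hypothetical function name, valid_aux_service_name, applied to the old and new values from this section:

```shell
#!/bin/sh
# Succeed only if the name uses just a-zA-Z0-9_ and does not start with a
# digit, which is the rule quoted in the NodeManager error message.
valid_aux_service_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*$'
}

for name in mapreduce.shuffle mapreduce_shuffle; do
    if valid_aux_service_name "$name"; then
        echo "$name: valid"
    else
        echo "$name: invalid"
    fi
done
```

The dot in mapreduce.shuffle is what makes the old value fail the check.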

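Summary:

Since Hadoop 2.0, start-all.sh and stop-all.sh are deprecated. Use start-dfs.sh and start-yarn.sh to start Hadoop instead; see the official documentation for details.

5.8 Check that Hadoop started

5.8.1 Check that the NameNode is running

On 192.168.1.57:

jps

5.8.2 Check that the DataNodes are running

On each of 192.168.1.58-60:

jps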

6 Hadoop 2.4 verification

6.1 Hadoop web consoles

HDFS

http://192.168.1.57:50070/dfshealth.html#tab-overview

YARN

http://namenode57:8088/cluster/

6.2 Running the Hadoop wordcount example

6.2.1 Create directories and copy local files into HDFS

1. Create the /tmp/input directory

hadoop fs -mkdir /tmp

hadoop fs -mkdir /tmp/input

2. Copy the local files into HDFS

hadoop fs -put /usr/hadoop/file* /tmp/input

3. Check that test.txt was uploaded to HDFS

hadoop fs -ls /tmp/input

6.2.2 Run the wordcount program

hadoop jar hadoop-mapreduce-examples-2.4.0.jar wordcount /tmp/input /tmp/output
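To sanity-check the job's result, the same count can be computed locally on the input files and compared with the job output. A sketch, assuming the input is plain text split on whitespace; the function name local_wordcount is hypothetical, and the exact tokenization of the Hadoop example may differ in edge cases.

```shell
#!/bin/sh
# Count whitespace-separated words on stdin and print "word<TAB>count",
# sorted by word, similar in shape to the wordcount job's output.
local_wordcount() {
    tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage with the paths from this section:
#   cat /usr/hadoop/file* | local_wordcount > local.txt
#   hadoop fs -cat /tmp/output/part-r-00000 > job.txt
#   diff local.txt job.txt
```

The part-r-00000 name assumes a single reducer; with more reducers the output is split across several part-r-* files.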

7 Hadoop installation errors

7.1 Hadoop 2.4 cannot connect to the ResourceManager

Error log:

ountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:07,154 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:08,156 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:09,159 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:10,161 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:11,164 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:12,166 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:13,169 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-06-05 07:01:14,171 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Solution: add the following to yarn-site.xml on every machine:

<property>

<name>yarn.resourcemanager.hostname</name>

<value>namenode57</value>

</property>
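After changing yarn-site.xml and restarting, the NodeManager log should no longer show retries against 0.0.0.0. A sketch for counting such lines; the function name count_unset_rm_retries is hypothetical, and the log file path in the usage comment is an assumption that depends on the user and host name.

```shell
#!/bin/sh
# Count log lines (stdin) in which a client retries against 0.0.0.0, the
# symptom shown above when the ResourceManager host is not configured.
count_unset_rm_retries() {
    grep -c 'Retrying connect to server: 0\.0\.0\.0/' || true
}

# Usage on a DataNode (log path is an assumption):
#   count_unset_rm_retries < /app/hadoop/hadoop-2.4.0/logs/yarn-hadoop-nodemanager-datanode58.log
```

A count of 0 on a freshly started NodeManager suggests the new host name setting is being picked up.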