Hadoop 1.1 and Hadoop 2.4 Cluster Installation
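Contents
1 Environment Preparation
1.1 Hardware Configuration
1.2 Software
1.3 Network Topology
1.4 System Partitions
1.5 Virtual Machine Configuration
1.6 Passwordless SSH Login
1.7 JDK Installation
2 Hadoop 1.2.1 Installation and Configuration
2.1 Downloading the Hadoop Installation Package
2.2 Extracting the Hadoop Installation Package
2.3 Hadoop Configuration Files
2.4 Copying the Hadoop Installation Package
2.5 Configuring HADOOP_HOME
2.6 Formatting the NameNode
2.7 Starting Hadoop
3 Hadoop 1.2.1 Verification
3.1 Hadoop Web Console
3.2 Running the Hadoop wordcount Example
4 Hadoop 2.4 Environment Preparation
4.1 Virtual Machine Preparation
4.2 Network Topology
4.3 System Partitions
4.4 Virtual Machine Configuration
4.5 Passwordless SSH Login Configuration
4.6 JDK Installation
5 Hadoop 2.4 Installation and Configuration
5.1 Downloading the Hadoop Installation Package
5.2 Extracting the Hadoop Installation Package
5.3 Building the Hadoop Native Library
5.3.1 Preparing the Build Environment
5.3.2 Building Hadoop
5.3.3 Build Summary
5.4 Hadoop Configuration Files
5.5 Copying the Hadoop Installation Package
5.6 Formatting the NameNode
5.7 Starting Hadoop
5.8 Checking Hadoop Startup
5.8.1 Checking Whether the NameNode Has Started
5.8.2 Checking Whether the DataNodes Have Started
6 Hadoop 2.4 Verification
6.1 Hadoop Web Console
6.2 Running the Hadoop wordcount Example
6.2.1 Creating a Directory and Copying Local Files into HDFS
6.2.2 Running the wordcount Program
7 Hadoop Installation Errors
7.1 Hadoop 2.4 Cannot Connect to the ResourceManager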
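1 Environment Preparation
1.1 Hardware Configuration
Dell 960, CPU: Intel Core 2 Quad Q8300 @ 2.50GHz
Memory: 4GB
Hard disk: 320GB
Windows 7 is installed on the host, and the virtual machines run on top of it.
1.2 Software
- CentOS 6.5
- VMware Workstation 10.0.2
- SecureCRT 7.0
- WinSCP 5.5.3
- JDK 1.6.0_43
- Hadoop 1.2.1
- Eclipse 3.6
- Windows 7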
1.3 Network Topology
192.168.1.53 namenode53
192.168.1.54 datanode54
192.168.1.55 datanode55
192.168.1.56 datanode56
1.4 System Partitions
/dev/sda6 4.0G 380M 3.4G 10% /
tmpfs 495M 72K 495M 1% /dev/shm
/dev/sda2 7.9G 419M 7.1G 6% /app
/dev/sda3 7.9G 146M 7.4G 2% /applog
/dev/sda1 194M 30M 155M 16% /boot
/dev/sda5 5.8G 140M 5.3G 3% /data
/dev/sda8 2.0G 129M 1.8G 7% /home
/dev/sda9 2.0G 68M 1.9G 4% /opt
/dev/sda12 2.0G 36M 1.9G 2% /tmp
/dev/sda7 4.0G 3.3G 509M 87% /usr
/dev/sda10 2.0G 397M 1.5G 21% /var
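1.5 Virtual Machine Configuration
Four virtual machines, each configured with 1 CPU, 1GB of memory, and a 40GB disk, running 64-bit CentOS 6.5.
Figure 1.5-1
Create two users on each machine: root and hadoop.
root is the administrator account.
hadoop is the account that runs Hadoop.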
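1.6 Passwordless SSH Login
Step 1. Create the .ssh directory under the user's home directory
mkdir .ssh
Names that begin with a dot are hidden; list them with:
ls -a
Step 2. Generate the key pair
Two key types are available: dsa and rsa.
ssh-keygen -t rsa -P '' -b 1024
ssh-keygen  the key generation command
-t  key type, here rsa
-P  passphrase, here empty ('')
-b  key size in bits, here 1024
This produces a private key file and a public key file:
ll
-rwx------ 1 apch apache 883 May 20 15:13 id_rsa
-rwx------ 1 apch apache 224 May 20 15:13 id_rsa.pub
Step 3. Append the public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(this appends the generated public key to the authorized_keys file)
Step 4. Set file and directory permissions
Set the permissions on the .ssh directory first, because the recursive chmod would otherwise overwrite the mode of authorized_keys:
$ chmod 700 -R ~/.ssh
Then set the permissions on authorized_keys:
$ chmod 600 ~/.ssh/authorized_keys
Step 5. Edit /etc/ssh/sshd_config (as root)
vi /etc/ssh/sshd_config
Protocol 2 (use SSH2 only)
PermitRootLogin yes (allow root to log in over SSH; set according to the login account)
ServerKeyBits 1024 (set the server key strength to 1024 bits)
PasswordAuthentication no (disable password login)
PermitEmptyPasswords no (disallow login with an empty password)
RSAAuthentication yes (enable RSA authentication)
PubkeyAuthentication yes (enable public key authentication)
AuthorizedKeysFile .ssh/authorized_keys
Step 6. Restart the sshd service (as root)
service sshd restart
Step 7. Test locally
ssh -v localhost (-v enables verbose debug output)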
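The steps above only authorize logins to the local machine. For start-all.sh on the NameNode to reach the DataNodes, the NameNode's public key also has to be present in authorized_keys on each DataNode. The sketch below is a hypothetical helper, not part of the original procedure: it only prints the commands that would do this, using the hadoop user and the DataNode addresses from sections 1.3 and 1.5. It assumes password login is still enabled on the target hosts when the printed commands are actually run.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the original text): print, but do
# not execute, one command per host that appends the local public key to the
# remote user's authorized_keys file.
print_key_copy_commands() {
    user="$1"; shift
    for host in "$@"; do
        printf 'cat ~/.ssh/id_rsa.pub | ssh %s@%s "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"\n' "$user" "$host"
    done
}

# DataNode addresses from section 1.3.
print_key_copy_commands hadoop 192.168.1.54 192.168.1.55 192.168.1.56
```

Review the printed commands and run them from the NameNode as the hadoop user before setting PasswordAuthentication no.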
1.7 JDK Installation
Download jdk-6u43-linux-x64.bin from the Oracle website.
Install it under /usr/java.
Configure JAVA_HOME by adding the following to /etc/profile:
export JAVA_HOME=/usr/java/jdk1.6.0_43
export JAVA_BIN=/usr/java/jdk1.6.0_43/bin
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
Apply the configuration:
source /etc/profile
Verify the installation: java -version
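2 Hadoop 1.2.1 Installation and Configuration
2.1 Downloading the Hadoop Installation Package
Download from: http://mirrors.hust.edu.cn/apache/hadoop/common/
Upload the package to the Linux system with rz (this requires the package that provides rz and sz to be installed).
2.2 Extracting the Hadoop Installation Package
Install Hadoop under /app/hadoop/:
cd /app/hadoop/
tar -zxvf /home/hadoop/hadoop-1.2.1.tar.gz
/app/hadoop/ must be owned by hadoop:hadoop (user:group).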
2.3 Hadoop Configuration Files
The following configuration files need to be edited:
- hadoop-env.sh
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- masters
- slaves
vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_43
vi core-site.xml
vi hdfs-site.xml
Add the following:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
vi mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>http://192.168.1.53:9001</value>
</property>
</configuration>
vi masters
192.168.1.53
vi slaves
192.168.1.54
192.168.1.55
192.168.1.56
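2.4 Copying the Hadoop Installation Package
After the configuration in section 2.3, package the configured Hadoop directory on 192.168.1.53:
tar -zcvf hadoop.tar.gz ./hadoop/
Copy hadoop.tar.gz to each of 192.168.1.54-56:
scp hadoop.tar.gz hadoop@192.168.1.54:/app/hadoop/
Then log in to each of 192.168.1.54-56 and extract it:
cd /app/hadoop/
tar -zxvf hadoop.tar.gz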
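2.5 Configuring HADOOP_HOME
Configure HADOOP_HOME on 192.168.1.53-56:
vi /etc/profile
# set hadoop path
export HADOOP_HOME=/app/hadoop/hadoop
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$PATH:$HADOOP_HOME/bin
Apply the configuration:
source /etc/profile
The line export HADOOP_HOME_WARN_SUPPRESS=1 suppresses the warning that Hadoop otherwise prints at startup ("Warning: $HADOOP_HOME is deprecated.").
Verify the configuration:
hadoop version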
2.6 Formatting the NameNode
Format the HDFS file system:
hadoop namenode -format
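2.7 Starting Hadoop
Log in to 192.168.1.53 with SecureCRT 7.0.
Commands:
cd /app/hadoop/hadoop/bin/
./start-all.sh
Check the Hadoop processes on the NameNode host (192.168.1.53), for example with jps.
NameNode, JobTracker, and SecondaryNameNode should be listed as running.
Check the Hadoop processes on a DataNode host (192.168.1.54).
DataNode and TaskTracker should be listed as running.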
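The process check above can be scripted. The following is a minimal sketch, assuming jps prints one "<pid> <MainClass>" line per JVM; the function name missing_daemons is hypothetical. It reads jps output on standard input and prints each expected daemon that is not present, matching the whole class name so that SecondaryNameNode is not mistaken for NameNode.

```shell
#!/bin/sh
# Hypothetical helper: print every expected daemon name that does not appear
# as the second field of the jps output supplied on stdin.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # grep -x requires the whole line (the class name) to match.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
            printf '%s\n' "$daemon"
        fi
    done
}

# On the NameNode host (192.168.1.53):
#   jps | missing_daemons NameNode SecondaryNameNode JobTracker
# On a DataNode host (192.168.1.54-56):
#   jps | missing_daemons DataNode TaskTracker
```

Empty output means that every listed daemon was found.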
3 Hadoop 1.2.1 Verification
3.1 Hadoop Web Console
http://192.168.1.53:50070/dfshealth.jsp
http://192.168.1.53:50030/jobtracker.jsp
3.2 Running the Hadoop wordcount Example
hadoop jar hadoop-examples-1.2.1.jar wordcount /tmp/input /tmp/output
The progress and result of the job can be monitored in the JobTracker console.
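4 Hadoop 2.4 Environment Preparation
4.1 Virtual Machine Preparation
Clone CentOS-57 through CentOS-60 from CentOS-53.
Virtual machine name, IP address, and hostname of each clone:
CentOS-57 192.168.1.57 namenode57
CentOS-58 192.168.1.58 datanode58
CentOS-59 192.168.1.59 datanode59
CentOS-60 192.168.1.60 datanode60
(VMware's clone feature quickly copies an already installed system. After cloning, however, the eth0 interface is missing. Switch to the root user to fix this; other users lack the required permissions.)
Configure the network interface:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Update the MAC address in the interface configuration:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Change the hostname:
vi /etc/sysconfig/network
Reboot the system:
reboot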
4.2 Network Topology
See section 1.3 for the layout.
Update /etc/hosts on each of 192.168.1.57-60. Switch to the root user:
su -
vi /etc/hosts
Add the following:
192.168.1.57 namenode57
192.168.1.58 datanode58
192.168.1.59 datanode59
192.168.1.60 datanode60
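Since the same four lines are needed on every node, they can be generated rather than typed. The sketch below is an illustration only; hosts_entries is a hypothetical function that assumes the naming scheme used in this guide, where the first address is the NameNode, the rest are DataNodes, and each hostname ends in the last octet of its address.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/hosts lines for a NameNode followed by
# consecutive DataNodes. Arguments: network prefix, first octet, last octet.
hosts_entries() {
    prefix="$1"; first="$2"; last="$3"
    i="$first"
    while [ "$i" -le "$last" ]; do
        if [ "$i" -eq "$first" ]; then role=namenode; else role=datanode; fi
        printf '%s.%s %s%s\n' "$prefix" "$i" "$role" "$i"
        i=$((i + 1))
    done
}

# Prints the four entries listed above.
hosts_entries 192.168.1 57 60
```

As root, the output could be appended with hosts_entries 192.168.1 57 60 >> /etc/hosts, after checking that the entries are not already present.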
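4.3 System Partitions
See section 1.4.
4.4 Virtual Machine Configuration
See section 1.5.
4.5 Passwordless SSH Login Configuration
See section 1.6.
4.6 JDK Installation
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
jdk-7u55-linux-x64.rpm
Install the JDK on each of 192.168.1.57-60.
Upload the JDK 7 package jdk-7u55-linux-x64.rpm to 192.168.1.57:
rz
rpm -i jdk-7u55-linux-x64.rpm
After the installation completes, check the installation directory:
cd /usr/java
Installation directory: /usr/java/jdk1.7.0_55
Configure JAVA_HOME:
vi /etc/profile
Add the following: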
export JAVA_HOME=/usr/java/jdk1.7.0_55
export JAVA_BIN=/usr/java/jdk1.7.0_55/bin
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
Apply the configuration:
source /etc/profile
Check the installed version:
java -version
If the version is printed, the installation succeeded.
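5 Hadoop 2.4 Installation and Configuration
5.1 Downloading the Hadoop Installation Package
http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.4.0/
Upload the package to the Linux system with rz (this requires the package that provides rz and sz to be installed).
Log in to 192.168.1.57.
5.2 Extracting the Hadoop Installation Package
Installation directory: /app/hadoop/hadoop-2.4.0 (created when the archive is extracted)
Log in to each of 192.168.1.57-60 as the hadoop user and extract the package:
cd /app/hadoop/
tar -zxvf /home/hadoop/hadoop-2.4.0.tar.gz
Note: Hadoop 2.4.0 differs substantially from 1.x; even the configuration directory has moved.
Check whether the bundled native library is 32-bit or 64-bit: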
cd /app/hadoop/hadoop-2.4.0/lib/native
file libhadoop.so.1.0.0
libhadoop.so.1.0.0: ELF 32-bit LSB shared object,
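The output above shows that the bundled library is 32-bit, which is why section 5.3 rebuilds it on the 64-bit system. As a small sketch, the same decision can be made from the text that file prints; is_64bit_elf is a hypothetical helper name, and it only inspects the string it is given.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a line of `file` output describes a
# 64-bit ELF object, fail otherwise.
is_64bit_elf() {
    case "$1" in
        *"ELF 64-bit"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the cluster (path taken from this section):
#   is_64bit_elf "$(file /app/hadoop/hadoop-2.4.0/lib/native/libhadoop.so.1.0.0)" \
#       || echo "native library is not 64-bit; rebuild it as in section 5.3"
```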
Configure the environment variables:
su - (switch to root)
vi /etc/profile
Add the following:
export HADOOP_HOME=/app/hadoop/hadoop-2.4.0
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Apply the configuration:
source /etc/profile
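5.3 Building the Hadoop Native Library
5.3.1 Preparing the Build Environment
The build is done on 64-bit CentOS 6.5.
The main tools involved are: hadoop-2.4.0-src.tar.gz, Ant, Maven, JDK, GCC, CMake, and openssl.
First, install or upgrade the system packages that the build needs (latest versions):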
yum install lzo-devel zlib-devel gcc autoconf automake libtool ncurses-devel openssl-devel
wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.4.0/hadoop-2.4.0-src.tar.gz (source release)
tar -zxvf hadoop-2.4.0-src.tar.gz
wget http://apache.fayea.com/apache-mirror//ant/binaries/apache-ant-1.9.4-bin.tar.gz
tar -xvf apache-ant-1.9.4-bin.tar.gz
wget http://apache.fayea.com/apache-mirror/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz
tar -xvf apache-maven-3.0.5-bin.tar.gz
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_55
export JAVA_BIN=/usr/java/jdk1.7.0_55/bin
export ANT_HOME=/home/hadoop/ant
export MVN_HOME=/home/hadoop/maven
export FINDBUGS_HOME=/home/hadoop/findbugs-2.0.3
export PATH=$PATH:$JAVA_HOME/bin:$ANT_HOME/bin:$MVN_HOME/bin:$FINDBUGS_HOME/bin
Apply the configuration:
source /etc/profile
Verify the configuration:
ant -version
mvn -version
findbugs -version
Install protobuf (as root):
wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
tar zxf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure
make
make install
protoc --version
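Install CMake (as root). The steps below use ./bootstrap, which ships only in the CMake source tarball, so download cmake-2.8.12.2.tar.gz rather than the prebuilt cmake-2.8.12.2-Linux-i386.tar.gz binary archive:
wget http://www.cmake.org/files/v2.8/cmake-2.8.12.2.tar.gz
tar zxf cmake-2.8.12.2.tar.gz
cd cmake-2.8.12.2
./bootstrap
make
make install
cmake --version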
To speed up the build, point the Maven mirror at the OSChina (开源中国) repository:
cd maven/conf
vi settings.xml
Add the following:
<mirror>
<id>nexus-osc</id>
<mirrorOf>*</mirrorOf>
<name>Nexus osc</name>
<url>http://maven.oschina.net/content/groups/public/</url>
</mirror>
<mirror>
<id>nexus-osc-thirdparty</id>
<mirrorOf>thirdparty</mirrorOf>
<name>Nexus osc thirdparty</name>
<url>http://maven.oschina.net/content/repositories/thirdparty/</url>
</mirror>
<profile>
<id>jdk-1.4</id>
<activation>
<jdk>1.4</jdk>
</activation>
<repositories>
<repository>
<id>nexus</id>
<name>local private nexus</name>
<url>http://maven.oschina.net/content/groups/public/</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>nexus</id>
<name>local private nexus</name>
<url>http://maven.oschina.net/content/groups/public/</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
</pluginRepositories>
</profile>
For details, see:
http://maven.oschina.net/help.html
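5.3.2 Building Hadoop
mvn package -DskipTests -Pdist,native -Dtar
Maven now downloads all dependencies and plugins.
This takes a long time (after about 6 hours, the first build error finally appeared; the errors are listed below).
Once the build succeeds, check that the native library was built:
cd hadoop-dist/target/hadoop-2.4.0/lib/native
file libhadoop.so.1.0.0
libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
The "ELF 64-bit" output indicates that the native build succeeded.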
Error 1
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 46.796s
[INFO] Finished at: Wed Jun 04 13:28:37 CST 2014
[INFO] Final Memory: 36M/88M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project hadoop-common: Could not resolve dependencies for project org.apache.hadoop:hadoop-common:jar:2.4.0: Failure to find org.apache.commons:commons-compress:jar:1.4.1 in https://repository.apache.org/content/repositories/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots.https has elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hadoop-common
Solution:
The log above reports that "org.apache.commons:commons-compress:jar:1.4.1" cannot be found.
Copying that jar from the local (Windows) machine to the Linux system resolved the problem.
Error 2
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:16.693s
[INFO] Finished at: Wed Jun 04 13:56:31 CST 2014
[INFO] Final Memory: 48M/239M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-common: An Ant BuildException has occured: Execute failed: java.io.IOException: Cannot run program "cmake" (in directory "/home/hadoop/hadoop-2.4.0-src/hadoop-common-project/hadoop-common/target/native"): error=2, 没有那个文件或目录
[ERROR] around Ant part ...<exec dir="/home/hadoop/hadoop-2.4.0-src/hadoop-common-project/hadoop-common/target/native" executable="cmake" failonerror="true">... @ 4:133 in /home/hadoop/hadoop-2.4.0-src/hadoop-common-project/hadoop-common/target/antrun/build-main.xml
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hadoop-common
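Solution:
The cause is that cmake is not installed. Install cmake as described in section 5.3.1, Preparing the Build Environment.
Error 3
The error reports that files cannot be found and directories cannot be created. No matching report was found online. Based on experience, changing the directory permissions to 775 gives the build permission to create files and directories. Also make sure that the Hadoop build directory has 2.5GB to 4GB of free space.
chmod -Rf 775 ./hadoop-2.4.0-src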
main:
[mkdir] Created dir: /data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/test-dir
[INFO] Executed tasks
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (make) @ hadoop-pipes ---
[INFO] Executing tasks
Error 4
main:
[mkdir] Created dir: /data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/native
[exec] -- The C compiler identification is GNU 4.4.7
[exec] -- The CXX compiler identification is GNU 4.4.7
[exec] -- Check for working C compiler: /usr/bin/cc
[exec] -- Check for working C compiler: /usr/bin/cc -- works
[exec] -- Detecting C compiler ABI info
[exec] -- Detecting C compiler ABI info - done
[exec] -- Check for working CXX compiler: /usr/bin/c++
[exec] -- Check for working CXX compiler: /usr/bin/c++ -- works
[exec] -- Detecting CXX compiler ABI info
[exec] -- Detecting CXX compiler ABI info - done
[exec] CMake Error at /usr/local/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:108 (message):
[exec] Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the
[exec] system variable OPENSSL_ROOT_DIR (missing: OPENSSL_LIBRARIES
[exec] OPENSSL_INCLUDE_DIR)
[exec] Call Stack (most recent call first):
[exec] /usr/local/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:315 (_FPHSA_FAILURE_MESSAGE)
[exec] /usr/local/share/cmake-2.8/Modules/FindOpenSSL.cmake:313 (find_package_handle_standard_args)
[exec] CMakeLists.txt:20 (find_package)
[exec]
[exec]
[exec] -- Configuring incomplete, errors occurred!
[exec] See also "/data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/native/CMakeFiles/CMakeOutput.log".
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................ SUCCESS [13.745s]
[INFO] Apache Hadoop Project POM ......................... SUCCESS [5.538s]
[INFO] Apache Hadoop Annotations ......................... SUCCESS [7.296s]
[INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.568s]
[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [5.858s]
[INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [8.541s]
[INFO] Apache Hadoop MiniKDC ............................. SUCCESS [8.337s]
[INFO] Apache Hadoop Auth ................................ SUCCESS [7.348s]
[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [4.926s]
[INFO] Apache Hadoop Common .............................. SUCCESS [2:35.956s]
[INFO] Apache Hadoop NFS ................................. SUCCESS [18.680s]
[INFO] Apache Hadoop Common Project ...................... SUCCESS [0.059s]
[INFO] Apache Hadoop HDFS ................................ SUCCESS [5:03.525s]
[INFO] Apache Hadoop HttpFS .............................. SUCCESS [38.335s]
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [23.780s]
[INFO] Apache Hadoop HDFS-NFS ............................ SUCCESS [8.769s]
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.159s]
[INFO] hadoop-yarn ....................................... SUCCESS [0.134s]
[INFO] hadoop-yarn-api ................................... SUCCESS [2:07.657s]
[INFO] hadoop-yarn-common ................................ SUCCESS [1:10.680s]
[INFO] hadoop-yarn-server ................................ SUCCESS [0.165s]
[INFO] hadoop-yarn-server-common ......................... SUCCESS [24.174s]
[INFO] hadoop-yarn-server-nodemanager .................... SUCCESS [27.293s]
[INFO] hadoop-yarn-server-web-proxy ...................... SUCCESS [5.177s]
[INFO] hadoop-yarn-server-applicationhistoryservice ...... SUCCESS [11.399s]
[INFO] hadoop-yarn-server-resourcemanager ................ SUCCESS [28.384s]
[INFO] hadoop-yarn-server-tests .......................... SUCCESS [1.346s]
[INFO] hadoop-yarn-client ................................ SUCCESS [12.937s]
[INFO] hadoop-yarn-applications .......................... SUCCESS [0.108s]
[INFO] hadoop-yarn-applications-distributedshell ......... SUCCESS [5.303s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS [3.212s]
[INFO] hadoop-yarn-site .................................. SUCCESS [0.050s]
[INFO] hadoop-yarn-project ............................... SUCCESS [8.638s]
[INFO] hadoop-mapreduce-client ........................... SUCCESS [0.135s]
[INFO] hadoop-mapreduce-client-core ...................... SUCCESS [43.622s]
[INFO] hadoop-mapreduce-client-common .................... SUCCESS [36.329s]
[INFO] hadoop-mapreduce-client-shuffle ................... SUCCESS [6.058s]
[INFO] hadoop-mapreduce-client-app ....................... SUCCESS [20.058s]
[INFO] hadoop-mapreduce-client-hs ........................ SUCCESS [16.493s]
[INFO] hadoop-mapreduce-client-jobclient ................. SUCCESS [11.685s]
[INFO] hadoop-mapreduce-client-hs-plugins ................ SUCCESS [3.222s]
[INFO] Apache Hadoop MapReduce Examples .................. SUCCESS [12.656s]
[INFO] hadoop-mapreduce .................................. SUCCESS [8.060s]
[INFO] Apache Hadoop MapReduce Streaming ................. SUCCESS [8.994s]
[INFO] Apache Hadoop Distributed Copy .................... SUCCESS [15.886s]
[INFO] Apache Hadoop Archives ............................ SUCCESS [6.659s]
[INFO] Apache Hadoop Rumen ............................... SUCCESS [15.722s]
[INFO] Apache Hadoop Gridmix ............................. SUCCESS [11.778s]
[INFO] Apache Hadoop Data Join ........................... SUCCESS [5.953s]
[INFO] Apache Hadoop Extras .............................. SUCCESS [6.414s]
[INFO] Apache Hadoop Pipes ............................... FAILURE [3.746s]
[INFO] Apache Hadoop OpenStack support ................... SKIPPED
[INFO] Apache Hadoop Client .............................. SKIPPED
[INFO] Apache Hadoop Mini-Cluster ........................ SKIPPED
[INFO] Apache Hadoop Scheduler Load Simulator ............ SKIPPED
[INFO] Apache Hadoop Tools Dist .......................... SKIPPED
[INFO] Apache Hadoop Tools ............................... SKIPPED
[INFO] Apache Hadoop Distribution ........................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 19:43.155s
[INFO] Finished at: Wed Jun 04 17:40:17 CST 2014
[INFO] Final Memory: 79M/239M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...<exec dir="/data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/native" executable="cmake" failonerror="true">... @ 5:123 in /data/hadoop/hadoop-2.4.0-src/hadoop-tools/hadoop-pipes/target/antrun/build-main.xml
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
According to advice found online (openssl-devel must also be installed, with yum install openssl-devel; if this step is skipped, the following error is reported:
[exec] CMake Error at /usr/share/cmake/Modules/FindOpenSSL.cmake:66 (MESSAGE):
[exec] Could NOT find OpenSSL
[exec] Call Stack (most recent call first):
[exec] CMakeLists.txt:20 (find_package)
[exec]
[exec]
[exec] -- Configuring incomplete, errors occurred!
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluen ... oExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hadoop-pipes
)
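Source of the advice: http://f.dataguru.cn/thread-189176-1-1.html
Cause: the package name was mistyped when installing openssl-devel (one "l" was missing), so the package was never installed.
Solution: reinstall openssl-devel:
yum install openssl-devel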
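5.3.3 Build Summary
1. The system packages must be installed (yum install lzo-devel zlib-devel gcc autoconf automake libtool ncurses-devel openssl-devel).
2. The build tools protobuf and CMake must be installed.
3. Ant, Maven, and FindBugs must be configured.
4. Pointing Maven at the OSChina mirror speeds up the build by speeding up the download of dependency jars.
5. When the build fails, read the error log carefully, work out the cause from it, and then search Baidu and Google for a fix.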
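Two of the failures above (the missing cmake and the missing OpenSSL headers) surfaced only after long build runs. A quick check before starting mvn can catch missing command-line tools early. This is a minimal sketch: missing_tools is a hypothetical helper, the tool list is taken from section 5.3.1, and it cannot detect missing development headers such as openssl-devel.

```shell
#!/bin/sh
# Hypothetical helper: print every named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command-line tools named in section 5.3.1.
missing_tools gcc make cmake protoc ant mvn findbugs
```

Empty output means that all listed tools are on the PATH; any name printed needs to be installed or added to PATH before running the build.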
5.4 Hadoop Configuration Files
cd hadoop-2.4.0
cd etc/hadoop
Edit the following files:
core-site.xml
yarn-site.xml
hdfs-site.xml
mapred-site.xml
hadoop-env.sh
vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode57:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/current/tmp</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/app/hadoop/current/data</value>
</property>
</configuration>
vi yarn-site.xml
<property>
<name>yarn.resourcemanager.address</name>
<value>namenode57:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>namenode57:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>namenode57:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>namenode57:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>namenode57:18141</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.rpc-address</name>
<value>namenode57:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/app/hadoop/current/dfs</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/app/hadoop/current/data</value>
</property>
</configuration>
vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_55
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
5.5 Copying the Hadoop Installation Package
scp hadoop-2.4.0.tar.gz hadoop@192.168.1.58:/app/hadoop/
Repeat for each of 192.168.1.58-60.
5.6 Formatting the NameNode
cd /app/hadoop/hadoop-2.4.0/bin
./hdfs namenode -format
5.7 Starting Hadoop
cd /app/hadoop/hadoop-2.4.0/sbin/
./start-all.sh
Check the Hadoop processes:
jps
ps -ef|grep java
14/06/04 07:48:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on
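This warning appeared on every start, so the 64-bit Hadoop native library was built (section 5.3, Building the Hadoop Native Library) to replace the bundled 32-bit one.
Replace the 32-bit native library as follows.
Remove the original 32-bit native library:
cd /app/hadoop/hadoop-2.4.0/lib
rm -rf native/
Copy the 64-bit native library built in section 5.3 to /app/hadoop/hadoop-2.4.0/lib: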
cd /data/hadoop/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0/lib
cp -r ./native /app/hadoop/hadoop-2.4.0/lib/
Error 1:
2014-06-04 18:30:57,450 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:98)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:220)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:186)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:357)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
2014-06-04 18:30:57,458 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
Solution:
vi /app/hadoop/hadoop-2.4.0/etc/hadoop/yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
Change it to:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
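The rule quoted in the NodeManager error (only a-zA-Z0-9_, and not starting with a number) can be checked before restarting the cluster. The sketch below encodes that rule as a regular expression; it is one reading of the error message, not Hadoop's own validation code, and is_valid_aux_service_name is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the name contains only letters, digits,
# and underscores and does not start with a digit, as the error text requires.
is_valid_aux_service_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*$'
}

# The old value fails the check; the corrected value passes.
is_valid_aux_service_name mapreduce.shuffle || echo "mapreduce.shuffle is invalid"
is_valid_aux_service_name mapreduce_shuffle && echo "mapreduce_shuffle is valid"
```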
Summary:
Since Hadoop 2.0, the start-all.sh and stop-all.sh scripts are deprecated. Use start-dfs.sh and start-yarn.sh to start Hadoop instead; see the official documentation for details.
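A minimal sketch of the recommended start sequence follows, assuming the scripts live under sbin/ of the installation directory used in this guide (/app/hadoop/hadoop-2.4.0). The wrapper start_hadoop2 is hypothetical; it only verifies that both scripts are executable and then runs HDFS before YARN.

```shell
#!/bin/sh
# Hypothetical wrapper: start HDFS first, then YARN, using the per-subsystem
# scripts under <hadoop_home>/sbin. Fails early if either script is missing.
start_hadoop2() {
    hadoop_home="$1"
    for script in start-dfs.sh start-yarn.sh; do
        if [ ! -x "$hadoop_home/sbin/$script" ]; then
            echo "not found or not executable: $hadoop_home/sbin/$script" >&2
            return 1
        fi
    done
    "$hadoop_home/sbin/start-dfs.sh" && "$hadoop_home/sbin/start-yarn.sh"
}

# On the NameNode host, as the hadoop user:
#   start_hadoop2 /app/hadoop/hadoop-2.4.0
```

Starting HDFS first means YARN is started only if start-dfs.sh exits successfully.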
5.8 Checking Hadoop Startup
5.8.1 Checking Whether the NameNode Has Started
Check on 192.168.1.57:
jps
5.8.2 Checking Whether the DataNodes Have Started
Check on each of 192.168.1.58-60:
jps
6 Hadoop 2.4 Verification
6.1 Hadoop Web Console
HDFS
http://192.168.1.57:50070/dfshealth.html#tab-overview
YARN
http://namenode57:8088/cluster/
6.2 Running the Hadoop wordcount Example
6.2.1 Creating a Directory and Copying Local Files into HDFS
1. Create the /tmp/input directory
hadoop fs -mkdir /tmp
hadoop fs -mkdir /tmp/input
2. Copy the local files into HDFS
hadoop fs -put /usr/hadoop/file* /tmp/input
3. Check that the files were uploaded to HDFS
hadoop fs -ls /tmp/input
6.2.2 Running the wordcount Program
hadoop jar hadoop-mapreduce-examples-2.4.0.jar wordcount /tmp/input /tmp/output
7 Hadoop Installation Errors
7.1 Hadoop 2.4 Cannot Connect to the ResourceManager
Error log:
ountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:07,154 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:08,156 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:09,159 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:10,161 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:11,164 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:12,166 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:13,169 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-06-05 07:01:14,171 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Solution: add the following to yarn-site.xml on every machine:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>namenode57</value>
</property>
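The address 0.0.0.0:8031 in the log is what a NodeManager falls back to when no ResourceManager host is configured on that machine (the default hostname is 0.0.0.0 and the default resource-tracker port is 8031). As a sketch, the symptom can be detected in a log with a single grep; retries_unset_rm_address is a hypothetical helper, and the location of the NodeManager log file is not given in this guide, so the usage line leaves it as a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the log text on stdin contains a retry
# against 0.0.0.0:8031, the symptom shown in the error log above.
retries_unset_rm_address() {
    grep -q 'Retrying connect to server: 0\.0\.0\.0/0\.0\.0\.0:8031'
}

# Usage (the log path is a placeholder; substitute the NodeManager log file):
#   retries_unset_rm_address < /path/to/nodemanager.log \
#       && echo "set yarn.resourcemanager.hostname in yarn-site.xml on this node"
```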