
DataNode fails to start after replacing the Hadoop native library files

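Hadoop was installed from Cloudera's official yum repository on servers running 64-bit CentOS 6.3. When I ran a MapReduce job I had written myself, Hadoop printed the following warning: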

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

 

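The explanation commonly found online is this:

The Hadoop native library is compiled in a 32-bit environment and has problems when run in a 64-bit environment, so you need to download the Hadoop source code and recompile it in a 64-bit environment.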

 

I then found the corresponding explanation in the official documentation.

Link: http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/NativeLibraries.html

The pre-built 32-bit i386-Linux native hadoop library is available as part of the hadoop distribution and is located in the lib/native directory. You can download the hadoop distribution from Hadoop Common Releases. Be sure to install the zlib and/or gzip development packages - whichever compression codecs you want to use with your deployment.
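Before rebuilding anything, it may be worth confirming that the installed library really is a 32-bit build. The sketch below reads the ELF header of a shared object and reports its word size and target machine. The function name describe_elf is hypothetical, and only the two machine codes relevant here (x86 and x86-64) are mapped.

```python
import struct

ELF_MAGIC = b"\x7fELF"
# e_ident[EI_CLASS]: 1 = ELFCLASS32, 2 = ELFCLASS64
EI_CLASS = {1: "32-bit", 2: "64-bit"}
# e_machine values for the two architectures discussed in this post
E_MACHINE = {3: "x86", 62: "x86-64"}


def describe_elf(path):
    """Return (word size, machine) read from the ELF header of `path`."""
    with open(path, "rb") as f:
        header = f.read(20)  # e_ident (16 bytes) + e_type + e_machine
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError(f"{path} is not an ELF file")
    word_size = EI_CLASS.get(header[4], "unknown")
    # e_ident[EI_DATA]: 1 = little endian, 2 = big endian
    byte_order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return word_size, E_MACHINE.get(machine, f"machine {machine}")
```

Calling describe_elf on the libhadoop.so inside the installation's native directory (the exact path depends on the package layout and is not given in this post) shows whether it matches the 64-bit JVM.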

 

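So, following the official documentation, I built a new Hadoop on the server and replaced the default library files with the ones from its native directory. After restarting all Hadoop services, my MapReduce job ran. Then the real trouble started: HDFS had entered Safemode and reported a number of corrupted blocks. It turned out that a colleague had been running MapReduce jobs during the restart, and I suspected that this had damaged the data. Since the data was backed up, I took HDFS out of Safemode manually and deleted the corrupted blocks with hadoop fsck. The corruption warnings went away, but to my surprise the DataNode had not come up!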

Hadoop datanode is dead and pid file exists

 

Next I found the leftover pid file, removed it with rm, and started the DataNode again. Still no luck! The DataNode log contained a FATAL entry:

2014-11-26 14:35:49,577 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory.createDescriptor0(Ljava/lang/String;Ljava/lang/String;I)Ljava/io/FileDescriptor;
        at org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory.createDescriptor0(Native Method)
        at org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory.create(SharedFileDescriptorFactory.java:87)
        at org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.<init>(ShortCircuitRegistry.java:169)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:586)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:773)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:292)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1893)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1780)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1827)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2003)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2027)
2014-11-26 14:35:49,580 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-11-26 14:35:49,582 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at carweb94/10.14.1.94
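The UnsatisfiedLinkError message names the native method the JVM could not resolve. When a log is long, pulling those names out makes it easier to see which native functions are missing. The sketch below does that with a regular expression; the function name missing_native_methods is hypothetical. Reading the result as "the replaced libhadoop.so does not provide what the installed jars expect" is an interpretation, not something the log states directly.

```python
import re

# Captures the fully qualified method name that follows the exception class,
# stopping at the opening parenthesis of the JNI type signature.
_LINK_ERROR = re.compile(r"java\.lang\.UnsatisfiedLinkError:\s*([\w.$]+)\(")


def missing_native_methods(log_text):
    """Return the distinct native methods reported as unresolved, in order."""
    seen = []
    for name in _LINK_ERROR.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Run against the FATAL entry above, this returns the single name org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory.createDescriptor0.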

 

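A search showed that the native library files were the problem. With the culprit found, I put the default native library files back and the problem was solved. The following post was a great help while troubleshooting.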

http://stackoverflow.com/questions/26467568/hadoop-2-5-0-failed-to-start-datanode
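Restoring the default files was only possible because the originals were still available. A minimal sketch of taking a timestamped copy of the native directory before overwriting anything is shown below. The function name backup_native_dir and both arguments are hypothetical, and the real directory depends on how Hadoop was packaged.

```python
import shutil
import time
from pathlib import Path


def backup_native_dir(native_dir, backup_root):
    """Copy native_dir to backup_root/<name>-<timestamp> and return the new path."""
    src = Path(native_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # symlinks=True keeps links such as libhadoop.so -> libhadoop.so.1.0.0
    # as links instead of copying their targets.
    shutil.copytree(src, dest, symlinks=True)
    return dest
```

Rolling back then amounts to stopping the services and copying the saved directory back over the replaced one.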

 
