
Hadoop Installation

First, upload the Hadoop installation package to the server, under /home/hpc/.
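A minimal sketch of the upload-and-unpack step, assuming the hadoop-2.8.0 tarball referenced later in this guide and a hypothetical user/host; adjust names and paths to your environment:

         # from your workstation: copy the tarball to the server
         scp hadoop-2.8.0.tar.gz hpc@server:/home/hpc/
         # on the server: unpack it
         cd /home/hpc/
         tar -zxvf hadoop-2.8.0.tar.gz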

 

         Note: in Hadoop 2.x the configuration files live under $HADOOP_HOME/etc/hadoop

         Pseudo-distributed mode requires editing five configuration files.

 

      3.1 Configure Hadoop (configuration file directory: $HADOOP_HOME/etc/hadoop/)

 

         File 1: hadoop-env.sh

                   vi hadoop-env.sh

                   # line 25 (change to the JDK version actually installed on the system)
                   export JAVA_HOME=/usr/java/jdk1.8.0_91
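If you are unsure of the JDK path, a quick check (not part of the original steps; the output shown is illustrative):

                   readlink -f "$(which java)"     # e.g. /usr/java/jdk1.8.0_91/bin/java
                   # JAVA_HOME is that path minus the trailing /bin/java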

         File 2: core-site.xml

                   <!-- The file system schema (URI) used by Hadoop: the address of the HDFS master (NameNode).
                        In Hadoop 2.x the property is named fs.defaultFS; fs.default.name is the deprecated 1.x name. -->
                   <property>
                            <name>fs.defaultFS</name>
                            <value>hdfs://hostname:9000</value>
                            <description>change to your own hadoop hostname</description>
                   </property>

                   <property>
                            <name>hadoop.tmp.dir</name>
                            <value>/usr/local/hadoop/tmp</value>
                   </property>
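hadoop.tmp.dir must point at a directory that exists and is writable by the user running Hadoop; a one-line sketch using the path configured above:

                   mkdir -p /usr/local/hadoop/tmp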

         File 3: hdfs-site.xml

                   <property>
                            <name>dfs.namenode.name.dir</name>
                            <value>/usr/hadoop/hdfs/name</value>
                            <description>where the NameNode stores the HDFS namespace metadata (dfs.name.dir is the deprecated 1.x name)</description>
                   </property>

                   <property>
                            <name>dfs.datanode.data.dir</name>
                            <value>/usr/hadoop/hdfs/data</value>
                            <description>physical storage location of data blocks on the DataNode (dfs.data.dir is the deprecated 1.x name)</description>
                   </property>

                   <!-- Number of HDFS replicas -->
                   <property>
                            <name>dfs.replication</name>
                            <value>1</value>
                            <description>replica count; the default is 3, and it should not exceed the number of DataNodes</description>
                   </property>

                   <property>
                            <name>dfs.namenode.secondary.http-address</name>
                            <value>hostname:50090</value>
                            <description>SecondaryNameNode web address; 50090 is its default port (50070 belongs to the NameNode web UI)</description>
                   </property>
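The two storage directories above must exist before the NameNode is formatted; a minimal sketch using the configured paths:

                   mkdir -p /usr/hadoop/hdfs/name /usr/hadoop/hdfs/data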

 

   

         File 4: mapred-site.xml (created by renaming the template that ships with Hadoop)

                   mv mapred-site.xml.template mapred-site.xml
                   vim mapred-site.xml

                   <!-- Run MapReduce on YARN -->
                   <property>
                            <name>mapreduce.framework.name</name>
                            <value>yarn</value>
                   </property>

                   Note: dfs.replication and dfs.permissions are HDFS properties and belong in hdfs-site.xml,
                   not here. dfs.replication is already set above; add dfs.permissions=false to hdfs-site.xml
                   only if you need to disable HDFS permission checking.

         File 5: yarn-site.xml

                   <!-- Address of the YARN master (ResourceManager) -->
                   <property>
                            <name>yarn.resourcemanager.hostname</name>
                            <value>hostname</value>
                   </property>

                   <!-- Auxiliary service that NodeManagers run for the MapReduce shuffle -->
                   <property>
                            <name>yarn.nodemanager.aux-services</name>
                            <value>mapreduce_shuffle</value>
                   </property>

         3.2 Add Hadoop to the environment variables

         vim /etc/profile

                   export JAVA_HOME=/usr/java/jdk1.8.0_91
                   export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
                   export HADOOP_HOME=/home/qpx/hadoop-2.8.0     # adjust to where you unpacked Hadoop
                   export PATH=.:$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
                   export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
                   export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

  Reload the environment variables:

         source /etc/profile
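A quick sanity check that the new PATH took effect (not in the original steps):

                   hadoop version       # should report Hadoop 2.8.0
                   which hadoop         # should resolve under $HADOOP_HOME/bin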

        

         3.3 Format the NameNode (this initializes it; run from the bin folder)

                   ./hadoop namenode -format        # in Hadoop 2.x, ./hdfs namenode -format is the preferred form

      Then start the NameNode with ./hadoop-daemon.sh start namenode (in sbin), or start all of HDFS with ./start-dfs.sh, and check that it is listening on port 9000:

                   netstat -lnp|grep 9000

                  Upload a file from the current directory to HDFS (the commands below put it in the root directory /):

      bin/hdfs dfs -put NOTICE.txt /

                   or: hadoop fs -put **.txt /
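To confirm the upload landed (file name as in the example above):

                   bin/hdfs dfs -ls /        # NOTICE.txt should appear in the listing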

         3.4 Start Hadoop

                   Start HDFS first:

                   sbin/start-dfs.sh

                   then start YARN:

                   sbin/start-yarn.sh

         3.5 Verify that startup succeeded

                   Verify with the jps command:

                   27408 NameNode
                   28218 Jps
                   27643 SecondaryNameNode
                   28066 NodeManager
                   27803 ResourceManager
                   27512 DataNode

      Troubleshooting: the NameNode and DataNode processes above must both be running. If they do not all come up, delete the contents of the storage directories under /usr/hadoop/hdfs/ and format the NameNode again.

         Error: a DataNode has died (a command sketch follows the steps below)

                            1) Check logs/xx.datanode.log

                            2) Copy the NameNode's clusterID, then locate the DataNode's disk storage directory

                            3) In the DataNode directory, find the VERSION file and change its clusterID to the NameNode's clusterID

                            4) Restart the NameNode and DataNode
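A minimal command sketch of steps 1-4, assuming the storage paths configured in hdfs-site.xml above:

                   grep clusterID /usr/hadoop/hdfs/name/current/VERSION   # the NameNode's clusterID
                   grep clusterID /usr/hadoop/hdfs/data/current/VERSION   # the DataNode's clusterID
                   # edit the DataNode's VERSION file so the two clusterID values match, then restart:
                   sbin/hadoop-daemon.sh start namenode
                   sbin/hadoop-daemon.sh start datanode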

           Browser access:

                   http://192.168.1.101:50070 (HDFS web UI)
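The same endpoint can be checked without a browser (IP as in the example above):

                   curl -I http://192.168.1.101:50070       # an HTTP 200 response means the NameNode web UI is up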
