Deploying a distributed environment with Ubuntu 12.04 + Hadoop 2.2.0 + ZooKeeper 3.4.5 + HBase 0.96.2 + Hive 0.13.1
#sudo apt-get install oracle-java7-installer
#sudo vi /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/lib/jvm/java-7-oracle/bin"
CLASSPATH="/usr/lib/jvm/java-7-oracle/lib"
JAVA_HOME="/usr/lib/jvm/java-7-oracle"
JRE_HOME="/usr/lib/jvm/java-7-oracle/jre"
Tell the system to use the Sun (Oracle) JDK instead of OpenJDK:
#sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/java-7-oracle/bin/java 300
#sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java-7-oracle/bin/javac 300
#sudo update-alternatives --config java
Several alternatives are listed at this point; choose 2 as in the figure below, then run java -version to confirm that the new version is in use.
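Values written to /etc/environment are normally picked up only at the next login, so it can help to check the JDK directory directly before going further. The following is a minimal sketch; `check_java_home` is a hypothetical helper name, not part of the JDK or Ubuntu tooling.

```shell
#!/bin/sh
# Sketch: verify that a JDK directory contains the expected executables.
# check_java_home is a HYPOTHETICAL helper, not part of any JDK tooling.
# usage: check_java_home [dir]   (defaults to $JAVA_HOME)
check_java_home() {
    home=${1:-${JAVA_HOME:-}}
    if [ -n "$home" ] && [ -x "$home/bin/java" ] && [ -x "$home/bin/javac" ]; then
        echo "ok: $home"
    else
        echo "java or javac not found under: ${home:-<unset>}" >&2
        return 1
    fi
}
```

On the machines described here it would be called as `check_java_home /usr/lib/jvm/java-7-oracle`.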
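1) In the Parallels hardware network settings, choose the option shown in the figure below; after that, ping www.163.com succeeds.
2) In Parallels, click the top-left menu => File => Clone and clone three virtual machines, naming them m2, s1 and s2 (stop the virtual machine before cloning).
Run sudo vi /etc/hostname on each machine to set its host name; a reboot is needed for the change to take effect.
Run ifconfig on m1, m2, s1 and s2 to see the IP address assigned to each, then run sudo vi /etc/hosts (my machines were edited as in the figure below) and run "sudo /etc/init.d/networking restart" to apply the change: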
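The hosts entries can be sketched as follows. Only the address of m1 (192.168.1.50) appears in the log output later in this post; the addresses for m2, s1 and s2 are placeholders and should be replaced with whatever ifconfig reports. `render_hosts` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print /etc/hosts entries for the four nodes.
# 192.168.1.50 for m1 comes from the logs in this post;
# the addresses for m2, s1 and s2 are HYPOTHETICAL placeholders.
render_hosts() {
    cat <<'EOF'
192.168.1.50 m1
192.168.1.51 m2
192.168.1.52 s1
192.168.1.53 s2
EOF
}
```

After reviewing the output, the same four lines would go into /etc/hosts on every node, for example with `render_hosts | sudo tee -a /etc/hosts`.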
3) Configure passwordless SSH (sshd) login (I use the root account)
Install the SSH tools
#sudo apt-get install ssh openssh-server (skip this if ssh is already installed)
Run ssh-keygen on every machine and press Enter at each prompt; this generates id_rsa and id_rsa.pub in the user's .ssh directory.
Run on m1:
#scp -r root@m2:/root/.ssh/id_rsa.pub ~/.ssh/m2.pub
#scp -r root@s1:/root/.ssh/id_rsa.pub ~/.ssh/s1.pub
#scp -r root@s2:/root/.ssh/id_rsa.pub ~/.ssh/s2.pub
#cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
#cat ~/.ssh/m2.pub >> ~/.ssh/authorized_keys
#cat ~/.ssh/s1.pub >> ~/.ssh/authorized_keys
#cat ~/.ssh/s2.pub >> ~/.ssh/authorized_keys
#scp -r ~/.ssh/authorized_keys root@m2:~/.ssh/
#scp -r ~/.ssh/authorized_keys root@s1:~/.ssh/
#scp -r ~/.ssh/authorized_keys root@s2:~/.ssh/
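The key exchange above repeats the same three operations per host, so it can also be written as a loop. This is a sketch only: `run`, `append_key` and the `RUN` switch are hypothetical names introduced here, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: the key exchange from m1, written as loops over the peer hosts.
# RUN is a HYPOTHETICAL switch: commands are only printed unless RUN=1.
PEERS="m2 s1 s2"

# Execute the command when RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Append one public key file to the local authorized_keys.
append_key() {
    cat "$1" >> "$HOME/.ssh/authorized_keys"
}

collect_and_distribute_keys() {
    # 1. fetch each peer's public key
    for h in $PEERS; do
        run scp "root@$h:/root/.ssh/id_rsa.pub" "$HOME/.ssh/$h.pub"
    done
    # 2. merge m1's own key and the peers' keys into authorized_keys
    run append_key "$HOME/.ssh/id_rsa.pub"
    for h in $PEERS; do
        run append_key "$HOME/.ssh/$h.pub"
    done
    # 3. push the merged file back to every peer
    for h in $PEERS; do
        run scp "$HOME/.ssh/authorized_keys" "root@$h:~/.ssh/"
    done
}
```

Calling `collect_and_distribute_keys` prints the ten steps for review; `RUN=1 collect_and_distribute_keys` would execute them on m1.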
1) Configure zoo.cfg (there is no zoo.cfg by default; copy zoo_sample.cfg and name the copy zoo.cfg)
root@m1:/home/hadoop/zookeeper-3.4.5/conf# vi zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/zookeeper-3.4.5/data
dataLogDir=/home/hadoop/zookeeper-3.4.5/logs
server.1=m1:2888:3888
server.2=m2:2888:3888
server.3=s1:2888:3888
server.4=s2:2888:3888
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
root@m1:/home/hadoop/zookeeper-3.4.5/conf# scp -r /home/hadoop/zookeeper-3.4.5 root@m2:/home/hadoop
root@m1:/home/hadoop/zookeeper-3.4.5/conf# scp -r /home/hadoop/zookeeper-3.4.5 root@s1:/home/hadoop
root@m1:/home/hadoop/zookeeper-3.4.5/conf# scp -r /home/hadoop/zookeeper-3.4.5 root@s2:/home/hadoop
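A replicated ZooKeeper server also expects a file named myid in its dataDir, containing the number from that host's server.N line; this step is not shown in the commands above. A minimal sketch that derives the id from zoo.cfg follows; `write_myid` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: derive a host's ZooKeeper id from the server.N lines of zoo.cfg
# and write it to <dataDir>/myid. write_myid is a HYPOTHETICAL helper.
# usage: write_myid <path-to-zoo.cfg> <hostname>
write_myid() {
    cfg=$1
    host=$2
    # value of the dataDir= line (dataLogDir= does not match this pattern)
    data_dir=$(sed -n 's/^dataDir=//p' "$cfg")
    # server.N=<host>:2888:3888  ->  N
    id=$(sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$cfg")
    if [ -z "$data_dir" ] || [ -z "$id" ]; then
        echo "no dataDir or no server entry for $host in $cfg" >&2
        return 1
    fi
    mkdir -p "$data_dir" && echo "$id" > "$data_dir/myid"
}
```

On each node this would be run as `write_myid /home/hadoop/zookeeper-3.4.5/conf/zoo.cfg "$(hostname)"`, which with the configuration above writes 1 on m1, 2 on m2, 3 on s1 and 4 on s2.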
Edit the following 7 configuration files:
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
#export JAVA_HOME=${JAVA_HOME}
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi yarn-env.sh
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# User for YARN daemons
export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn}
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>m1,m2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.m1</name>
    <value>m1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.m2</name>
    <value>m2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.m1</name>
    <value>m1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.m2</name>
    <value>m2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://m1:8485;m2:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.mycluster</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/hadoop-2.2.0/tmp/journal</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Execution framework set to Hadoop YARN.</description>
  </property>
</configuration>
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>m1:2181,m2:2181,s1:2181,s2:2181</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-2.2.0/tmp</value>
    <description></description>
  </property>
</configuration>
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <!-- Site specific YARN configuration properties-->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>m1</value>
  </property>
</configuration>
root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi slaves
s1
s2
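With this many hand-edited XML files, a quick way to read a value back helps catch typos. Once Hadoop is on the PATH, `hdfs getconf -confKey dfs.nameservices` does this properly; the sketch below is a rough stand-in that works on the files alone. `get_prop` is a hypothetical helper name, and it assumes the simple `<name>`/`<value>` layout used in the files above rather than parsing XML in general.

```shell
#!/bin/sh
# Sketch: read one property value from a Hadoop *-site.xml file.
# get_prop is a HYPOTHETICAL helper; it assumes <name>X</name> is followed
# directly by <value>Y</value>, as in the files above, and is not a real
# XML parser.
# usage: get_prop <file> <property-name>
get_prop() {
    # join the file into one line so name and value may sit on separate lines
    tr -d '\n' < "$1" |
        sed -n "s|.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}
```

For example, `get_prop core-site.xml fs.defaultFS` run in /home/hadoop/hadoop-2.2.0/etc/hadoop should print the nameservice URI configured above; an unknown key prints nothing.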
root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
root@m1:/home/hadoop#
root@s1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
root@s1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader
root@s1:/home/hadoop#
root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkCli.sh
Connecting to localhost:2181
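To see at a glance which node became leader, the "Mode:" line can be pulled out of the status output. This is a small sketch; `zk_mode` is a hypothetical helper name, and the ssh loop in the comment assumes the passwordless root login configured earlier.

```shell
#!/bin/sh
# Sketch: extract the role from `zkServer.sh status` output read on stdin.
# zk_mode is a HYPOTHETICAL helper name.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Possible use across the ensemble (not executed here):
#   for h in m1 m2 s1 s2; do
#       printf '%s: ' "$h"
#       ssh "root@$h" /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh status 2>&1 | zk_mode
#   done
```

A healthy ensemble should report one leader and the remaining nodes as follower.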
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs zkfc -formatZK 14/07/27 00:31:59 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at m1/192.168.1.50:9000 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:host.name=m1 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_65 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/home/hadoop/hadoop-2.2.0/etc/hadoop:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/guava-11.0.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-codec-1.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-net-3.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-math-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-lang-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/servlet-api-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share
/hadoop/common/lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/hadoop-auth-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-digester-1.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-httpclient-3.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jsch-0.1.42.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/junit-4.8.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jetty-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jsr305-1.3.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jersey-json-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/stax-api-1.0.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jets3t-0.6.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/avro-1.7.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-el-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/c
ommons-configuration-1.6.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/activation-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/zookeeper-3.4.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/xz-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-nfs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0-tests.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-lang-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/
jsr305-1.3.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-el-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-nfs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0-tests.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/hamcrest-core-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/guice-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/junit-4.10.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/javax.inject-1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/avro-
1.7.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/xz-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-client-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-api-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-site-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/junit-4.10.j
ar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:/home/hadoop/hadoop-2.2.0/contrib/capacity-scheduler/*.jar 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop-2.2.0/lib/native 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64 14/07/27 00:32:00 INFO 
zookeeper.ZooKeeper: Client environment:os.version=3.11.0-15-generic 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:user.name=root 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:user.home=/root 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop 14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=m1:2181,m2:2181,s1:2181,s2:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@5990054a 14/07/27 00:32:00 INFO zookeeper.ClientCnxn: Opening socket connection to server m1/192.168.1.50:2181. Will not attempt to authenticate using SASL (unknown error) 14/07/27 00:32:00 INFO zookeeper.ClientCnxn: Socket connection established to m1/192.168.1.50:2181, initiating session 14/07/27 00:32:00 INFO zookeeper.ClientCnxn: Session establishment complete on server m1/192.168.1.50:2181, sessionid = 0x147737cd5d30001, negotiated timeout = 5000 =============================================== The configured parent znode /hadoop-ha/mycluster already exists. Are you sure you want to clear all failover information from ZooKeeper? WARNING: Before proceeding, ensure that all HDFS services and failover controllers are stopped! =============================================== Proceed formatting /hadoop-ha/mycluster? (Y or N) 14/07/27 00:32:00 INFO ha.ActiveStandbyElector: Session connected. y 14/07/27 00:32:13 INFO ha.ActiveStandbyElector: Recursively deleting /hadoop-ha/mycluster from ZK... 14/07/27 00:32:13 INFO ha.ActiveStandbyElector: Successfully deleted /hadoop-ha/mycluster from ZK. 14/07/27 00:32:13 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK. 14/07/27 00:32:13 INFO zookeeper.ClientCnxn: EventThread shut down 14/07/27 00:32:13 INFO zookeeper.ZooKeeper: Session: 0x147737cd5d30001 closed root@m1:/home/hadoop# |
root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[hadoop-ha, zookeeper]
[zk: localhost:2181(CONNECTED) 1]
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-journalnode-m1.out
root@m1:/home/hadoop# jps
2884 JournalNode
2553 QuorumPeerMain
2922 Jps
root@m1:/home/hadoop#
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs namenode -format
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs namenode -format -clusterId m1
3) Start the NameNode that was just formatted on m1. After running the command, open http://m1:50070/dfshealth.jsp to see the status of m1.
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start namenode
root@m2:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs namenode -bootstrapStandby
root@m2:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start namenode
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemons.sh start datanode
s2: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-datanode-s2.out
s1: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-datanode-s1.out
root@m1:/home/hadoop#
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-root-resourcemanager-m1.out
s1: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-root-nodemanager-s1.out
s2: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-root-nodemanager-s2.out
root@m1:/home/hadoop#
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-zkfc-m1.out
root@m1:/home/hadoop#
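Once the one-time format steps are done, later restarts follow the same order as the commands above. The sketch below only prints that order as a plan. It rests on assumptions: the transcripts above show the journalnode and zkfc being started on m1 only, and running them on m2 as well is inferred from hdfs-site.xml (two journal hosts, two NameNodes); `start_plan` is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the routine start order for the cluster (formatting excluded).
# ASSUMPTION: journalnode and zkfc also run on m2, inferred from hdfs-site.xml;
# the transcripts in this post only show them on m1.
HADOOP_HOME=/home/hadoop/hadoop-2.2.0
ZK_HOME=/home/hadoop/zookeeper-3.4.5

start_plan() {
    # 1. ZooKeeper on all four nodes
    for h in m1 m2 s1 s2; do
        echo "ssh root@$h $ZK_HOME/bin/zkServer.sh start"
    done
    # 2. JournalNodes, then both NameNodes
    for h in m1 m2; do
        echo "ssh root@$h $HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode"
    done
    for h in m1 m2; do
        echo "ssh root@$h $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode"
    done
    # 3. DataNodes and YARN, driven from m1 via the slaves file
    echo "ssh root@m1 $HADOOP_HOME/sbin/hadoop-daemons.sh start datanode"
    echo "ssh root@m1 $HADOOP_HOME/sbin/start-yarn.sh"
    # 4. Failover controllers last
    for h in m1 m2; do
        echo "ssh root@$h $HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc"
    done
}
```

Printing the plan with `start_plan` and running the lines by hand keeps each step observable with jps, which matches how the post proceeds.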
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /
Found 2 items
drwx------ - root supergroup 0 2014-07-17 23:54 /tmp
drwxr-xr-x - lion supergroup 0 2014-07-21 00:40 /user
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -mkdir /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2014-07-27 01:20 /input
drwx------ - root supergroup 0 2014-07-17 23:54 /tmp
drwxr-xr-x - lion supergroup 0 2014-07-21 00:40 /user
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -put hadoop.cmd /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /input
Found 1 items
-rw-r--r-- 3 root supergroup 7530 2014-07-27 01:20 /input/hadoop.cmd
root@m1:/home/hadoop/hadoop-2.2.0/bin#
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hadoop jar /home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input /output
14/07/27 01:22:41 INFO client.RMProxy: Connecting to ResourceManager at m1/192.168.1.50:8032
14/07/27 01:22:43 INFO input.FileInputFormat: Total input paths to process : 1
14/07/27 01:22:44 INFO mapreduce.JobSubmitter: number of splits:1
14/07/27 01:22:44 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/07/27 01:22:44 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/07/27 01:22:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406394452186_0001
14/07/27 01:22:46 INFO impl.YarnClientImpl: Submitted application application_1406394452186_0001 to ResourceManager at m1/192.168.1.50:8032
14/07/27 01:22:46 INFO mapreduce.Job: The url to track the job: http://m1:8088/proxy/application_1406394452186_0001/
14/07/27 01:22:46 INFO mapreduce.Job: Running job: job_1406394452186_0001
14/07/27 01:23:10 INFO mapreduce.Job: Job job_1406394452186_0001 running in uber mode : false
14/07/27 01:23:10 INFO mapreduce.Job: map 0% reduce 0%
14/07/27 01:23:31 INFO mapreduce.Job: map 100% reduce 0%
14/07/27 01:23:48 INFO mapreduce.Job: map 100% reduce 100%
14/07/27 01:23:48 INFO mapreduce.Job: Job job_1406394452186_0001 completed successfully
14/07/27 01:23:49 INFO mapreduce.Job: Counters: 43
    File System Counters
        FILE: Number of bytes read=6574
        FILE: Number of bytes written=175057
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=7628
        HDFS: Number of bytes written=5088
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=18062
        Total time spent by all reduces in occupied slots (ms)=14807
    Map-Reduce Framework
        Map input records=240
        Map output records=827
        Map output bytes=9965
        Map output materialized bytes=6574
        Input split bytes=98
        Combine input records=827
        Combine output records=373
        Reduce input groups=373
        Reduce shuffle bytes=6574
        Reduce input records=373
        Reduce output records=373
        Spilled Records=746
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=335
        CPU time spent (ms)=2960
        Physical memory (bytes) snapshot=270057472
        Virtual memory (bytes) snapshot=1990762496
        Total committed heap usage (bytes)=136450048
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=7530
    File Output Format Counters
        Bytes Written=5088
root@m1:/home/hadoop/hadoop-2.2.0/bin#
root@m1:/home/hadoop/hadoop-2.2.0/bin# jps
5492 Jps
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
3898 NameNode
4075 ResourceManager
root@m1:/home/hadoop/hadoop-2.2.0/bin# kill -9 3898
root@m1:/home/hadoop/hadoop-2.2.0/bin# jps
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
4075 ResourceManager
5627 Jps
root@m1:/home/hadoop/hadoop-2.2.0/bin#
At this point HDFS and MapReduce still work normally through m2. Although the NameNode process on m1 has been killed, usage is not affected; this is the benefit of failover!
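The failover can also be confirmed from the command line with `hdfs haadmin -getServiceState`, which takes a NameNode id (m1 or m2 in the hdfs-site.xml above). The wrapper below is a sketch; `active_namenode` is a hypothetical helper name, and the `HDFS` variable exists only so the path to the hdfs command can be overridden.

```shell
#!/bin/sh
# Sketch: report which configured NameNode id is currently active.
# active_namenode is a HYPOTHETICAL helper; m1 and m2 are the ids from
# dfs.ha.namenodes.mycluster. HDFS may be overridden to point elsewhere.
HDFS=${HDFS:-/home/hadoop/hadoop-2.2.0/bin/hdfs}

active_namenode() {
    for nn in m1 m2; do
        # a stopped NameNode makes the command fail; treat that as "not active"
        state=$("$HDFS" haadmin -getServiceState "$nn" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$nn"
            return 0
        fi
    done
    return 1
}
```

After the kill -9 above, `active_namenode` would be expected to print m2 once the failover controller has promoted it.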
7. HBase-0.96.2-hadoop2 (configured to run two HMasters: m1 is the primary HMaster, m2 is the backup HMaster)
1) Edit hbase-env.sh; the main changes are the JAVA_HOME directory and HBASE_MANAGES_ZK
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# vi hbase-env.sh # #/** # * Copyright 2007 The Apache Software Foundation # * # * Licensed to the Apache Software Foundation (ASF) under one # * or more contributor license agreements. See the NOTICE file # * distributed with this work for additional information # * regarding copyright ownership. The ASF licenses this file # * to you under the Apache License, Version 2.0 (the # * "License"); you may not use this file except in compliance # * with the License. You may obtain a copy of the License at # * # * http://www.apache.org/licenses/LICENSE-2.0 # * # * Unless required by applicable law or agreed to in writing, software # * distributed under the License is distributed on an "AS IS" BASIS, # * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # * See the License for the specific language governing permissions and # * limitations under the License. # */ # Set environment variables here. # This script sets variables multiple times over the course of starting an hbase process, # so try to keep things idempotent unless you want to take an even deeper look # into the startup scripts (bin/hbase, etc.) # The java implementation to use. Java 1.6 required. export JAVA_HOME=/usr/lib/jvm/java-7-oracle # Extra Java CLASSPATH elements. Optional. # export HBASE_CLASSPATH= # The maximum amount of heap to use, in MB. Default is 1000. # export HBASE_HEAPSIZE=1000 # Extra Java runtime options. # Below are what we set by default. May only work with SUN JVM. # For more on why as well as other possible settings, # see http://wiki.apache.org/hadoop/PerformanceTuning export HBASE_OPTS="-XX:+UseConcMarkSweepGC" # Uncomment one of the below three options to enable java garbage collection logging for the server-side processes. # This enables basic gc logging to the .out file. # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps" # This enables basic gc logging to its own file. 
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>" # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M" # Uncomment one of the below three options to enable java garbage collection logging for the client processes. # This enables basic gc logging to the .out file. # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps" # This enables basic gc logging to its own file. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>" # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M" # Uncomment below if you intend to use the EXPERIMENTAL off heap cache. # export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=" # Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value. # Uncomment and adjust to enable JMX exporting # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access. 
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html # # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false" # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101" # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102" # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103" # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104" # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105" # File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default. # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers # Uncomment and adjust to keep all the Region Server pages mapped to be memory resident #HBASE_REGIONSERVER_MLOCK=true #HBASE_REGIONSERVER_UID="hbase" # File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default. # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters # Extra ssh options. Empty by default. # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR" # Where log files are stored. $HBASE_HOME/logs by default. # export HBASE_LOG_DIR=${HBASE_HOME}/logs # Enable remote JDWP debugging of major HBase processes. 
Meant for Core Developers # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070" # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071" # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072" # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073" # A string representing this instance of hbase. $USER by default. # export HBASE_IDENT_STRING=$USER # The scheduling priority for daemon processes. See 'man nice'. # export HBASE_NICENESS=10 # The directory where pid files are stored. /tmp by default. # export HBASE_PID_DIR=/var/hadoop/pids # Seconds to sleep between slave commands. Unset by default. This # can be useful in large clusters, where, e.g., slave rsyncs can # otherwise arrive faster than the master can service them. # export HBASE_SLAVE_SLEEP=0.1 # Tell HBase whether it should manage its own instance of Zookeeper or not. export HBASE_MANAGES_ZK=false # false means an independently started ZooKeeper ensemble is used; true means HBase runs its bundled ZooKeeper. # The default log rolling policy is RFA, where the log file is rolled as per the size defined for the # RFA appender. Please refer to the log4j.properties file to see more details on this appender. # In case one needs to do log rolling on a date change, one should set the environment property # HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA". # For example: # HBASE_ROOT_LOGGER=INFO,DRFA # The reason for changing default to RFA is to avoid the boundary case of filling out disk space as # DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context. |
2) Edit hbase-site.xml.
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# vi hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <!-- The directory shared by the region servers, where HBase persists its data. The URL must be fully qualified and include the file system scheme. -->
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/hbase</value><!-- Must match the file system configured in Hadoop's core-site.xml -->
  </property>
  <property>
    <!-- HBase run mode: false is standalone, true is distributed. If false, HBase and ZooKeeper run in the same JVM. Default: false -->
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/hadoop/hbase-0.96.2-hadoop2/tmp</value>
  </property>
  <property>
    <!-- This is intentional: only the port is configured, so that more than one HMaster can be set up -->
    <name>hbase.master</name>
    <value>60000</value>
  </property>
  <property>
    <!-- The ZooKeeper ensemble -->
    <name>hbase.zookeeper.quorum</name>
    <value>m1,m2,s1,s2</value>
  </property>
  <property>
    <!-- The ZooKeeper client port. If hbase.zookeeper.property.clientPort is not set, a default port is used, which may not be one of the ports your ZooKeeper actually serves (for example 3351 to 3353). -->
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/zookeeper-3.4.5/data</value>
  </property>
</configuration>
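The comment on hbase.rootdir says its value has to agree with the file system configured in Hadoop's core-site.xml. A small sketch of that consistency check as a shell function follows. `rootdir_matches_fs` is a hypothetical helper, and `hdfs://mycluster` as the default file system URI is an assumption inferred from the hbase.rootdir value above, since core-site.xml is not shown in this part of the article.

```shell
# rootdir_matches_fs ROOTDIR DEFAULT_FS
# Succeeds when ROOTDIR is a path under DEFAULT_FS, for example
# hdfs://mycluster/hbase under hdfs://mycluster.
# Hypothetical helper; it only compares strings and does not contact HDFS.
rootdir_matches_fs() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

if rootdir_matches_fs "hdfs://mycluster/hbase" "hdfs://mycluster"; then
    echo "hbase.rootdir is consistent with the default file system"
else
    echo "hbase.rootdir does not match the default file system" >&2
fi
```

A mismatch such as `hdfs://m1:9000/hbase` against an HA nameservice would be caught before HBase is started.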
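2) Edit the regionservers file.
Machines that run a master usually do not also run a slave, so the other two machines in the cluster (s1 and s2) serve as the HBase region servers.
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# vi regionservers
s1
s2
3) Create a symbolic link to Hadoop's hdfs-site.xml in the HBase configuration directory.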
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# ll
total 40
drwxr-xr-x 2 root root 4096 Jul 27 09:15 ./
drwxr-xr-x 9 root root 4096 Jul 20 21:40 ../
-rw-r--r-- 1 root staff 1026 Mar 25 06:29 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 root staff 4023 Mar 25 06:29 hbase-env.cmd
-rw-r--r-- 1 root staff 7129 Jul 27 08:58 hbase-env.sh
-rw-r--r-- 1 root staff 2257 Mar 25 06:29 hbase-policy.xml
-rw-r--r-- 1 root staff 2550 Jul 27 09:10 hbase-site.xml
-rw-r--r-- 1 root staff 3451 Mar 25 06:29 log4j.properties
-rw-r--r-- 1 root staff 6 Jul 20 21:38 regionservers
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# ln -s /home/hadoop/hadoop-2.2.0/etc/hadoop/hdfs-site.xml hdfs-site.xml
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# ll
total 40
drwxr-xr-x 2 root root 4096 Jul 27 09:16 ./
drwxr-xr-x 9 root root 4096 Jul 20 21:40 ../
-rw-r--r-- 1 root staff 1026 Mar 25 06:29 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 root staff 4023 Mar 25 06:29 hbase-env.cmd
-rw-r--r-- 1 root staff 7129 Jul 27 08:58 hbase-env.sh
-rw-r--r-- 1 root staff 2257 Mar 25 06:29 hbase-policy.xml
-rw-r--r-- 1 root staff 2550 Jul 27 09:10 hbase-site.xml
lrwxrwxrwx 1 root root 50 Jul 27 09:16 hdfs-site.xml -> /home/hadoop/hadoop-2.2.0/etc/hadoop/hdfs-site.xml*
-rw-r--r-- 1 root staff 3451 Mar 25 06:29 log4j.properties
-rw-r--r-- 1 root staff 6 Jul 20 21:38 regionservers
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf#
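The symlink step can be wrapped so that it is safe to repeat and fails loudly when the source file is missing. This is only a sketch: `link_hdfs_site` is a hypothetical helper name, and the two directory arguments are whatever your Hadoop and HBase conf directories are.

```shell
# link_hdfs_site HADOOP_CONF_DIR HBASE_CONF_DIR
# Create (or refresh) the hdfs-site.xml symlink inside the HBase conf directory.
# Hypothetical helper; returns non-zero if the Hadoop file does not exist.
link_hdfs_site() {
    if [ ! -f "$1/hdfs-site.xml" ]; then
        echo "missing $1/hdfs-site.xml" >&2
        return 1
    fi
    # -f replaces an existing link, so the function can be run repeatedly.
    ln -sf "$1/hdfs-site.xml" "$2/hdfs-site.xml"
}

# Example call with the paths used in this article (commented out so the
# snippet can be sourced on a machine without the installation):
# link_hdfs_site /home/hadoop/hadoop-2.2.0/etc/hadoop /home/hadoop/hbase-0.96.2-hadoop2/conf
```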
3) With hbase-0.96.2-hadoop2 no Hadoop jars need to be copied over; the official release already bundles them.
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# ls | grep hadoop
hadoop-annotations-2.2.0.jar
hadoop-auth-2.2.0.jar
hadoop-client-2.2.0.jar
hadoop-common-2.2.0.jar
hadoop-hdfs-2.2.0.jar
hadoop-hdfs-2.2.0-tests.jar
hadoop-mapreduce-client-app-2.2.0.jar
hadoop-mapreduce-client-common-2.2.0.jar
hadoop-mapreduce-client-core-2.2.0.jar
hadoop-mapreduce-client-jobclient-2.2.0.jar
hadoop-mapreduce-client-jobclient-2.2.0-tests.jar
hadoop-mapreduce-client-shuffle-2.2.0.jar
hadoop-yarn-api-2.2.0.jar
hadoop-yarn-client-2.2.0.jar
hadoop-yarn-common-2.2.0.jar
hadoop-yarn-server-common-2.2.0.jar
hadoop-yarn-server-nodemanager-2.2.0.jar
hbase-client-0.96.2-hadoop2.jar
hbase-common-0.96.2-hadoop2.jar
hbase-common-0.96.2-hadoop2-tests.jar
hbase-examples-0.96.2-hadoop2.jar
hbase-hadoop2-compat-0.96.2-hadoop2.jar
hbase-hadoop-compat-0.96.2-hadoop2.jar
hbase-it-0.96.2-hadoop2.jar
hbase-it-0.96.2-hadoop2-tests.jar
hbase-prefix-tree-0.96.2-hadoop2.jar
hbase-protocol-0.96.2-hadoop2.jar
hbase-server-0.96.2-hadoop2.jar
hbase-server-0.96.2-hadoop2-tests.jar
hbase-shell-0.96.2-hadoop2.jar
hbase-testing-util-0.96.2-hadoop2.jar
hbase-thrift-0.96.2-hadoop2.jar
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib#
4) Copy the HBase 0.96.2 directory from m1 to the same location on m2, s1, and s2.
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# scp -r /home/hadoop/hbase-0.96.2-hadoop2 root@m2:/home/hadoop
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# scp -r /home/hadoop/hbase-0.96.2-hadoop2 root@s1:/home/hadoop
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# scp -r /home/hadoop/hbase-0.96.2-hadoop2 root@s2:/home/hadoop
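The three scp commands differ only in the host name, so they can be generated from a host list. The sketch below prints the commands instead of running them, which makes it easy to review them first. `print_copy_cmds` is a hypothetical helper, and it assumes the root account and passwordless SSH set up earlier in this article.

```shell
# print_copy_cmds SRC_DIR DEST_PARENT HOST...
# Print one "scp -r" command per host. Nothing is copied by this function;
# pipe its output to sh to actually run the commands.
# Hypothetical helper, assuming root logins as used in this article.
print_copy_cmds() {
    src=$1
    dest=$2
    shift 2
    for host in "$@"; do
        echo "scp -r $src root@$host:$dest"
    done
}

print_copy_cmds /home/hadoop/hbase-0.96.2-hadoop2 /home/hadoop m2 s1 s2
```

Once the printed commands look right, `print_copy_cmds ... | sh` would execute them in order.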
5) Start HBase 0.96.2 on m1. After running the command, open http://m1:60010/master-status in a browser to check the result.
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/start-hbase.sh
starting master, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-master-m1.out
s1: starting regionserver, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-regionserver-s1.out
s2: starting regionserver, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-regionserver-s2.out
root@m1:/home/hadoop# jps
6688 NameNode
7540 HMaster
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
7769 Jps
4075 ResourceManager
root@m1:/home/hadoop#
6) On m1, test the connection to HBase with the HBase shell.
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
2014-07-27 09:31:07,601 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 2.8030 seconds
=> []
hbase(main):002:0> version
0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):003:0> status
2 servers, 0 dead, 1.0000 average load
hbase(main):004:0> create 'test_idoall_org','uid','name'
0 row(s) in 0.5800 seconds
=> Hbase::Table - test_idoall_org
hbase(main):005:0> list
TABLE
test_idoall_org
1 row(s) in 0.0320 seconds
=> ["test_idoall_org"]
hbase(main):006:0> put 'test_idoall_org','10086','name:idoall','idoallvalue'
0 row(s) in 0.1090 seconds
hbase(main):009:0> get 'test_idoall_org','10086'
COLUMN CELL
name:idoall timestamp=1406424831473, value=idoallvalue
1 row(s) in 0.0450 seconds
hbase(main):010:0> scan 'test_idoall_org'
ROW COLUMN+CELL
10086 column=name:idoall, timestamp=1406424831473, value=idoallvalue
1 row(s) in 0.0620 seconds
hbase(main):011:0>
7) Start an HMaster on m2 as well. After running the command, the HBase status on m2 can be viewed in a browser at http://m2:60010/master-status
root@m2:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase-daemon.sh start master
starting master, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-master-m2.out
root@m2:/home/hadoop#
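The hbase-env.sh comments shown earlier mention a file that names the hosts on which backup HMasters run, $HBASE_HOME/conf/backup-masters by default. As an alternative to starting the second master by hand, that file could list m2 so that start-hbase.sh brings it up as well. This is a sketch and not part of the steps tested above; `write_backup_masters` is a hypothetical helper name.

```shell
# write_backup_masters CONF_DIR HOST...
# Write one backup-master host name per line to CONF_DIR/backup-masters,
# the file named in the hbase-env.sh comments.
# Hypothetical helper; it overwrites any existing backup-masters file.
write_backup_masters() {
    conf_dir=$1
    shift
    printf '%s\n' "$@" > "$conf_dir/backup-masters"
}

# Example call for this cluster (commented out so the snippet can be
# sourced on a machine without the installation):
# write_backup_masters /home/hadoop/hbase-0.96.2-hadoop2/conf m2
```

If this route is used, the file would need to exist in the conf directory on m1 before start-hbase.sh is run there.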
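8) Test the active/backup switchover between m1 and m2.
a) Open http://m1:60010/master-status and http://m2:60010/master-status in a browser to see the status of both masters.
b) Stop the HMaster process on m1 and open the two URLs again. The page on m1 can no longer be opened, while the cluster status shown on m2 has changed.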
root@m1:/home/hadoop# jps
6688 NameNode
7540 HMaster
2884 JournalNode
8645 Jps
4375 DFSZKFailoverController
2553 QuorumPeerMain
4075 ResourceManager
root@m1:/home/hadoop# kill -9 7540
root@m1:/home/hadoop# jps
6688 NameNode
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
4075 ResourceManager
8655 HMaster
8719 Jps
root@m1:/home/hadoop#
HBase is now fully configured, and master failover works.
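8. Install MySQL 5.5.x on m1 (Ubuntu 12.04)
1) Run apt-get install mysql-server mysql-client mysql-common. During installation a dialog asks for the MySQL root password; I set it to 123456.
After installation, test the MySQL connection with: mysql -uroot -p123456
Use service mysql stop and service mysql start to stop and start MySQL.
2) Grant remote access to MySQL.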
root@m1:/home/hadoop# mysql -uroot -p123456
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 36
Server version: 5.5.22-0ubuntu1 (Ubuntu)
Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> grant all on *.* to 'root'@'%' identified by '123456' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> quit
Bye
root@m1:/home/hadoop#
3) If remote connections still fail, open /etc/mysql/my.cnf with vi, change bind-address=127.0.0.1 to this machine's IP address, and restart MySQL.
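The bind-address edit can also be scripted instead of done in vi. A minimal sketch follows; `set_bind_address` is a hypothetical helper, and 192.168.1.50 in the example is the address that the job log earlier in this article shows for m1, so substitute your own.

```shell
# set_bind_address FILE IP
# Rewrite every line of FILE that starts with "bind-address" so that it
# reads "bind-address = IP". Other lines are left untouched.
# Hypothetical helper; uses a temp file rather than sed -i for portability.
set_bind_address() {
    tmp=$(mktemp) || return 1
    if sed "s/^bind-address.*/bind-address = $2/" "$1" > "$tmp"; then
        cat "$tmp" > "$1"
    fi
    rm -f "$tmp"
}

# Example call for m1 (commented out; needs root and a real my.cnf):
# set_bind_address /etc/mysql/my.cnf 192.168.1.50
# service mysql stop
# service mysql start
```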
9. Install Hive 0.13.1 (all steps on m1)
1) Extract apache-hive-0.13.1-bin.tar.gz to /home/hadoop/hive-0.13.1, then create the configuration files from their templates:
root@m1:/home/hadoop/hive-0.13.1/conf# cp hive-env.sh.template hive-env.sh
root@m1:/home/hadoop/hive-0.13.1/conf# cp hive-default.xml.template hive-site.xml
root@m1:/home/hadoop/hive-0.13.1/conf# vi hive-env.sh
HADOOP_HOME=/home/hadoop/hadoop-2.2.0
root@m1:/home/hadoop/hive-0.13.1/conf# vi hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
  <property>
    <!-- Default storage path for Hive data files, normally a writable HDFS path -->
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://mycluster/user/hive/warehouse</value>
  </property>
  <property>
    <!-- HDFS path that stores the execution plans of the map/reduce stages and their intermediate output -->
    <name>hive.exec.scratchdir</name>
    <value>hdfs://mycluster/user/hive/scratchdir</value>
  </property>
  <property>
    <!-- Directory for Hive's live query logs. If empty, no live query log is created. -->
    <name>hive.querylog.location</name>
    <value>/home/hadoop/hive-0.13.1/logs</value>
  </property>
  <property>
    <!-- JDBC connection string. Default: jdbc:derby:;databaseName=metastore_db;create=true -->
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://m1:3306/hiveMeta?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <!-- JDBC driver class. Default: org.apache.derby.jdbc.EmbeddedDriver -->
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <property>
    <!-- Jars for user-defined UDFs or SerDes must be listed here. No default value. -->
    <!-- This is the critical part for the HBase integration, so get it exactly right: the value of hive.aux.jars.path must not contain spaces, carriage returns, or line breaks. Write it all on one line, otherwise all kinds of errors appear. -->
    <name>hive.aux.jars.path</name>
    <value>file:///home/hadoop/hive-0.13.1/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-hadoop2-compat-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hive-hbase-handler-0.13.1.jar,file:///home/hadoop/hive-0.13.1/lib/protobuf-java-2.5.0.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-client-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-common-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-protocol-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-server-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/zookeeper-3.4.5.jar,file:///home/hadoop/hive-0.13.1/lib/guava-11.0.2.jar,file:///home/hadoop/hive-0.13.1/lib/htrace-core-2.04.jar</value>
  </property>
  <property>
    <!-- ZooKeeper server list, empty by default. Without hive.zookeeper.quorum, Hive QL requests cannot run concurrently and data anomalies can result. -->
    <name>hive.zookeeper.quorum</name>
    <value>m1,m2,s1,s2</value>
    <description>The list of zookeeper servers to talk to. This is only needed for read/write locks.</description>
  </property>
</configuration>
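Because the hive.aux.jars.path value must be a single line with no whitespace, typing it by hand is error-prone. A sketch of generating it from the jar names follows. `aux_jars_path` is a hypothetical helper, and the jar list is the one used in the configuration above.

```shell
# aux_jars_path LIB_DIR JAR...
# Print the jars as one comma-separated line of file:// URIs, with no
# spaces or line breaks inside the value.
# Hypothetical helper; LIB_DIR must be an absolute path.
aux_jars_path() {
    lib_dir=$1
    shift
    out=
    for jar in "$@"; do
        # Prefix a comma only when out already holds an entry.
        out="${out:+$out,}file://$lib_dir/$jar"
    done
    printf '%s\n' "$out"
}

aux_jars_path /home/hadoop/hive-0.13.1/lib \
    hbase-hadoop-compat-0.96.2-hadoop2.jar \
    hbase-hadoop2-compat-0.96.2-hadoop2.jar \
    hive-hbase-handler-0.13.1.jar \
    protobuf-java-2.5.0.jar \
    hbase-client-0.96.2-hadoop2.jar \
    hbase-common-0.96.2-hadoop2.jar \
    hbase-protocol-0.96.2-hadoop2.jar \
    hbase-server-0.96.2-hadoop2.jar \
    zookeeper-3.4.5.jar \
    guava-11.0.2.jar \
    htrace-core-2.04.jar
```

The printed line can be pasted between the `<value>` tags of hive.aux.jars.path.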
5) Of the jars listed in hive.aux.jars.path in hive-site.xml, hive-hbase-handler-0.13.1.jar and guava-11.0.2.jar already ship with Hive. Copy the others over from the hadoop/zookeeper/hbase installations with the following commands:
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/protobuf-java-2.5.0.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-client-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-common-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-protocol-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-server-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-hadoop2-compat-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/htrace-core-2.04.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/zookeeper-3.4.5/dist-maven/zookeeper-3.4.5.jar /home/hadoop/hive-0.13.1/lib
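The copy commands above can be collapsed into a loop that also reports any jar that is missing from the source directory, which is a common cause of ClassNotFound errors later. This is a sketch under the directory layout used in this article; `copy_jars` is a hypothetical helper.

```shell
# copy_jars SRC_DIR DEST_DIR JAR...
# Copy each named jar from SRC_DIR into DEST_DIR. Jars that do not exist
# are reported on stderr and make the function return non-zero.
# Hypothetical helper, not part of Hive or HBase.
copy_jars() {
    src=$1
    dest=$2
    shift 2
    rc=0
    for jar in "$@"; do
        if [ -f "$src/$jar" ]; then
            cp "$src/$jar" "$dest/"
        else
            echo "not found: $src/$jar" >&2
            rc=1
        fi
    done
    return $rc
}

# Example calls for this cluster (commented out so the snippet can be
# sourced on a machine without the installation):
# copy_jars /home/hadoop/hbase-0.96.2-hadoop2/lib /home/hadoop/hive-0.13.1/lib \
#     protobuf-java-2.5.0.jar hbase-client-0.96.2-hadoop2.jar \
#     hbase-common-0.96.2-hadoop2.jar hbase-protocol-0.96.2-hadoop2.jar \
#     hbase-server-0.96.2-hadoop2.jar hbase-hadoop2-compat-0.96.2-hadoop2.jar \
#     hbase-hadoop-compat-0.96.2-hadoop2.jar htrace-core-2.04.jar
# copy_jars /home/hadoop/zookeeper-3.4.5/dist-maven /home/hadoop/hive-0.13.1/lib \
#     zookeeper-3.4.5.jar
```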
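6) The MySQL JDBC driver (Connector/J) can be downloaded from http://dev.mysql.com/downloads/connector/j/. After extracting it, copy mysql-connector-java-5.1.31-bin.jar from the extracted directory to /home/hadoop/hive-0.13.1/lib
7) Create the test data and the data warehouse directory.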
root@m1:/home/hadoop/hive-0.13.1/conf# vi /home/hadoop/hive-0.13.1/testdata001.dat
12306,mname,yname
10086,myidoall,youidoall

/home/hadoop/hadoop-2.2.0/bin/hadoop fs -mkdir -p /user/hive/warehouse
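The test file can also be written without an editor, which avoids stray spaces or blank lines that would show up as NULL columns in Hive. A minimal sketch; `write_testdata` is a hypothetical helper and the two rows are the ones shown above.

```shell
# write_testdata FILE
# Create the two-row, comma-separated sample file used in this article.
# Hypothetical helper; it overwrites FILE if it already exists.
write_testdata() {
    cat > "$1" <<'EOF'
12306,mname,yname
10086,myidoall,youidoall
EOF
}

# Example call (commented out; the directory must exist):
# write_testdata /home/hadoop/hive-0.13.1/testdata001.dat
```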
8) Test Hive from the command-line shell.
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/hadoop/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.464 seconds, Fetched: 1 row(s)
hive> create database testidoall;
OK
Time taken: 0.279 seconds
hive> show databases;
OK
default
testidoall
Time taken: 0.021 seconds, Fetched: 2 row(s)
hive> use testidoall;
OK
Time taken: 0.039 seconds
hive> create external table testtable(uid int,myname string,youname string) row format delimited fields terminated by ',' location '/user/hive/warehouse/testtable';
OK
Time taken: 0.205 seconds
hive> LOAD DATA LOCAL INPATH '/home/hadoop/hive-0.13.1/testdata001.dat' OVERWRITE INTO TABLE testtable;
Copying data from file:/home/hadoop/hive-0.13.1/testdata001.dat
Copying file: file:/home/hadoop/hive-0.13.1/testdata001.dat
Loading data to table testidoall.testtable
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://mycluster/user/hive/warehouse/testtable
Table testidoall.testtable stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
OK
Time taken: 0.77 seconds
hive> select * from testtable;
OK
12306 mname yname
10086 myidoall youidoall
Time taken: 0.279 seconds, Fetched: 2 row(s)
hive>
Hive is now installed.
10. Hive to HBase (importing table data from Hive into HBase)
1) Create a table that HBase can recognize.
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/hadoop/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
testidoall
Time taken: 0.45 seconds, Fetched: 2 row(s)
hive> use testidoall;
OK
Time taken: 0.021 seconds
hive> show tables;
OK
testtable
Time taken: 0.032 seconds, Fetched: 1 row(s)
hive> CREATE TABLE hive2hbase_idoall(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "hive2hbase_idoall");
OK
Time taken: 2.332 seconds
hive> show tables;
OK
hive2hbase_idoall
testtable
Time taken: 0.036 seconds, Fetched: 2 row(s)
hive>
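2) Create a native Hive table to hold the data before it is inserted into HBase; it acts as a staging table. Load the earlier test data into this staging table.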
hive> create table hive2hbase_idoall_middle(foo int,bar string)row format delimited fields terminated by ',';
OK
Time taken: 0.086 seconds
hive> show tables;
OK
hive2hbase_idoall
hive2hbase_idoall_middle
testtable
Time taken: 0.03 seconds, Fetched: 3 row(s)
hive> load data local inpath '/home/hadoop/hive-0.13.1/testdata001.dat' overwrite into table hive2hbase_idoall_middle;
Copying data from file:/home/hadoop/hive-0.13.1/testdata001.dat
Copying file: file:/home/hadoop/hive-0.13.1/testdata001.dat
Loading data to table testidoall.hive2hbase_idoall_middle
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://mycluster/user/hive/warehouse/testidoall.db/hive2hbase_idoall_middle
Table testidoall.hive2hbase_idoall_middle stats: [numFiles=1, numRows=0, totalSize=43, rawDataSize=0]
OK
Time taken: 0.683 seconds
hive>
3) Insert the rows of the staging table (hive2hbase_idoall_middle) into the table hive2hbase_idoall. They are written through to HBase automatically.
hive> insert overwrite table hive2hbase_idoall select * from hive2hbase_idoall_middle;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1406394452186_0002, Tracking URL = http://m1:8088/proxy/application_1406394452186_0002/
Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1406394452186_0002
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2014-07-27 11:44:11,491 Stage-0 map = 0%, reduce = 0%
2014-07-27 11:44:22,684 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 1.51 sec
MapReduce Total cumulative CPU time: 1 seconds 510 msec
Ended Job = job_1406394452186_0002
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 1.51 sec HDFS Read: 288 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 510 msec
OK
Time taken: 25.613 seconds
hive> select * from hive2hbase_idoall;
OK
10086 myidoall
12306 mname
Time taken: 0.179 seconds, Fetched: 2 row(s)
hive> select * from hive2hbase_idoall_middle;
OK
12306 mname
10086 myidoall
Time taken: 0.088 seconds, Fetched: 2 row(s)
hive>
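The three Hive steps above (create the HBase-backed table, create and load the staging table, copy the rows across) can also be collected in a script file and run non-interactively with `hive -f`. The sketch below only writes the script. `write_hql` and the file name hive2hbase.hql are hypothetical; the statements are the ones from the session above, written with plain ASCII quotes.

```shell
# write_hql FILE
# Write the Hive-to-HBase statements from this section into FILE so they
# can be run with: hive -f FILE
# Hypothetical helper; it overwrites FILE if it already exists.
write_hql() {
    cat > "$1" <<'EOF'
USE testidoall;
CREATE TABLE hive2hbase_idoall(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hive2hbase_idoall");
CREATE TABLE hive2hbase_idoall_middle(foo int, bar string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/home/hadoop/hive-0.13.1/testdata001.dat'
OVERWRITE INTO TABLE hive2hbase_idoall_middle;
INSERT OVERWRITE TABLE hive2hbase_idoall SELECT * FROM hive2hbase_idoall_middle;
EOF
}

# Example use on m1 (commented out; needs the running cluster):
# write_hql /tmp/hive2hbase.hql
# /home/hadoop/hive-0.13.1/bin/hive -f /tmp/hive2hbase.hql
```

The script assumes the tables do not exist yet; running it a second time would fail at the first CREATE TABLE.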
4) Connect to HBase with the HBase shell and check that the data coming from Hive is there.
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
2014-07-27 11:47:14,454 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
hive2hbase_idoall
test_idoall_org
2 row(s) in 2.9480 seconds
=> ["hive2hbase_idoall", "test_idoall_org"]
hbase(main):002:0> scan "hive2hbase_idoall"
ROW COLUMN+CELL
10086 column=cf1:val, timestamp=1406432660860, value=myidoall
12306 column=cf1:val, timestamp=1406432660860, value=mname
2 row(s) in 0.0540 seconds
hbase(main):003:0> get "hive2hbase_idoall",'12306'
COLUMN CELL
cf1:val timestamp=1406432660860, value=mname
1 row(s) in 0.0110 seconds
hbase(main):004:0>
The Hive to HBase test works as expected.
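To check the result in a script rather than by reading the scan output, the row count can be taken from the summary line that the HBase shell prints ("2 row(s) in 0.0540 seconds" above). A sketch; `row_count` is a hypothetical helper and it assumes the summary format shown in this article's 0.96.2 output.

```shell
# row_count: read HBase shell output on stdin and print N from each
# "N row(s) in X seconds" summary line.
# Hypothetical helper; it depends on the summary format shown above.
row_count() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\) row(s) in .*$/\1/p'
}

# Example use on m1 (commented out; needs the running cluster). The HBase
# shell reads commands from standard input when they are piped in:
# echo "scan 'hive2hbase_idoall'" | /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell | row_count
```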
11. HBase to Hive (importing table data from HBase into Hive)
1) Create the table hbase2hive_idoall in HBase.
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
2014-07-27 11:54:25,844 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> create 'hbase2hive_idoall','gid','info'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 3.4970 seconds
=> Hbase::Table - hbase2hive_idoall
hbase(main):002:0> put 'hbase2hive_idoall','3344520','info:time','20140704'
0 row(s) in 0.1020 seconds
hbase(main):003:0> put 'hbase2hive_idoall','3344520','info:address','HK'
0 row(s) in 0.0090 seconds
hbase(main):004:0> scan 'hbase2hive_idoall'
ROW COLUMN+CELL
3344520 column=info:address, timestamp=1406433302317, value=HK
3344520 column=info:time, timestamp=1406433297567, value=20140704
1 row(s) in 0.0330 seconds
hbase(main):005:0>
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/hadoop/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
testidoall
Time taken: 0.449 seconds, Fetched: 2 row(s)
hive> use testidoall;
OK
Time taken: 0.02 seconds
hive> show tables;
OK
hive2hbase_idoall
hive2hbase_idoall_middle
testtable
Time taken: 0.026 seconds, Fetched: 3 row(s)
hive> create external table hbase2hive_idoall (key string,gid map<string,string>)STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" ="info:") TBLPROPERTIES ("hbase.table.name" = "hbase2hive_idoall");
OK
Time taken: 1.696 seconds
hive> show tables;
OK
hbase2hive_idoall
hive2hbase_idoall
hive2hbase_idoall_middle
testtable
Time taken: 0.034 seconds, Fetched: 4 row(s)
hive> select * from hbase2hive_idoall;
OK
3344520 {"address":"HK","time":"20140704"}
Time taken: 0.701 seconds, Fetched: 1 row(s)
hive>
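This completes the distributed deployment named in the title, ubuntu12.04+hadoop2.2.0+zookeeper3.4.5+hbase0.96.2+hive0.13.1, with every component tested. I ran into a few pitfalls along the way, which are covered in the common problems below. I hope these test notes help more people.
1) Make Hadoop print debug information to the console. After running the following command, start the NameNode, DataNode, or YARN to see the effect.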
export HADOOP_ROOT_LOGGER=DEBUG,console |
/home/hadoop/hive-0.13.1/bin/hive --hiveconf hive.root.logger=DEBUG,console |
rm /var/lib/mysql/ -R
rm /etc/mysql/ -R
apt-get autoremove mysql* --purge
apt-get remove apparmor
apt-get install mysql-server mysql-client mysql-common
3. Error "dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem"
sudo rm /var/lib/dpkg/updates/*
sudo apt-get update
sudo apt-get upgrade