Three Ways to Run Hadoop

2024-07-25 09:16:34

Time synchronization: date
Firewall: iptables
Name resolution: hosts

Master: 192.168.2.149
Nodes:  192.168.2.150
        192.168.2.125
        192.168.2.126

Required tools: rsync and ssh
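The host names used throughout these notes follow one pattern: server<last octet>.example.com. A minimal sketch that prints the matching /etc/hosts lines for the four machines. The function name hosts_line is made up for illustration, and the sketch only prints; it does not modify /etc/hosts.

```shell
# hosts_line IP: print an /etc/hosts entry using the naming pattern from these notes.
# (hosts_line is a hypothetical helper, not part of Hadoop.)
hosts_line() {
    ip=$1
    last=${ip##*.}                      # last octet, e.g. 149
    printf '%s   server%s.example.com\n' "$ip" "$last"
}

# Print entries for the master and the three nodes.
for ip in 192.168.2.149 192.168.2.150 192.168.2.125 192.168.2.126; do
    hosts_line "$ip"
done
```

The output can be reviewed and then appended to /etc/hosts on every machine by hand.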

Website: http://hadoop.apache.org/

There are three modes: standalone (single node)
      pseudo-distributed (used for testing)
      fully distributed

～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～

1. Basic installation and standalone (single-node) mode
    (1) Download: lftp i
            get hadoop-1.2.1.tar.gz jdk-6u32-linux-x64.bin
    (2) Java: *sh jdk-6u32-linux-x64.bin
             Result: Java(TM) SE Development Kit 6 successfully installed.
                 Product Registration is FREE and includes many benefits:
                 * Notification of new versions, patches, and updates
                 * Special offers on Oracle products, services and training
                 * Access to early releases and documentation

                 Product and system data will be collected. If your configuration
                 supports a browser, the JDK Product Registration form will
                 be presented. If you do not register, none of this information
                 will be saved. You may also register your JDK later by
                 opening the register.html file (located in the JDK installation
                 directory) in a browser.

                 For more information on what data Registration collects and
                 how it is managed and used, see:
                 http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html

                 Press Enter to continue.....
                 Done.
             Check: ls
                 Result: hadoop-1.2.1.tar.gz jdk1.6.0_32 (new directory) jdk-6u32-linux-x64.bin
       Unpack: tar zfx hadoop-1.2.1.tar.gz
       Move: mv jdk1.6.0_32/ hadoop-1.2.1/jdk (so the whole service can be moved in one piece)
       Link: ln -s hadoop-1.2.1 hadoop (to make later upgrades easier)
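The symlink is what makes later upgrades cheap: every path in these notes goes through /root/hadoop, so an upgrade only has to re-point the link. A minimal sketch of that idea, assuming an ln that supports -n (GNU and BSD both do); switch_version and the directory name hadoop-new are hypothetical.

```shell
# switch_version BASE TARGET: re-point BASE/hadoop at BASE/TARGET.
# (switch_version is a hypothetical helper; it refuses to link to a missing directory.)
switch_version() {
    base=$1; target=$2
    [ -d "$base/$target" ] || { echo "no such directory: $base/$target" >&2; return 1; }
    # -n replaces the existing link itself instead of creating a link inside its target
    ( cd "$base" && ln -sfn "$target" hadoop )
}

# Usage (hypothetical directory name for a newer release):
#   switch_version /root hadoop-new
```

Stopping the services before switching and starting them afterwards is still up to the operator.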
   (3) Configure: vim hadoop/conf/hadoop-env.sh
            Content: export JAVA_HOME=/root/hadoop/jdk (line 9; the directory where Java is installed)
   (4) Test
       1. Input  *mkdir hadoop/input
            *cp /root/hadoop/conf/*.xml /root/hadoop/input/
            *ls /root/hadoop/input/
            Result: capacity-scheduler.xml fair-scheduler.xml hdfs-site.xml          mapred-site.xml
                 core-site.xml           hadoop-policy.xml   mapred-queue-acls.xml
       2. Output *cd /root/hadoop
            *bin/hadoop jar hadoop-examples-1.2.1.jar   (lists the supported example programs)
            Result: An example program must be given as the first argument.
                 Valid program names are:
                 aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
                 aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
                 dbcount: An example job that count the pageview counts from a database.
                 grep: A map/reduce program that counts the matches of a regex in the input. (filtering)
                 join: A job that effects a join over sorted, equally partitioned datasets
                 multifilewc: A job that counts words from several files.
                 pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
                 pi: A map/reduce program that estimates Pi using monte-carlo method.
                 randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
                 randomwriter: A map/reduce program that writes 10GB of random data per node.
                 secondarysort: An example defining a secondary sort to the reduce.
                 sleep: A job that sleeps at each map and reduce task.
                 sort: A map/reduce program that sorts the data written by the random writer.
                 sudoku: A sudoku solver.
                 teragen: Generate data for the terasort
                 terasort: Run the terasort
                 teravalidate: Checking results of terasort
                 wordcount: A map/reduce program that counts the words in the input files.

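            *bin/hadoop jar hadoop-examples-1.2.1.jar grep input/ output 'dfs[a-z.]+'   (extracts the strings in input/ that match the regex, i.e. start with dfs, into output; the output directory is created automatically)
            Result: 14/08/05 09:50:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0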
                 14/08/05 09:50:28 INFO mapred.JobClient:     Map output records=1

            *ls /root/hadoop/output/
            Result: part-00000 _SUCCESS
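What the grep example computes can be approximated locally with standard tools, which is handy for checking the job's output. A minimal sketch, assuming a grep that supports -o and -h (GNU and BSD both do); count_matches is a hypothetical helper, not part of Hadoop, and it only mimics the result format (count, then match), not MapReduce itself.

```shell
# count_matches REGEX FILE...: count each distinct regex match across the files,
# most frequent first, printed as "count<TAB>match".
count_matches() {
    regex=$1; shift
    grep -ohE "$regex" "$@" \
        | sort | uniq -c \
        | awk '{ print $1 "\t" $2 }' \
        | sort -k1,1nr -k2,2
}

# Usage, mirroring the Hadoop command above:
#   count_matches 'dfs[a-z.]+' /root/hadoop/input/*.xml
```

Comparing this with the contents of output/part-00000 is a quick sanity check of the standalone run.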

～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～
2. Pseudo-distributed mode

   /root/hadoop/bin/stop-all.sh

   (1) ssh (set up passwordless key authentication)
      *ssh-keygen (generates the key pair; just press Enter at each prompt)
           Result: Generating public/private rsa key pair.
                Enter file in which to save the key (/root/.ssh/id_rsa): Enter passphrase (empty for no passphrase):
                Enter same passphrase again:
                Your identification has been saved in /root/.ssh/id_rsa.
                Your public key has been saved in /root/.ssh/id_rsa.pub.
                The key fingerprint is:
                07:f2:66:76:2b:59:76:29:c9:d9:b1:50:0f:5a:e9:2d root@server149.example.com
                The key's randomart image is:
                +--[ RSA 2048]----+
                |            +.   |
                |           +.o   |
                |      . . o.... |
                |       o o =E+. |
                |        S X =.   |
                |       + * +     |
                |        o .      |
                |         .       |
                |                 |
                +-----------------+
          *ssh-copy-id server149.example.com
           Result: The authenticity of host 'server149.example.com (192.168.2.149)' can't be established.
                RSA key fingerprint is 66:9e:7c:f4:48:10:e3:20:59:1e:e9:44:35:32:42:14.
                Are you sure you want to continue connecting (yes/no)? yes                                      ***
                Warning: Permanently added 'server149.example.com,192.168.2.149' (RSA) to the list of known hosts.
                root@server149.example.com's password:                                                   (enter the password)
                Now try logging into the machine, with "ssh 'server149.example.com'", and check in:
                .ssh/authorized_keys
                to make sure we haven't added extra keys that you weren't expecting.

          *ssh-copy-id localhost
          Result: The authenticity of host 'localhost (::1)' can't be established.
               RSA key fingerprint is 66:9e:7c:f4:48:10:e3:20:59:1e:e9:44:35:32:42:14.
               Are you sure you want to continue connecting (yes/no)? yes                                ***
               Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
               Now try logging into the machine, with "ssh 'localhost'", and check in:
               .ssh/authorized_keys
               to make sure we haven't added extra keys that you weren't expecting.

          Test: ssh server149.example.com
              Result: Last login: Tue Aug 5 08:44:19 2014 from 192.168.2.1
          Exit: logout
              Result: Connection to server149.example.com closed.
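ssh-copy-id asks you to check .ssh/authorized_keys afterwards. A minimal sketch of that check, run on the target host: it tests whether the public key appears as a complete line in the authorized_keys file. key_authorized is a hypothetical helper and the paths in the usage comment are the defaults shown in the ssh-keygen output above.

```shell
# key_authorized PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Succeeds if the public key is present as a full line in authorized_keys.
key_authorized() {
    pub=$1; auth=$2
    [ -f "$pub" ] && [ -f "$auth" ] || return 1
    # -F: fixed string, -x: whole-line match, --: the key is data, not an option
    grep -qxF -- "$(cat "$pub")" "$auth"
}

# Usage:
#   key_authorized /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys && echo "key installed"
```

For a check from the client side, `ssh -o BatchMode=yes server149.example.com true` fails instead of prompting when key login does not work.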

       Reference: http://hadoop.apache.org/docs/r1.2.1/single_node_setup.html
   (2) Configuration files *vim /root/hadoop/conf/core-site.xml
               Content: <configuration>
                       <property>
                           <name>fs.default.name</name>
                                <value>hdfs://server149.example.com:9000</value>
                       </property>
                    </configuration>

               *vim /root/hadoop/conf/hdfs-site.xml
               Content: <configuration>
                       <property>
                          <name>dfs.replication</name>
                               <value>1</value>
                      </property>
                   </configuration>

               *vim /root/hadoop/conf/mapred-site.xml
               Content: <configuration>
                         <property>
                             <name>mapred.job.tracker</name>
                                 <value>server149.example.com:9001</value>
                        </property>
                    </configuration>
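After editing the three files it helps to read the values back before starting anything. A minimal sketch that pulls one property out of a *-site.xml file, assuming the simple one-tag-per-line layout used above; it is a line-based awk scan, not a real XML parser, and get_prop is a hypothetical helper.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
get_prop() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Usage:
#   get_prop /root/hadoop/conf/core-site.xml   fs.default.name
#   get_prop /root/hadoop/conf/hdfs-site.xml   dfs.replication
#   get_prop /root/hadoop/conf/mapred-site.xml mapred.job.tracker
```

An empty result means the property is missing or laid out differently from what the sketch expects.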

   (3) Name resolution: vim /etc/hosts
             Content: 192.168.2.149   server149.example.com

   (4) Format and start the services
        Format: */root/hadoop/bin/hadoop namenode -format
            Result: 14/08/05 10:22:08 INFO namenode.NameNode: STARTUP_MSG:
                /************************************************************
                STARTUP_MSG: Starting NameNode
                STARTUP_MSG:   host = server149.example.com/192.168.2.149
                STARTUP_MSG:   args = [-format]
                STARTUP_MSG:   version = 1.2.1
                STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
                STARTUP_MSG:   java = 1.6.0_32
                ************************************************************/
                14/08/05 10:22:08 INFO util.GSet: Computing capacity for map BlocksMap
                14/08/05 10:22:08 INFO util.GSet: VM type       = 64-bit
                14/08/05 10:22:08 INFO util.GSet: 2.0% max memory = 1013645312
                14/08/05 10:22:08 INFO util.GSet: capacity      = 2^21 = 2097152 entries
                14/08/05 10:22:08 INFO util.GSet: recommended=2097152, actual=2097152
                14/08/05 10:22:08 INFO namenode.FSNamesystem: fsOwner=root
                14/08/05 10:22:08 INFO namenode.FSNamesystem: supergroup=supergroup
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isPermissionEnabled=true
                14/08/05 10:22:08 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
                14/08/05 10:22:08 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
                14/08/05 10:22:08 INFO namenode.NameNode: Caching file names occuring more than 10 times
                14/08/05 10:22:09 INFO common.Storage: Image file /tmp/hadoop-root/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
                14/08/05 10:22:09 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
                14/08/05 10:22:09 INFO namenode.NameNode: SHUTDOWN_MSG:
                /************************************************************
                SHUTDOWN_MSG: Shutting down NameNode at server149.example.com/192.168.2.149
                ************************************************************/
            */root/hadoop/bin/start-all.sh
            Result: namenode running as process 2504. Stop it first.
                 localhost: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server149.example.com.out
                 localhost: starting secondarynamenode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-secondarynamenode-server149.example.com.out
                 jobtracker running as process 2674. Stop it first.
                 localhost: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server149.example.com.out

   (5) Check: /root/hadoop/jdk/bin/jps
            Result: 3183 TaskTracker
                 2674 JobTracker
                 3035 SecondaryNameNode
                 2932 DataNode
                 2504 NameNode
                 3276 Jps
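In pseudo-distributed mode all five daemons should appear in the jps listing. A minimal sketch that reads jps output on standard input and prints the names of any expected daemons that are missing; missing_daemons is a hypothetical helper.

```shell
# missing_daemons NAME...: read jps output on stdin, print each NAME not found in it.
missing_daemons() {
    out=$(cat)
    for d in "$@"; do
        # jps prints "<pid> <class name>"; compare the second field exactly
        printf '%s\n' "$out" \
            | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
            || printf '%s\n' "$d"
    done
}

# Usage:
#   /root/hadoop/jdk/bin/jps | missing_daemons NameNode DataNode SecondaryNameNode JobTracker TaskTracker
```

No output means every listed daemon is running. On the master of a fully distributed cluster the expected list shrinks to NameNode, SecondaryNameNode and JobTracker.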

Testing the pseudo-distributed setup
(1) Web monitoring: http://192.168.2.149:50070/dfshealth.jsp
              http://192.168.2.149:50030/jobtracker.jsp
(2) Test:   rm -fr /root/hadoop/output/
        *Create a directory: test
       /root/hadoop/bin/hadoop fs -mkdir test
         Check: /root/hadoop/bin/hadoop fs -ls
              Result: Found 1 items
                   drwxr-xr-x   - root supergroup          0 2014-08-05 11:13 /user/root/test
         *Upload: into the test directory
       /root/hadoop/bin/hadoop fs -put /root/hadoop/conf/*.xml test
         Check: /root/hadoop/bin/hadoop fs -ls test
              Result: Found 7 items
                   -rw-r--r--   1 root supergroup       7457 2014-08-05 11:15 /user/root/test/capacity-scheduler.xml
                   -rw-r--r--   1 root supergroup        348 2014-08-05 11:15 /user/root/test/core-site.xml
                   -rw-r--r--   1 root supergroup        327 2014-08-05 11:15 /user/root/test/fair-scheduler.xml
                   -rw-r--r--   1 root supergroup       4644 2014-08-05 11:15 /user/root/test/hadoop-policy.xml
                   -rw-r--r--   1 root supergroup        316 2014-08-05 11:15 /user/root/test/hdfs-site.xml
                   -rw-r--r--   1 root supergroup       2033 2014-08-05 11:15 /user/root/test/mapred-queue-acls.xml
                   -rw-r--r--   1 root supergroup        344 2014-08-05 11:15 /user/root/test/mapred-site.xml
        *Run the job: cd /root/hadoop
             bin/hadoop jar hadoop-examples-1.2.1.jar grep test output 'dfs[a-z.]+'
             Result: 14/08/05 11:16:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
                  14/08/05 11:16:57 WARN snappy.LoadSnappy: Snappy native library not loaded
                  14/08/05 11:16:57 INFO mapred.FileInputFormat: Total input paths to process : 7
                  14/08/05 11:16:58 INFO mapred.JobClient: Running job: job_201408051022_0001
                  14/08/05 11:16:59 INFO mapred.JobClient: map 0% reduce 0% ......
                  14/08/05 11:23:34 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1445437440
                  14/08/05 11:23:34 INFO mapred.JobClient:     Map output records=2
        *Check: /root/hadoop/bin/hadoop fs -ls
             Result: Found 2 items
                  drwxr-xr-x   - root supergroup          0 2014-08-05 11:23 /user/root/output
                  drwxr-xr-x   - root supergroup          0 2014-08-05 11:15 /user/root/test

              /root/hadoop/bin/hadoop fs -cat output/*
              Result: 1   dfs.replication
                   1   dfsadmin
                   cat: File does not exist: /user/root/output/_logs

         *Download: /root/hadoop/bin/hadoop fs -get output test (copies to the local filesystem)
              Check: ll -d /root/hadoop/test/
                  Result: drwxr-xr-x 3 root root 4096 Aug 5 11:27 /root/hadoop/test/

                  cat /root/hadoop/test/*
                  Result: cat: /root/hadoop/test/_logs: Is a directory
                      1   dfs.replication
                      1   dfsadmin
        *Delete: rm -fr /root/hadoop/test/
              /root/hadoop/bin/hadoop fs -rmr output
              Result: Deleted hdfs://server149.example.com:9000/user/root/output
              Check: /root/hadoop/bin/hadoop fs -ls
                  Result: Found 1 items
                       drwxr-xr-x   - root supergroup          0 2014-08-05 11:15 /user/root/test

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     master                      slaves
HDFS                 namenode                    datanode
MapReduce            jobtracker                  tasktracker

mfs: data storage only
Hadoop: storage (HDFS) and computation (MapReduce).

～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～

3. Fully distributed mode (services split across hosts, nodes added)

1. Master (149)
(1) Stop services */root/hadoop/bin/stop-all.sh
              Result: stopping jobtracker
                   localhost: stopping tasktracker
                   stopping namenode
                   localhost: stopping datanode
                   localhost: stopping secondarynamenode
             */root/hadoop/jdk/bin/jps
             Result: 8675 Jps
(2) Name resolution   vim /etc/hosts (needed on every host)
             Content: 192.168.2.125   server125.example.com
                  192.168.2.150   server150.example.com
                  192.168.2.149   server149.example.com
(3) ssh    ssh-copy-id server150.example.com
             ssh-copy-id server125.example.com

(4) Configuration files: vim /root/hadoop/conf/hdfs-site.xml
             Content: <value>2</value> (line 9, the dfs.replication value)

             vim /root/hadoop/conf/masters
             Content: server149.example.com

             vim /root/hadoop/conf/slaves
             Content: server150.example.com
                  server125.example.com
(5) Copy to the new nodes (then create the symlinks on each new node)
             scp -r /root/hadoop-1.2.1 server150.example.com:
             scp -r /root/hadoop-1.2.1 server125.example.com:
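With more than a couple of nodes, the copy step is easier to drive from the slaves file than to type per host. A minimal dry-run sketch: it reads host names from a slaves file and prints the scp commands instead of running them. plan_copy is a hypothetical helper; skipping blank lines and lines starting with # is a choice made by this sketch.

```shell
# plan_copy SLAVES_FILE SOURCE_DIR: print one scp command per host in SLAVES_FILE.
plan_copy() {
    slaves=$1; src=$2
    while IFS= read -r host; do
        case $host in
            ''|'#'*) continue ;;        # skip blank lines and comments
        esac
        printf 'scp -r %s %s:\n' "$src" "$host"
    done < "$slaves"
}

# Usage: review the plan, then pipe it to sh to execute it.
#   plan_copy /root/hadoop/conf/slaves /root/hadoop-1.2.1
#   plan_copy /root/hadoop/conf/slaves /root/hadoop-1.2.1 | sh
```

Printing first keeps the step reviewable; passwordless ssh to every listed host is assumed to be in place already.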
(6) Format: /root/hadoop/bin/hadoop namenode -format
            Result: 14/08/05 10:22:08 INFO namenode.NameNode: STARTUP_MSG:
                /************************************************************
                STARTUP_MSG: Starting NameNode
                STARTUP_MSG:   host = server149.example.com/192.168.2.149
                STARTUP_MSG:   args = [-format]
                STARTUP_MSG:   version = 1.2.1
                STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
                STARTUP_MSG:   java = 1.6.0_32
                ************************************************************/
                14/08/05 10:22:08 INFO util.GSet: Computing capacity for map BlocksMap
                14/08/05 10:22:08 INFO util.GSet: VM type       = 64-bit
                14/08/05 10:22:08 INFO util.GSet: 2.0% max memory = 1013645312
                14/08/05 10:22:08 INFO util.GSet: capacity      = 2^21 = 2097152 entries
                14/08/05 10:22:08 INFO util.GSet: recommended=2097152, actual=2097152
                14/08/05 10:22:08 INFO namenode.FSNamesystem: fsOwner=root
                14/08/05 10:22:08 INFO namenode.FSNamesystem: supergroup=supergroup
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isPermissionEnabled=true
                14/08/05 10:22:08 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
                14/08/05 10:22:08 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
                14/08/05 10:22:08 INFO namenode.NameNode: Caching file names occuring more than 10 times
                14/08/05 10:22:09 INFO common.Storage: Image file /tmp/hadoop-root/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
                14/08/05 10:22:09 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
                14/08/05 10:22:09 INFO namenode.NameNode: SHUTDOWN_MSG:
                /************************************************************
                SHUTDOWN_MSG: Shutting down NameNode at server149.example.com/192.168.2.149
                ************************************************************/

(7) Start services: /root/hadoop/bin/start-all.sh
             Result: starting namenode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-namenode-server149.example.com.out
                  server125.example.com: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server125.example.com.out
                  server150.example.com: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server150.example.com.out
                  server149.example.com: starting secondarynamenode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-secondarynamenode-server149.example.com.out
                  starting jobtracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-jobtracker-server149.example.com.out
                  server150.example.com: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server150.example.com.out
                  server125.example.com: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server125.example.com.out

(8) Check services: /root/hadoop/jdk/bin/jps
             Result: 5721 JobTracker
                  5643 SecondaryNameNode
                  5832 Jps
                  5479 NameNode

(9) Test   */root/hadoop/bin/hadoop fs -put /root/hadoop/conf/ test
             */root/hadoop/bin/hadoop fs -ls
             Result: Found 1 items
                  drwxr-xr-x   - root supergroup          0 2014-08-05 11:54 /user/root/test
             */root/hadoop/bin/hadoop fs -ls test
             Result: Found 17 items
                  -rw-r--r--   2 root supergroup       7457 2014-08-05 11:54 /user/root/test/capacity-scheduler.xml
                  -rw-r--r--   2 root supergroup       1095 2014-08-05 11:54 /user/root/test/configuration.xsl
                  -rw-r--r--   2 root supergroup        348 2014-08-05 11:54 /user/root/test/core-site.xml
                  -rw-r--r--   2 root supergroup        327 2014-08-05 11:54 /user/root/test/fair-scheduler.xml
                  -rw-r--r--   2 root supergroup       2428 2014-08-05 11:54 /user/root/test/hadoop-env.sh
                  -rw-r--r--   2 root supergroup       2052 2014-08-05 11:54 /user/root/test/hadoop-metrics2.properties
                  -rw-r--r--   2 root supergroup       4644 2014-08-05 11:54 /user/root/test/hadoop-policy.xml
                  -rw-r--r--   2 root supergroup        316 2014-08-05 11:54 /user/root/test/hdfs-site.xml
                  -rw-r--r--   2 root supergroup       5018 2014-08-05 11:54 /user/root/test/log4j.properties
                  -rw-r--r--   2 root supergroup       2033 2014-08-05 11:54 /user/root/test/mapred-queue-acls.xml
                  -rw-r--r--   2 root supergroup        344 2014-08-05 11:54 /user/root/test/mapred-site.xml
                  -rw-r--r--   2 root supergroup         22 2014-08-05 11:54 /user/root/test/masters
                  -rw-r--r--   2 root supergroup         44 2014-08-05 11:54 /user/root/test/slaves
                  -rw-r--r--   2 root supergroup       2042 2014-08-05 11:54 /user/root/test/ssl-client.xml.example
                  -rw-r--r--   2 root supergroup       1994 2014-08-05 11:54 /user/root/test/ssl-server.xml.example
                  -rw-r--r--   2 root supergroup       3890 2014-08-05 11:54 /user/root/test/task-log4j.properties
                  -rw-r--r--   2 root supergroup        382 2014-08-05 11:54 /user/root/test/taskcontroller.cfg
             */root/hadoop/bin/hadoop jar /root/hadoop/hadoop-examples-1.2.1.jar wordcount test output
             */root/hadoop/bin/hadoop fs -cat output/*
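The wordcount result can be cross-checked against a local count of the same files. A minimal sketch that splits on whitespace and prints word, tab, count, sorted by word; local_wordcount is a hypothetical helper and only approximates the example job (it runs on local files, not on HDFS).

```shell
# local_wordcount FILE...: print "word<TAB>count" for every whitespace-separated word.
local_wordcount() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | sed '/^$/d' \
        | sort | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Usage: count the same files that were uploaded to the test directory.
#   local_wordcount /root/hadoop/conf/*
```

Sort order can differ from the job's output depending on the locale, so compare individual counts rather than diffing the two listings blindly.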

2. Added node (150)
(1) Name resolution   vim /etc/hosts (needed on every host)
             Content: 192.168.2.125   server125.example.com
                  192.168.2.150   server150.example.com
                  192.168.2.149   server149.example.com
(2) Links     ln -s hadoop-1.2.1/ hadoop
               ln -s /root/hadoop/jdk/bin/jps /usr/local/sbin/
(3) Check services: /root/hadoop/jdk/bin/jps
              Result: 1459 Jps
                   1368 TaskTracker
                   1300 DataNode

3. Added node (125)
(1) Name resolution   vim /etc/hosts (needed on every host)
             Content: 192.168.2.125   server125.example.com
                  192.168.2.150   server150.example.com
                  192.168.2.149   server149.example.com
(2) Links     ln -s hadoop-1.2.1/ hadoop
               ln -s /root/hadoop/jdk/bin/jps /usr/local/sbin/
(3) Check services: /root/hadoop/jdk/bin/jps
              Result: 1291 DataNode
                   1355 TaskTracker
                   1448 Jps

～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～
Adding a node online

When not running as the superuser: the user account must already exist, and its uid must be the same.

1. Master: 192.168.2.149
   (1) Node configuration: vim /root/hadoop/conf/slaves
               Content: server126.example.com (added to the existing entries)
   (2) Name resolution: vim /etc/hosts
               Content: 192.168.2.126   server126.example.com
                    192.168.2.149   server149.example.com
                    192.168.2.150   server150.example.com
                    192.168.2.125   server125.example.com
   (3) ssh     ssh-copy-id server126.example.com
               ssh server126.example.com (logs in directly, without a password prompt)
   (4) Copy hadoop (afterwards, create the symlinks on the new node: steps 2.1 and 2.2)
               scp -r /root/hadoop-1.2.1 server126.example.com:
   (5) Large test data *dd if=/dev/zero of=/root/hadoop/data1.file bs=1M count=500
               Result: 500+0 records in
                    500+0 records out
                    524288000 bytes (524 MB) copied, 43.1638 s, 12.1 MB/s
              *dd if=/dev/zero of=/root/hadoop/data2.file bs=1M count=500
              *dd if=/dev/zero of=/root/hadoop/data3.file bs=1M count=500
   (6) Upload the data (afterwards, start the services on the new node and check the data: steps 2.3 and 2.4)
               /root/hadoop/bin/hadoop fs -mkdir data
               /root/hadoop/bin/hadoop fs -put /root/hadoop/data{1,2,3}.file data
   (7) Balance the data: /root/hadoop/bin/start-balancer.sh
               Result: starting balancer, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-balancer-server149.example.com.out
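The sizes involved are easy to verify with shell arithmetic. A minimal sketch, assuming the replication factor of 2 configured earlier in hdfs-site.xml; dfs_bytes is a hypothetical helper that multiplies file size, file count and replication.

```shell
# dfs_bytes MIB_PER_FILE FILES REPLICATION: raw bytes HDFS has to store.
dfs_bytes() {
    echo $(( $1 * 1024 * 1024 * $2 * $3 ))
}

dfs_bytes 500 1 1    # one dd file: 524288000, the byte count dd reports
dfs_bytes 500 3 2    # three files, two replicas each: 3145728000
```

The second figure is close to the "DFS Used: 3171262464" line in the dfsadmin report in these notes; the small remainder is presumably the earlier test files plus block metadata.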

2. New node: 192.168.2.126
(1) Links    ln -s hadoop-1.2.1/ hadoop
              ln -s /root/hadoop/jdk/bin/jps /usr/local/sbin/
(2) Name resolution    vim /etc/hosts
             Content: 192.168.2.126   server126.example.com
                  192.168.2.149   server149.example.com
(3) Start services */root/hadoop/bin/hadoop-daemon.sh start datanode
              Result: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server126.example.com.out

              */root/hadoop/bin/hadoop-daemon.sh start tasktracker
              Result: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server126.example.com.out

              *jps
              Result: 1714 TaskTracker
                   1783 Jps
                   1631 DataNode
(4) Check the data: /root/hadoop/bin/hadoop dfsadmin -report
              Result: Configured Capacity: 15568306176 (14.5 GB)
                   Present Capacity: 10721746944 (9.99 GB)
                   DFS Remaining: 7550484480 (7.03 GB)
                   DFS Used: 3171262464 (2.95 GB)
                   DFS Used%: 29.58%
                   Under replicated blocks: 0
                   Blocks with corrupt replicas: 0
                   Missing blocks: 0

                   -------------------------------------------------
                   Datanodes available: 3 (3 total, 0 dead)

                   Name: 192.168.2.125:50010
                   Decommission Status : Normal
                   Configured Capacity: 5189435392 (4.83 GB)
                   DFS Used: 1137459200 (1.06 GB)
                   Non DFS Used: 1615728640 (1.5 GB)
                   DFS Remaining: 2436247552(2.27 GB)
                   DFS Used%: 21.92%
                   DFS Remaining%: 46.95%
                   Last contact: Tue Aug 05 15:36:01 CST 2014

                   Name: 192.168.2.126:50010
                   Decommission Status : Normal
                   Configured Capacity: 5189435392 (4.83 GB)
                   DFS Used: 651132928 (620.97 MB)
                   Non DFS Used: 1615208448 (1.5 GB)
                   DFS Remaining: 2923094016(2.72 GB)
                   DFS Used%: 12.55%              *****
                   DFS Remaining%: 56.33%
                   Last contact: Tue Aug 05 15:36:01 CST 2014

                   Name: 192.168.2.150:50010
                   Decommission Status : Normal
                   Configured Capacity: 5189435392 (4.83 GB)
                   DFS Used: 1382670336 (1.29 GB)
                   Non DFS Used: 1615659008 (1.5 GB)
                   DFS Remaining: 2191106048(2.04 GB)
                   DFS Used%: 26.64%
                   DFS Remaining%: 42.22%
                   Last contact: Tue Aug 05 15:36:00 CST 2014
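To see at a glance how evenly the balancer has spread the data, the per-node usage can be pulled out of the report. A minimal sketch that reads `dfsadmin -report` output on standard input and prints one line per datanode; it assumes the report layout shown above, and used_by_node is a hypothetical helper.

```shell
# used_by_node: read a dfsadmin report on stdin, print "<datanode> <DFS Used%>" per node.
used_by_node() {
    awk '
        /^[[:space:]]*Name:/      { node = $2 }
        # the cluster-wide "DFS Used%" line comes before any Name line and is skipped
        /^[[:space:]]*DFS Used%:/ { if (node != "") print node, $3 }
    '
}

# Usage:
#   /root/hadoop/bin/hadoop dfsadmin -report | used_by_node
```

In the report above this would show the new node (192.168.2.126) at 12.55%, still below the two older nodes.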

～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～～
Removing a node online
   (1) Edit the configuration *vim /root/hadoop/conf/mapred-site.xml
               Content: <configuration>
                      <property>
                          <name>mapred.job.tracker</name>
                               <value>server149.example.com:9001</value>
                      </property>

                      <property>         (add this block; dfs.hosts.exclude is an HDFS property, so hdfs-site.xml is its usual home)
                          <name>dfs.hosts.exclude</name>
                               <value>/root/hadoop/conf/exclude-host</value>
                      </property>
                   </configuration>

               *vim /root/hadoop/conf/exclude-host
               Content: server150.example.com (the host to remove)

   (2) Refresh the nodes: /root/hadoop/bin/hadoop dfsadmin -refreshNodes
   (3) Check:   /root/hadoop/bin/hadoop dfsadmin -report
               Result: Configured Capacity: 15568306176 (14.5 GB)
                    Present Capacity: 10594420391 (9.87 GB)
                    DFS Remaining: 7300513792 (6.8 GB)
                    DFS Used: 3293906599 (3.07 GB)
                    DFS Used%: 31.09%
Under replicated blocks: 20
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.2.125:50010
Decommission Status : Normal
Configured Capacity: 5189435392 (4.83 GB)
DFS Used: 1137459200 (1.06 GB)
Non DFS Used: 1674080256 (1.56 GB)
DFS Remaining: 2377895936(2.21 GB)
DFS Used%: 21.92%
DFS Remaining%: 45.82%
Last contact: Tue Aug 05 15:48:43 CST 2014

Name: 192.168.2.126:50010
Decommission Status : Normal
Configured Capacity: 5189435392 (4.83 GB)
DFS Used: 773777063 (737.93 MB)
Non DFS Used: 1684134233 (1.57 GB)
DFS Remaining: 2731524096(2.54 GB)
DFS Used%: 14.91%
DFS Remaining%: 52.64%
Last contact: Tue Aug 05 15:48:43 CST 2014

Name: 192.168.2.150:50010
Decommission Status : Decommission in progress   ***    (becomes Decommissioned once finished)
Configured Capacity: 5189435392 (4.83 GB)
DFS Used: 1382670336 (1.29 GB)
Non DFS Used: 1615671296 (1.5 GB)
DFS Remaining: 2191093760(2.04 GB)
DFS Used%: 26.64%
DFS Remaining%: 42.22%
Last contact: Tue Aug 05 15:48:42 CST 2014
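Decommissioning takes a while because the node's blocks must be re-replicated elsewhere first, so the status has to be checked repeatedly. A minimal sketch that extracts one node's Decommission Status from the report; it assumes the report layout shown above, and decom_status is a hypothetical helper.

```shell
# decom_status IP: read a dfsadmin report on stdin, print the Decommission Status of IP.
decom_status() {
    awk -v want="$1" '
        /^[[:space:]]*Name:/ { split($2, a, ":"); node = a[1] }   # strip the :50010 port
        /^[[:space:]]*Decommission Status/ && node == want {
            sub(/^[^:]*:[[:space:]]*/, "")                        # drop the "Decommission Status :" label
            print
            exit
        }
    '
}

# Usage: wait until the node is fully decommissioned before shutting it down.
#   until [ "$(/root/hadoop/bin/hadoop dfsadmin -report | decom_status 192.168.2.150)" = "Decommissioned" ]; do
#       sleep 30
#   done
```

The 30-second interval is arbitrary; once the status reads Decommissioned, the datanode on that host can be stopped.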
