HDFS Replica Placement Policy


2024-08-05 15:31:46

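Before a client writes a block to the DataNodes, it first contacts the NameNode, which selects the requested number of DataNodes to hold the replicas. The placement logic is defined by the abstract class BlockPlacementPolicy, and the default implementation is its subclass BlockPlacementPolicyDefault. That class has several chooseTarget() overloads, but they all end up calling the following method: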

  /**
   * This is not part of the public API but is used by the unit tests.
   */
  DatanodeDescriptor[] chooseTarget(int numOfReplicas,
                                    DatanodeDescriptor writer,
                                    List<DatanodeDescriptor> chosenNodes,
                                    HashMap<Node, Node> excludedNodes,
                                    long blocksize) {
    // numOfReplicas: number of replicas still to be chosen
    // clusterMap.getNumOfLeaves(): number of DataNodes (DNs) in the whole cluster
    if (numOfReplicas == 0 || clusterMap.getNumOfLeaves()==0) {
      return new DatanodeDescriptor[0];
    }

    // excludedNodes: DNs that must not be chosen (for example, DNs already selected)
    if (excludedNodes == null) {
      excludedNodes = new HashMap<Node, Node>();
    }

    int clusterSize = clusterMap.getNumOfLeaves();
    // total replicas = replicas already chosen + replicas requested
    int totalNumOfReplicas = chosenNodes.size()+numOfReplicas;
    if (totalNumOfReplicas > clusterSize) {    // more replicas than DNs in the cluster
      numOfReplicas -= (totalNumOfReplicas-clusterSize);
      totalNumOfReplicas = clusterSize;
    }

    // upper bound on how many DNs may be chosen from a single rack
    int maxNodesPerRack =
      (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;

    List<DatanodeDescriptor> results =
      new ArrayList<DatanodeDescriptor>(chosenNodes);
    for (DatanodeDescriptor node:chosenNodes) {
      // add localMachine and related nodes to excludedNodes
      addToExcludedNodes(node, excludedNodes);
      adjustExcludedNodes(excludedNodes, node);
    }

    // the client (writer) is not a DN in this cluster
    if (!clusterMap.contains(writer)) {
      writer=null;
    }

    boolean avoidStaleNodes = (stats != null && stats
        .shouldAvoidStaleDataNodesForWrite());

    // choose numOfReplicas DNs and return the local DN
    DatanodeDescriptor localNode = chooseTarget(numOfReplicas, writer,
        excludedNodes, blocksize, maxNodesPerRack, results, avoidStaleNodes);

    results.removeAll(chosenNodes);

    // sorting nodes to form a pipeline
    // (arrange the chosen DNs, i.e. the elements of results, into a pipeline)
    return getPipeline((writer==null)?localNode:writer,
                       results.toArray(new DatanodeDescriptor[results.size()]));
  }

The method does roughly what its comments say, but pay attention to what each variable means: numOfReplicas counts only the replicas still to be chosen, while totalNumOfReplicas also includes the nodes already in chosenNodes. Near the end, under the comment "choose numOfReplicas DNs and return the local DN", the method calls another chooseTarget() overload. That overload selects the requested number of DNs, stores them in results, and returns one DN to serve as the local node. Because results was seeded with chosenNodes, those entries are removed again before the pipeline is built. The overload is analyzed next.
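The two pieces of arithmetic in the listing (capping the replica count at the cluster size, and deriving maxNodesPerRack) are easy to misread, so here is a minimal standalone sketch that reproduces only those formulas. The class ReplicaMath and its method names are hypothetical and are not part of Hadoop; the formulas themselves are copied from the listing above.

```java
/** Hypothetical helper that mirrors the arithmetic in chooseTarget(); not part of Hadoop. */
final class ReplicaMath {
    private ReplicaMath() {}

    /**
     * Caps the total replica count at the number of DataNodes in the cluster.
     * Returns {numOfReplicas, totalNumOfReplicas} after the adjustment.
     */
    static int[] clampReplicas(int alreadyChosen, int numOfReplicas, int clusterSize) {
        int total = alreadyChosen + numOfReplicas;
        if (total > clusterSize) {
            // Only as many new replicas as there are DataNodes left to hold them.
            numOfReplicas -= (total - clusterSize);
            total = clusterSize;
        }
        return new int[] {numOfReplicas, total};
    }

    /** Same integer formula as in the listing: (total - 1) / racks + 2. */
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }

    public static void main(String[] args) {
        // One replica already placed, three more requested, but only 3 DataNodes exist.
        int[] r = clampReplicas(1, 3, 3);
        System.out.println(r[0] + " " + r[1]);      // prints "2 3"

        // Three replicas in total, spread over a cluster with 2 racks.
        System.out.println(maxNodesPerRack(3, 2));  // prints "3"
    }
}
```

Note that the division is integer division, so with 3 replicas and 3 or more racks the bound drops to 2 nodes per rack, while a single-rack cluster gets a bound of 4, which never constrains a 3-replica write.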
