YARN Scheduler Load Simulator

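    The project started because people wanted a simulated environment for the Fair Scheduler and the Capacity Scheduler, so that scheduler settings could be tuned sensibly and the risk of problems in production reduced; see https://issues.apache.org/jira/browse/YARN-1021. The tool was then added in Hadoop 2.3.0.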

    First, set the environment variables:

     export HADOOP_HOME=/usr/hadoop-2.3.0

     export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop # sls-runner.xml goes in this directory
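A quick sanity check of these two variables can save a confusing failure later. The sketch below assumes bash; check_sls_env is a hypothetical helper name used only for illustration, not something shipped with Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): verify that HADOOP_HOME points
# at a directory and that sls-runner.xml is present in YARN_CONF_DIR.
check_sls_env() {
  if [ -z "${HADOOP_HOME:-}" ] || [ ! -d "$HADOOP_HOME" ]; then
    echo "HADOOP_HOME is unset or not a directory" >&2
    return 1
  fi
  if [ -z "${YARN_CONF_DIR:-}" ] || [ ! -f "$YARN_CONF_DIR/sls-runner.xml" ]; then
    echo "sls-runner.xml not found in YARN_CONF_DIR" >&2
    return 1
  fi
  echo "ok"
}
```

Calling check_sls_env after the two export lines prints ok when both conditions hold, and otherwise reports which one failed on stderr.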

     The contents of sls-runner.xml are as follows:

<configuration>

  <!-- SLSRunner configuration -->
  <property>
    <name>yarn.sls.runner.pool.size</name>
    <value>100</value>
  </property>
  
  <!-- Nodes configuration -->
  <property>
    <name>yarn.sls.nm.memory.mb</name>
    <value>10240</value>
  </property>
  <property>
    <name>yarn.sls.nm.vcores</name>
    <value>10</value>
  </property>
  <property>
    <name>yarn.sls.nm.heartbeat.interval.ms</name>
    <value>1000</value>
  </property>
  
  <!-- Apps configuration -->
  <property>
    <name>yarn.sls.am.heartbeat.interval.ms</name>
    <value>1000</value>
  </property>
  <property>
    <name>yarn.sls.am.type.mapreduce</name>
    <value>org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator</value>
  </property>

  <!-- Containers configuration -->
  <property>
    <name>yarn.sls.container.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.sls.container.vcores</name>
    <value>1</value>
  </property>

  <!--  metrics  -->
  <property>
    <name>yarn.sls.metrics.switch</name>
    <value>ON</value>
  </property>
  <property>
    <name>yarn.sls.metrics.web.address.port</name>
    <value>10001</value>
  </property>
  <property>
    <name>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</name>
    <value>org.apache.hadoop.yarn.sls.scheduler.FifoSchedulerMetrics</value>
  </property>
  <property>
    <name>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</name>
    <value>org.apache.hadoop.yarn.sls.scheduler.FairSchedulerMetrics</value>
  </property>
  <property>
    <name>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</name>
    <value>org.apache.hadoop.yarn.sls.scheduler.CapacitySchedulerMetrics</value>
  </property>
  
</configuration>
  • yarn.sls.runner.pool.size

    The simulator uses a thread pool to simulate the running NMs and AMs; this parameter specifies the number of threads in the pool.

  • yarn.sls.nm.memory.mb

    The total memory for each NMSimulator.

  • yarn.sls.nm.vcores

    The total vCores for each NMSimulator.

  • yarn.sls.nm.heartbeat.interval.ms

    The heartbeat interval for each NMSimulator.

  • yarn.sls.am.heartbeat.interval.ms

    The heartbeat interval for each AMSimulator.

  • yarn.sls.am.type.mapreduce

    The AMSimulator implementation for MapReduce-like applications. Users can specify implementations for other types of applications.

  • yarn.sls.container.memory.mb

    The memory required for each container simulator.

  • yarn.sls.container.vcores

    The vCores required for each container simulator.

  • yarn.sls.metrics.switch

    The simulator introduces metrics to measure the behavior of critical components and operations. This field specifies whether metrics collection is turned on (ON) or off (OFF).

  • yarn.sls.metrics.web.address.port

    The port used by the simulator to provide real-time tracking. The default value is 10001.

  • org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler

    The scheduler metrics implementation for the FIFO Scheduler.

  • org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler

    The scheduler metrics implementation for the Fair Scheduler.

  • org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

    The scheduler metrics implementation for the Capacity Scheduler.
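
To see how the node and container settings interact, the following sketch computes a rough upper bound on the number of simulated containers one simulated node can hold. containers_per_node is a hypothetical helper; it assumes every container uses the default size from sls-runner.xml, and whether vCores are actually enforced depends on how the scheduler under test is configured.

```shell
# Hypothetical helper: upper bound on containers per simulated node.
# Arguments: NM memory (MB), NM vCores, container memory (MB), container vCores.
containers_per_node() {
  local nm_mem=$1 nm_vcores=$2 c_mem=$3 c_vcores=$4
  local by_mem=$(( nm_mem / c_mem ))          # limit imposed by memory
  local by_vcores=$(( nm_vcores / c_vcores )) # limit imposed by vCores
  # The tighter of the two limits wins.
  if [ "$by_mem" -lt "$by_vcores" ]; then
    echo "$by_mem"
  else
    echo "$by_vcores"
  fi
}

# Values from the sls-runner.xml above: 10240 MB and 10 vCores per node,
# 1024 MB and 1 vCore per container.
containers_per_node 10240 10 1024 1   # prints 10
```

With the sample values both limits are 10, so changing only one of the node or container settings makes that resource the bottleneck.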

Then use Apache Rumen to parse the job history files and generate JSON files that SLS can read:

hadoop jar hadoop-rumen-2.3.0.jar org.apache.hadoop.tools.rumen.TraceBuilder -write-job-trace file:///home/user/job-trace.json file:///home/user/topology.output file:///home/user/logs/history/done


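file:///home/user/logs/history/done is the directory holding the job history of completed jobs from your cluster. It usually lives in HDFS and can be copied to a local directory with hadoop fs -get.

file:///home/user/job-trace.json and file:///home/user/topology.output are the generated files that SLS will read.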
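
The two steps (copy the history out of HDFS, then run TraceBuilder) can be sketched as one bash function. build_rumen_trace and its three arguments are hypothetical names for illustration; the HDFS location of the done directory differs per cluster and has to be supplied by the user. The TraceBuilder invocation is the same as the one above, with paths derived from a working directory.

```shell
# Hypothetical wrapper around the two commands described above.
# $1: HDFS directory with finished job history (cluster specific)
# $2: local working directory, given as an absolute path
# $3: path to the hadoop-rumen jar
build_rumen_trace() {
  local hdfs_done_dir=$1 work_dir=$2 rumen_jar=$3
  mkdir -p "$work_dir" || return 1
  # Copy the finished job history from HDFS to the local file system.
  hadoop fs -get "$hdfs_done_dir" "$work_dir/done" || return 1
  # Parse the history and write the job trace and topology files.
  # work_dir is absolute, so "file://$work_dir" yields a file:/// URI.
  hadoop jar "$rumen_jar" org.apache.hadoop.tools.rumen.TraceBuilder \
    -write-job-trace \
    "file://$work_dir/job-trace.json" \
    "file://$work_dir/topology.output" \
    "file://$work_dir/done"
}

# Example call (all three values are placeholders):
#   build_rumen_trace <hdfs-history-done-dir> /home/user hadoop-rumen-2.3.0.jar
```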



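Run the simulator:

slsrun.sh --input-rumen=/home/user/job-trace.json --output-dir=/usr/sls/sample-result


--input-rumen takes the Rumen trace file generated above, which in this example is /home/user/job-trace.json (the file:///home/user/job-trace.json output of TraceBuilder).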


If the run fails with:

java.lang.NullPointerException
at org.apache.hadoop.yarn.sls.web.SLSWebApp.<init>(SLSWebApp.java:82)
at 

see the comments on https://issues.apache.org/jira/browse/YARN-1021. The fix is as follows:

bin/slsrun.sh --input-sls=sls-file/sls-jobs.json --output-dir=output_sls --nodes=sls-file/sls-nodes.json
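
The command above expects traces already in SLS format. One way to get them from the Rumen trace is the rumen2sls.sh script that ships next to slsrun.sh. The sketch below assumes that script accepts --rumen-file and --output-dir and writes sls-jobs.json and sls-nodes.json (its default file prefix), and that SLS lives under $HADOOP_HOME/share/hadoop/tools/sls; check these against the Hadoop version in use. run_sls_from_rumen and SLS_HOME are hypothetical names.

```shell
# Hypothetical wrapper: convert a Rumen trace to SLS format, then simulate.
# $1: Rumen trace file, $2: directory for the SLS input files,
# $3: directory for the simulation output.
run_sls_from_rumen() {
  local rumen_file=$1 sls_dir=$2 out_dir=$3
  # SLS_HOME is an assumption; override it if SLS is installed elsewhere.
  local sls_home=${SLS_HOME:-$HADOOP_HOME/share/hadoop/tools/sls}
  mkdir -p "$sls_dir" "$out_dir" || return 1
  # Step 1: Rumen trace -> sls-jobs.json and sls-nodes.json
  "$sls_home/bin/rumen2sls.sh" --rumen-file="$rumen_file" \
    --output-dir="$sls_dir" || return 1
  # Step 2: run the simulator on the converted files
  "$sls_home/bin/slsrun.sh" --input-sls="$sls_dir/sls-jobs.json" \
    --output-dir="$out_dir" --nodes="$sls_dir/sls-nodes.json"
}

# Example call with the paths used in this article:
#   run_sls_from_rumen /home/user/job-trace.json sls-file output_sls
```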