
Learn ZYNQ(10) – zybo cluster word count

1. Environment

Spark: five Zybo boards; 192.168.1.1 is the master, the other four are slaves.

Hadoop: runs on 192.168.1.1 (with an external SanDisk drive attached).

2. Single-node Hadoop test

If you run into an out-of-memory error (screenshot omitted), add swap space as follows.

Check the current virtual memory:

free -m
cd /mnt
mkdir swap
cd swap/
# create a swap file (~1 GB)
dd if=/dev/zero of=swapfile bs=1024 count=1000000
# format the file as swap
mkswap swapfile
# activate the swap file
swapon swapfile
free -m
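Instead of eyeballing the `free -m` output, the new swap space can be confirmed programmatically. A minimal sketch, assuming a Linux-style /proc/meminfo format; the sample text below is illustrative, not captured from the Zybo board:

```python
# Sketch: parse /proc/meminfo-style text to confirm swap is active.
# On a real board you would read open("/proc/meminfo").read() instead.

def swap_total_kb(meminfo_text):
    """Return the SwapTotal value (in kB) from /proc/meminfo-formatted text."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    return 0

# Illustrative sample; real numbers will differ.
sample = """\
MemTotal:         508128 kB
MemFree:           12345 kB
SwapTotal:        999996 kB
SwapFree:         999996 kB
"""

print(swap_total_kb(sample))  # → 999996
```

A nonzero SwapTotal after `swapon swapfile` means the roughly 1,000,000 KiB file created above is in use.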

The test passes (screenshots omitted).

3. Spark + Hadoop test

SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh


MASTER=spark://192.168.1.1:7077 ./bin/pyspark

 

file = sc.textFile("hdfs://192.168.1.1:9000/in/file")
counts = file.flatMap(lambda line: line.split(" ")) \
             .map(lambda word: (word, 1)) \
             .reduceByKey(lambda a, b: a + b)
counts.saveAsTextFile("hdfs://192.168.1.1:9000/out/mycount")
counts.saveAsTextFile("/mnt/mycount")
counts.collect()
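What the flatMap/map/reduceByKey pipeline computes can be checked locally without the cluster. A pure-Python sketch of the same word count, on illustrative sample lines:

```python
# Local equivalent of the Spark word-count pipeline above:
# flatMap -> split lines into words; map + reduceByKey -> tally per word.
from collections import Counter

def word_count(lines):
    counts = Counter()
    for line in lines:
        counts.update(line.split(" "))
    return dict(counts)

lines = ["hello world", "hello zybo"]
print(word_count(lines))  # → {'hello': 2, 'world': 1, 'zybo': 1}
```

Useful for sanity-checking the cluster output on a small input file.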


Error 1:

java.net.ConnectException: Call From zynq/192.168.1.1 to spark1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

This happens because Hadoop was started as root, and Spark's remote operations on HDFS are rejected for lack of permission.

Fix: in a test environment you can simply disable HDFS permission checking. Open etc/hadoop/hdfs-site.xml and set the dfs.permissions property to false (the default is true):

<property>
        <name>dfs.permissions</name>
        <value>false</value>
</property>

 

4. Appendix: my configuration files

go.sh:

#! /bin/sh -
mount /dev/sda1 /mnt/
cd /mnt/swap/
swapon swapfile
free -m
cd /root/hadoop-2.4.0/
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
sbin/hadoop-daemon.sh start secondarynamenode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
sbin/mr-jobhistory-daemon.sh start historyserver
jps
# wait until the HDFS NameNode is listening on port 9000
while ! netstat -ntlp | grep -q 9000
do
    sleep 1
done
netstat -ntlp
echo hadoop start successfully
cd /root/spark-0.9.1-bin-hadoop2
SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh
jps
# wait until the Spark master is listening on port 7077
while ! netstat -ntlp | grep -q 7077
do
    sleep 1
done
netstat -ntlp
echo spark start successfully
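The netstat/grep polling in go.sh can be made more robust by attempting an actual TCP connection to the service port. A hedged sketch (not part of the original setup); the host and ports are the cluster values used in this article:

```python
# Sketch: wait until a service port accepts TCP connections, as an
# alternative to polling `netstat -ntlp | grep <port>` in go.sh.
import socket
import time

def wait_for_port(host, port, timeout=60.0):
    """Return True once a TCP connect to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(1.0)
    return False

# e.g. wait_for_port("192.168.1.1", 9000)  # HDFS NameNode
#      wait_for_port("192.168.1.1", 7077)  # Spark master
```

A successful connect proves the daemon is actually accepting clients, which a line in the netstat listing does not guarantee.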

/etc/hosts

#127.0.0.1      localhost       zynq
192.168.1.1     spark1          localhost       zynq
#192.168.1.1    spark1
192.168.1.2     spark2
192.168.1.3     spark3
192.168.1.4     spark4
192.168.1.5     spark5
192.168.1.100   sparkMaster
#::1            localhost ip6-localhost ip6-loopback

/etc/profile

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$PATH
export JAVA_HOME=/usr/lib/jdk1.7.0_55
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/root/hadoop-2.4.0
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
ifconfig eth2 hw ether 00:0a:35:00:01:01
ifconfig eth2 192.168.1.1/24 up

HADOOP_HOME/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

HADOOP_HOME/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/mnt/hadoop/tmp</value>
    </property>
</configuration>

HADOOP_HOME/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address</name>
        <value>192.168.1.1:9000</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/mnt/datanode</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/mnt/namenode</value>
    </property>
</configuration>
