Spark Standalone Mode

It is very easy to install a Spark cluster (standalone mode). In my example, I used three machines, all running Ubuntu 12.04 32-bit. One machine is named "master", and the other two are named "node01" and "node02" respectively. The name of a machine can be set in /etc/hostname. Furthermore, I use the same user name on every node (machine).

 

1. On every node: Install Java and set the Java environment in ~/.bashrc as:

  #set java environment
  export JAVA_HOME=/usr/local/jdk1.7.0_67
  export JRE_HOME=$JAVA_HOME/jre
  export PATH=$JAVA_HOME/bin:$PATH
  export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

   Note that in my example, I used Java jdk1.7.0_67 and put it under /usr/local.
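
   To confirm the setup, reload ~/.bashrc and check that Java is found (a quick sanity check; the version printed depends on the JDK you actually installed):

  $ source ~/.bashrc
  $ java -version
  $ echo $JAVA_HOME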

2. On every node: Install Spark.

    Download any version of Spark from http://spark.apache.org/downloads.html. In my example, I chose spark-1.1.0-bin-hadoop2.4.tgz and extracted it to /usr/local.
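
    For example, the download and extraction might look like this (the archive URL below is an assumption; use whichever mirror link the downloads page gives you):

  $ wget https://archive.apache.org/dist/spark/spark-1.1.0/spark-1.1.0-bin-hadoop2.4.tgz
  $ sudo tar -xzf spark-1.1.0-bin-hadoop2.4.tgz -C /usr/local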

3. Set up ssh such that every two nodes in the cluster can ssh each other without a password. This step is also needed when you set up a hadoop cluster; there are abundant tutorials on the Internet, so the details are omitted here (a minimal sketch follows below).
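
    As a minimal sketch, assuming the default ~/.ssh paths and the same user name on all nodes (as noted above), run on each node:

  $ ssh-keygen -t rsa
  $ ssh-copy-id master
  $ ssh-copy-id node01
  $ ssh-copy-id node02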

4. On every node:

  $ sudo vim /etc/hosts

    and set the IP addresses of the nodes in the network. For example, I set the hosts file on every node to:

  127.0.0.1        localhost
  223.3.86.xxx  master
  223.3.81.xxx  node01
  223.3.70.xxx  node02
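
    A quick way to verify name resolution and the ssh setup together is to run, from any node:

  $ ping -c 1 node01
  $ ssh node02 hostname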

5. On master node: Enter the root folder of Spark and edit conf/slaves. In my example:

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4
  $ sudo vim conf/slaves

     Edit the slaves file to:

  master
  node01
  node02

     Each line names one worker host; since "master" is listed too, the master machine will run a Worker alongside the Master daemon.

6. On master node: Enter the root folder of Spark and start the Spark cluster.

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4
  $ sbin/start-all.sh
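
    You can check that the daemons came up with jps (shipped with the JDK): the master machine should list both a Master and a Worker process, while node01 and node02 should each list a Worker.

  $ jps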

7. Open http://master:8080/ in your web browser to monitor the cluster.
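
As a final smoke test, you can attach a spark-shell to the cluster and run a trivial job (spark://master:7077 assumes the standalone master's default port 7077):

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4
  $ bin/spark-shell --master spark://master:7077

Then, at the Scala prompt, a tiny computation should be distributed across the workers:

  scala> sc.parallelize(1 to 1000).sum()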
