Install a Spark standalone cluster on EC2

Update: this blog has migrated to Medium at https://medium.com/spark-experts. To keep getting new content, please subscribe there.

  1. Create an Amazon EC2 key pair and keep the downloaded private key (.pem) file; the commands below need it
  2. Set the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (see the export sketch after this list); the values can be obtained from the AWS console under Account > Security Credentials > Access Credentials
  3. Download Spark and extract it to a directory (see the download example after this list)
  4. Go to the ec2 folder inside the Spark directory and run the launch command shown after this list to create a new cluster named “Spark” with 1 master and 2 slave nodes of type m3.medium in the Singapore region. For more options, run ./spark-ec2 --help
  5. The installation takes a few minutes and prints the URL of the Spark web UI at the end of the process.
  6. To SSH to the Spark master, use the login command shown after this list
    Once on the master, the Spark shell is available at /root/spark/bin/spark-shell

  7. To stop the cluster, use the stop command shown below
  8. To start the cluster again, use the start command shown below
  9. To terminate the cluster permanently, use the destroy command shown below
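
For step 2, a minimal sketch of setting the credentials in the shell; the placeholder values are assumptions and must be replaced with your own keys:

    # replace the placeholders with your own AWS credentials
    export AWS_ACCESS_KEY_ID=<your-access-key-id>
    export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>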
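
For step 3, one possible download, assuming the Spark 1.6.2 package pre-built for Hadoop 2.6 (the exact version is an assumption; any release that still ships the ec2 folder will do):

    wget https://archive.apache.org/dist/spark/spark-1.6.2/spark-1.6.2-bin-hadoop2.6.tgz
    tar -xzf spark-1.6.2-bin-hadoop2.6.tgz
    cd spark-1.6.2-bin-hadoop2.6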
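
For step 4, the launch command looks roughly like this, assuming a key pair named spark-keypair with its .pem file in the home directory (both names are placeholders); Singapore corresponds to the ap-southeast-1 region:

    cd ec2
    # 1 master + 2 slaves (-s 2) of m3.medium in ap-southeast-1 (Singapore), cluster named "Spark"
    ./spark-ec2 -k spark-keypair -i ~/spark-keypair.pem -s 2 -t m3.medium -r ap-southeast-1 launch Spark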
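
For step 6, the spark-ec2 script has a login action that opens an SSH session to the master, with the same key pair assumptions as above:

    ./spark-ec2 -k spark-keypair -i ~/spark-keypair.pem -r ap-southeast-1 login Spark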
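
For step 7, the stop action shuts the instances down without deleting them; the region flag is needed because the cluster was launched outside the default region:

    ./spark-ec2 -r ap-southeast-1 stop Spark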
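
For step 8, the start action boots the stopped instances and re-runs the cluster setup over SSH, so it needs the key pair again:

    ./spark-ec2 -k spark-keypair -i ~/spark-keypair.pem -r ap-southeast-1 start Spark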
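
For step 9, destroy terminates the instances for good (the script asks for confirmation first), so anything stored on the cluster is lost:

    ./spark-ec2 -r ap-southeast-1 destroy Spark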
