Skip to content

Prerequisites

Objectives

  • Download a Hadoop virtual machine to use in VirtualBox
  • Configure the memory used by the virtual machine to not kill our machine
  • Start the virtual machine.

Setup the environment

Before working on the tutorial, we need a working Hadoop cluster. We are going to use:

  • VirtualBox to run the virtual machine.
  • Hortonworks Sandbox 2.5.0. It is less demanding in terms of resources than version 3+ and sufficient for this tutorial. Also version 3+ removes some minor features used through the tutorial like the Pig View.

Follow the given steps to import the machine in VirtualBox:

  • Download Hortonworks Sandbox 2.5.0 and unzip the appliance for VirtualBox.
  • Import the .ova file into VirtualBox. Don't start it yet if you want to configure it.
  • You may configure the VM to use more or less RAM depending on your machine, through the Configuration > System view. The recommended value is around 6-8 Go RAM, but you should get away with using 2-4 Go.
  • Start the virtual machine with the Start green arrow. This may take a few minutes.
  • If the virtual machine stops during startup, it is generally because you don't have enough resources. Try to open a process manager and kill some RAM-consuming processes, or lower the RAM needed by the virtual machine using the above step.
  • Open a web browser to http://localhost:8888 to be greeted with the Hortonworks Data Platform dashboard

Recap

  • We have imported the Hortonworks Data Platform into VirtualBox
  • We configured the RAM the Virtual Machine will use for the tutorial
  • We started the Virtual Machine and checked it runs correctly on our machine