Apache Accumulo

Accumulo on Hortonworks Sandbox

Accumulo is not included in the Ambari installation, so it has to be installed manually. If you want to do some development with it, the quickest way to get an instance up and running is the Hortonworks Sandbox. However, due to differences in installation procedures, getting this working isn’t quite as straightforward as it could be.

Here are some notes on the procedure to help you on your way.

Prerequisites:

Download the Hortonworks Sandbox and start it in your virtual machine manager; I’m using VirtualBox here. The networking settings are quite important too: I set the adapter to NAT so that the VM runs on a 10.0.2.0 network and the management web pages are accessed from your host at the http://127.0.0.1/ address. This keeps everything simple, and external repos can be accessed through the host’s internet connection.

Go to the Ambari management page (login is admin/admin) and verify that the processes we need are up and running: HDFS, MapReduce2, YARN and ZooKeeper. I also like to start Ambari Metrics and its collector so that I can see the activity, but it’s not required.
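If you prefer the command line to the web page, the same check can be sketched against the Ambari REST API. This is only a rough sketch: the port (8080) and the cluster name (Sandbox) are my assumptions and may differ on your VM, and `service_state` and `check_services` are hypothetical helper names.

```shell
# Assumptions: Ambari listens on port 8080 and the cluster is named "Sandbox".
AMBARI_URL="${AMBARI_URL:-http://127.0.0.1:8080}"
CLUSTER="${CLUSTER:-Sandbox}"

# Pull the first "state" value out of an Ambari JSON response read from stdin.
service_state() {
  sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Print "<service> <state>" for each service Accumulo depends on.
check_services() {
  for svc in HDFS MAPREDUCE2 YARN ZOOKEEPER; do
    state=$(curl -s -u admin:admin \
      "$AMBARI_URL/api/v1/clusters/$CLUSTER/services/$svc?fields=ServiceInfo/state" \
      | service_state)
    echo "$svc ${state:-UNKNOWN}"
  done
}
```

Calling `check_services` prints one line per service; a service that is up should report a state of STARTED.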

Procedure:

  • Log in to the sandbox via ssh (login is root/hadoop).
  • Accumulo is installed under /usr/hdp/2.2.4.2-2/accumulo/ (version numbers may differ).
  • Copy one of the example configuration sets to the root config directory. Select a configuration according to your memory constraints, but it should always be a standalone set, e.g.
  • Edit the file accumulo-env.sh and set the following variables accordingly.
    Uncomment the line which reads:
  • Edit the file accumulo-site.xml and change the value tags as below to hadoop. This is very important so that Accumulo can interact with ZooKeeper.
  • Now we have to change the accumulo user properties. Edit /etc/passwd and change:
    to
    Note that group 501 in this case is the hadoop group.
  • Create the home directory (you need to su - hdfs to run the hadoop commands).
  • Change permissions and ownership
  • Now you are ready to initialize Accumulo; this step writes the configuration information into ZooKeeper.
  • You will be asked to enter the instance name, which can be anything you like, and the secret, which must be hadoop.
  • You are now ready to start Accumulo.
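The command-line parts of the procedure above can be sketched roughly as follows. Treat it as a sketch under assumptions rather than the exact commands: the example set (`512MB/standalone`), the HDFS home directory (`/user/accumulo`), the `750` permissions and the function names are all my own choices for illustration. The edits to accumulo-env.sh, accumulo-site.xml and /etc/passwd are not covered here.

```shell
# Assumed locations and names; adjust them to match your sandbox.
ACCUMULO_HOME="${ACCUMULO_HOME:-/usr/hdp/2.2.4.2-2/accumulo}"
EXAMPLE_SET="${EXAMPLE_SET:-512MB/standalone}"     # assumed example set
HDFS_HOME_DIR="${HDFS_HOME_DIR:-/user/accumulo}"   # assumed HDFS home directory

# Copy the chosen example configuration set into the config directory.
copy_example_config() {
  cp "$ACCUMULO_HOME/conf/examples/$EXAMPLE_SET/"* "$ACCUMULO_HOME/conf/"
}

# Create the home directory in HDFS and hand it to the accumulo user.
# Run this as the hdfs user (su - hdfs). The 750 mode is an assumption.
create_hdfs_home() {
  hdfs dfs -mkdir -p "$HDFS_HOME_DIR" &&
  hdfs dfs -chown -R accumulo:hadoop "$HDFS_HOME_DIR" &&
  hdfs dfs -chmod -R 750 "$HDFS_HOME_DIR"
}

# Initialize the instance (prompts for instance name and secret), then start it.
init_and_start() {
  "$ACCUMULO_HOME/bin/accumulo" init &&
  "$ACCUMULO_HOME/bin/start-all.sh"
}
```

The functions are meant to be run one at a time in the order shown, so that you can make the manual file edits between `copy_example_config` and `create_hdfs_home`.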


Congratulations, you have successfully installed and started Accumulo. You can now monitor your instance at http://127.0.0.1:50095/
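A quick way to confirm from the host that the monitor is answering is a small curl probe. This is a minimal sketch: `monitor_up` is a hypothetical helper name, and I am assuming the overview page responds with a plain HTTP 200.

```shell
# Return success if the monitor page answers with HTTP 200.
# $1 = monitor URL (defaults to the address used in this post)
monitor_up() {
  url="${1:-http://127.0.0.1:50095/}"
  code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
  [ "$code" = "200" ]
}
```

For example, `monitor_up && echo "monitor is up"` prints the message only once the page is reachable.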

Accumulo Overview Page


Troubleshooting:

If you see this exception during start-up:

This indicates that Accumulo doesn’t have sufficient permissions to write into ZooKeeper. Check that you have configured all the file and user permissions correctly, but above all verify that the secret in the accumulo-site.xml config file matches the value you entered at the init stage. It is perfectly safe to set this secret value again using:

You will be prompted for the original value and the new value that will get inserted into ZooKeeper.
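As a rough sketch of that check and reset: the first helper reads the configured secret out of accumulo-site.xml so you can compare it with what you typed at init, and the last one runs the secret-changing utility. I am assuming the property is named `instance.secret`, that the config lives under `$ACCUMULO_HOME/conf`, and that the utility class is `org.apache.accumulo.server.util.ChangeSecret`, as in the Accumulo 1.x releases; the function names are hypothetical.

```shell
ACCUMULO_HOME="${ACCUMULO_HOME:-/usr/hdp/2.2.4.2-2/accumulo}"

# Print the <value> of a named property from a Hadoop-style XML config file.
# $1 = path to the XML file, $2 = property name
site_property() {
  awk -v name="$2" '
    /<name>/ {
      line = $0
      sub(/.*<name>/, "", line); sub(/<\/name>.*/, "", line)
      found = (line == name)
    }
    /<value>/ && found {
      line = $0
      sub(/.*<value>/, "", line); sub(/<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$1"
}

# Show the secret Accumulo is configured with (compare it with the init value).
show_secret() {
  site_property "$ACCUMULO_HOME/conf/accumulo-site.xml" instance.secret
}

# Reset the secret stored in ZooKeeper; the utility prompts for old and new values.
change_secret() {
  "$ACCUMULO_HOME/bin/accumulo" org.apache.accumulo.server.util.ChangeSecret
}
```

Run `show_secret` first; if the value is not the one you entered at the init stage, either correct the config file or run `change_secret`.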
