How to Install Hadoop 2.7 on Ubuntu

Published May 17, 2019

Hadoop is an open-source cluster data management project sponsored by the Apache Software Foundation. It is a Java-based framework for storing and processing huge data sets across a cluster of machines.

Configuring a Hadoop cluster can seem tough at first, but you can also install Hadoop on a single machine to perform your basic operations.

Although Hadoop looks like a single piece of software, it has several components behind it. Here are some of them.

Hadoop Common - A collection of common utilities and libraries that support the other Hadoop modules.

HDFS (Hadoop Distributed File System) - The distributed file system that stores the data across the disks of the machines in the cluster.

YARN (Yet Another Resource Negotiator) - The resource management layer that allocates cluster resources and schedules jobs.

MapReduce - A programming model for processing big data sets on the cluster using parallel, distributed algorithms.
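
To make the MapReduce idea a little more concrete, here is a toy word count written as an ordinary shell pipeline. It is only an analogy running on one machine, not Hadoop code, and the sample input string is made up for illustration.

```shell
# Toy input standing in for a data set (hypothetical sample text).
input="big data big cluster"

# Map: emit one word per line.
# Shuffle: sort brings identical words next to each other.
# Reduce: uniq -c counts each group; awk prints "word count".
counts=$(printf '%s\n' "$input" | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2, $1}')
echo "$counts"
```

On a real cluster, Hadoop runs the map and reduce phases on many machines in parallel and handles the sorting and grouping between them.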

In this article, we will learn to install and configure Hadoop 2.7.x on Ubuntu. Follow the steps below to install Hadoop 2.7.

Prerequisites
If you are on Windows or macOS, first create a virtual machine with VMware Player or Oracle VirtualBox, install Ubuntu in it, and then install Hadoop 2.7 inside that Ubuntu machine.

Step I: Install Oracle Java version 8

1. Install the python-software-properties package
Hadoop 1.png

2. Add the Java repository
Hadoop 2.png

3. Update the package source list
Hadoop 3.png

4. Install Oracle Java 8
Hadoop 4.png
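
As a rough sketch, the four steps above could look like the commands below. The package, repository, and installer names are assumptions based on a commonly used setup and are not taken from the original screenshots. The PPA in particular may no longer be maintained, so check what is available for your Ubuntu release.

```shell
# 1. Install the package that provides add-apt-repository.
#    Assumption: newer Ubuntu releases ship it as software-properties-common
#    instead of python-software-properties.
sudo apt-get install -y python-software-properties

# 2. Add a repository that packages the Oracle Java 8 installer.
#    Assumption: this PPA name is illustrative and may be discontinued.
sudo add-apt-repository ppa:webupd8team/java

# 3. Refresh the package lists so the new repository is picked up.
sudo apt-get update

# 4. Install Oracle Java 8, then confirm which Java version is active.
sudo apt-get install -y oracle-java8-installer
java -version
```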

Step II: Set Up Passwordless SSH

1. Install the OpenSSH client and server
Hadoop 5.png

2. Generate a private and public key pair
Hadoop 6.png

3. Configure passwordless SSH
Hadoop 7.png

4. Test the SSH connection to localhost
Hadoop 8.png
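
The SSH steps above can be sketched as follows. This assumes the default key location ~/.ssh/id_rsa and an empty passphrase, which is convenient for a single-machine test setup but not something to copy to a shared server.

```shell
# 1. Install the OpenSSH client and server.
sudo apt-get install -y openssh-client openssh-server

# 2. Generate an RSA key pair with an empty passphrase (-P "").
#    Assumption: no key exists yet at ~/.ssh/id_rsa.
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

# 3. Authorize the new public key for logins to this machine.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# 4. Check that logging in to localhost no longer asks for a password.
ssh localhost
```

Type exit to leave the SSH session once the login succeeds.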

Step III: Configuration, Setup, and Installation of Hadoop

1. Download Hadoop

https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.g

2. Extract the tarball
After extraction, all the required JARs, scripts, and configuration files are available in the Hadoop installation directory (HADOOP_HOME).
Hadoop 9.png
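
A minimal sketch of the download and extraction is shown below. It assumes the archive is named hadoop-2.7.1.tar.gz and that you want to extract it into your home directory; adjust both to your setup.

```shell
# Assumption: the archive file name ends in .tar.gz.
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz

# Extract into the home directory, which creates ~/hadoop-2.7.1.
tar -xzf hadoop-2.7.1.tar.gz -C "$HOME"

# The bin, sbin, etc and share directories should now be listed.
ls "$HOME/hadoop-2.7.1"
```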

3. Configuration Setup
Edit the .bashrc file
Add the following environment variables to the .bashrc file in the user's home directory. The variables take effect once you restart the terminal.
Hadoop 10.png
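
The lines added to .bashrc might look like the sketch below. Both paths are assumptions: it supposes Hadoop was extracted to ~/hadoop-2.7.1 and that Oracle Java 8 lives in /usr/lib/jvm/java-8-oracle.

```shell
# Assumption: location of the Java 8 installation.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

# Assumption: Hadoop was extracted into the home directory.
export HADOOP_HOME="$HOME/hadoop-2.7.1"
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"

# Make the hadoop/hdfs commands (bin) and the start/stop scripts (sbin) available.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

Instead of restarting the terminal, you can also reload the file in the current session with source ~/.bashrc.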

Edit hadoop-env.sh
Edit the hadoop-env.sh file located in etc/hadoop inside the Hadoop installation directory and set JAVA_HOME:
Hadoop 11.png
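
For example, the JAVA_HOME line in hadoop-env.sh could be set as below. The path is an assumption; use the directory where Java 8 is actually installed on your machine.

```shell
# In etc/hadoop/hadoop-env.sh
# Assumption: Oracle Java 8 is installed under /usr/lib/jvm/java-8-oracle.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
```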

Edit XML file (core-site.xml)
Edit the core-site.xml file located in etc/hadoop inside the Hadoop installation directory and add the given entries:
Hadoop 12.png
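
A typical single-machine core-site.xml only needs to tell Hadoop where the default file system lives. The sketch below writes such a file from the shell. The host and port (localhost:9000) and the fallback install path are assumptions, and the command replaces any existing content of the file.

```shell
# Assumption: HADOOP_HOME is set; otherwise fall back to ~/hadoop-2.7.1.
CONF_DIR="${HADOOP_HOME:-$HOME/hadoop-2.7.1}/etc/hadoop"
mkdir -p "$CONF_DIR"   # does nothing if the directory already exists

# fs.defaultFS is the address clients use to reach the NameNode.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
```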

Edit XML file (hdfs-site.xml)
Edit the hdfs-site.xml file located in etc/hadoop inside the Hadoop installation directory and add the given entries:
Hadoop 13.png
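
On a single machine there is only one DataNode, so a common choice is to lower the block replication factor to 1. The sketch below assumes that setting and the same fallback install path as before, and it replaces any existing content of the file.

```shell
# Assumption: HADOOP_HOME is set; otherwise fall back to ~/hadoop-2.7.1.
CONF_DIR="${HADOOP_HOME:-$HOME/hadoop-2.7.1}/etc/hadoop"
mkdir -p "$CONF_DIR"   # does nothing if the directory already exists

# dfs.replication controls how many copies of each block HDFS keeps.
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```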

Edit XML file (mapred-site.xml)
Edit the mapred-site.xml file located in etc/hadoop inside the Hadoop installation directory and add the given entries:
Hadoop 14.png
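
To run MapReduce jobs on YARN, mapred-site.xml usually sets the framework name. Hadoop 2.7 ships only a mapred-site.xml.template, so the sketch below simply creates the file. The fallback install path is an assumption, and any existing mapred-site.xml is overwritten.

```shell
# Assumption: HADOOP_HOME is set; otherwise fall back to ~/hadoop-2.7.1.
CONF_DIR="${HADOOP_HOME:-$HOME/hadoop-2.7.1}/etc/hadoop"
mkdir -p "$CONF_DIR"   # does nothing if the directory already exists

# mapreduce.framework.name=yarn makes MapReduce jobs run on YARN.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```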

Edit XML file (yarn-site.xml)
Edit the yarn-site.xml file located in etc/hadoop inside the Hadoop installation directory and add the given entries:
Hadoop 15.png
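
For MapReduce on YARN, the NodeManager needs the shuffle auxiliary service enabled. A minimal yarn-site.xml could be written as sketched below, again assuming the fallback install path and overwriting any existing content of the file.

```shell
# Assumption: HADOOP_HOME is set; otherwise fall back to ~/hadoop-2.7.1.
CONF_DIR="${HADOOP_HOME:-$HOME/hadoop-2.7.1}/etc/hadoop"
mkdir -p "$CONF_DIR"   # does nothing if the directory already exists

# The mapreduce_shuffle service moves map output to the reducers.
cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```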

Step IV: Start the Cluster

1. Format the NameNode
Format the NameNode only once, when you first install Hadoop; formatting it again erases the existing HDFS metadata.
Hadoop 16.png

2. Start the HDFS services
Hadoop 17.png

3. Start the YARN services
Hadoop 18.png

4. Check that the services have started
Hadoop 19.png
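
The four steps above can be sketched with the standard Hadoop scripts, assuming the bin and sbin directories of the installation are on your PATH.

```shell
# 1. Format the NameNode (first installation only; this wipes HDFS metadata).
hdfs namenode -format

# 2. Start the HDFS daemons (NameNode, DataNode, SecondaryNameNode).
start-dfs.sh

# 3. Start the YARN daemons (ResourceManager, NodeManager).
start-yarn.sh

# 4. List the running Java processes to confirm the daemons are up.
jps
```

If everything started correctly, jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager alongside Jps itself.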

Step V: Run MapReduce Jobs
Hadoop 20.png
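
One quick way to try a job is the examples JAR that ships with Hadoop. The sketch below runs its pi estimator and is not necessarily the job from the original screenshot; it assumes HADOOP_HOME points at a Hadoop 2.7.1 installation and that the cluster is running.

```shell
# Run the bundled "pi" example with 2 map tasks and 5 samples per map.
# Assumption: the JAR name matches the installed version (2.7.1 here).
hadoop jar "$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar" pi 2 5
```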

Step VI: Stopping the Cluster

1. Stop the HDFS services
Hadoop 21.png

2. Stop the YARN services
Hadoop 22.png
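
Stopping the cluster mirrors starting it. Assuming the sbin directory is on your PATH, the two steps could be run as:

```shell
# 1. Stop the HDFS daemons.
stop-dfs.sh

# 2. Stop the YARN daemons.
stop-yarn.sh
```

Running jps afterwards should no longer show the Hadoop daemons.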

Summing Up
That brings us to the end of this tutorial on installing Hadoop 2.7 on Ubuntu in just 15 minutes. We would like to hear your feedback on it. Keep learning!

Author Bio:
HP Morgan works as a tech analyst at TatvaSoft.com.au, a custom software and web development company in Australia. He loves to travel to natural places.

Discover and read more posts from Herman Morgan