Week8 Cloud
Week8 Cloud
Week -8
There are two ways to install Hadoop, i.e. Single node and Multi-node.
A single node cluster means only one DataNode running and setting up all the
NameNode, DataNode, ResourceManager, and NodeManager on a single machine.
This is used for studying and testing purposes. For example, let us consider a sample
data set inside the healthcare industry. So, for testing whether the Oozie jobs have
scheduled all the processes like collecting, aggregating, storing, and processing the
data in a proper sequence, we use a single node cluster. It can easily and efficiently
test the sequential workflow in a smaller environment as compared to large
environments which contain terabytes of data distributed across hundreds of
machines.
While in a Multi-node cluster, there are more than one DataNode running and each
DataNode is running on different machines. The multi-node cluster is practically used
in organizations for analyzing Big Data. Considering the above example, in real-time
when we deal with petabytes of data, it needs to be distributed across hundreds of
machines to be processed. Thus, here we use a multi-node cluster.
Prerequisites
Install Hadoop
Step 1: Click here to download the Java 8 Package. Save this file in your
home directory.
Command: wgethttps://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-
2.7.3.tar.gz
Step 5: Add the Hadoop and Java paths in the bash file (.bashrc).
Open. bashrc file. Now, add Hadoop and Java Path as shown below.
Learn more about the Hadoop Ecosystem and its tools with the Hadoop Certification.
Command: vi .bashrc
For applying all these changes to the current Terminal, execute the source command.
Command: cd hadoop-2.7.3/etc/hadoop/
Command: ls