Task — 7.1: Elasticity Task
In this article, I am going to discuss how we can integrate LVM with Hadoop to provide elasticity to DataNode storage, and how to automate the LVM workflow with a Python script.
INTRODUCTION:
Hadoop: Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.
HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on.
Logical Volume Management: LVM is a tool for managing logical volumes, which includes allocating disks, striping, mirroring, and resizing logical volumes.
So, what is a logical volume?
With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks.
The physical volumes are combined into volume groups. Since a physical volume cannot span over multiple drives, to span over more than one drive, create one or more physical volumes per drive.
The volume groups can be divided into logical volumes, which are assigned mount points, such as /home and /, and file system types, such as ext2 or ext3.
When “partitions” reach their full capacity, free space from the volume group can be added to the logical volume to increase the size of the partition. When a new hard drive is added to the system, it can be added to the volume group, and partitions that are logical volumes can be increased in size.
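To see this PV, VG, and LV hierarchy on a live system, the LVM reporting commands can be driven from Python. A minimal sketch, assuming the lvm2 tools are installed and the script runs as root (the show helper is our own):

import subprocess

def show(cmd):
    # Run an LVM reporting command and print its table output.
    print(subprocess.run(cmd, capture_output=True, text=True, check=True).stdout)

show(["pvs"])  # physical volumes and the VG each one belongs to
show(["vgs"])  # volume groups with their total and free size
show(["lvs"])  # logical volumes carved out of each VG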
Step 1: Start the NameNode Service
In this step, we start the NameNode service.
#hadoop-daemon.sh start namenode
Step 2: Add Hard Disk to the DataNode
Attach one new hard disk to the DataNode; the DataNode will share storage from this disk.
#fdisk -l
The new hard disk is 50GiB (device name: /dev/sdb).
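If we want the automation script to discover attached disks instead of reading fdisk -l by hand, lsblk gives a compact listing. A small sketch, assuming the new disk shows up as a whole disk (here /dev/sdb):

import subprocess

# -d: whole disks only (no partitions), -n: suppress the header line
out = subprocess.run(
    ["lsblk", "-dn", "-o", "NAME,SIZE,TYPE"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines():
    name, size, dev_type = line.split()
    print(f"/dev/{name}  {size}  {dev_type}")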
Step 3: Create the Physical Volume from /dev/sdb
#pvcreate /dev/sdb (create the PV)
#pvdisplay /dev/sdb (display the PV)
Now, we have to allocate this physical volume to a Volume Group.
Step 4: Create the Volume Group
#vgcreate dnvg /dev/sdb (create the VG and allocate the PV to it)
#vgdisplay dnvg
Step 5: Create Logical Volume of Size 30GiB
#lvcreate --size 30G --name dnlv dnvg
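Steps 3 to 5 are the natural unit to script together: one PV on the disk, one VG on the PV, one LV inside the VG. A minimal sketch using the same names and sizes as above (the run and create_lv helpers are our own, not part of LVM):

import subprocess

def run(cmd):
    # Echo the command, then run it; raise immediately on any failure.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def create_lv(disk, vg, lv, size):
    # PV on the disk -> VG on the PV -> LV carved out of the VG.
    run(["pvcreate", disk])
    run(["vgcreate", vg, disk])
    run(["lvcreate", "--size", size, "--name", lv, vg])

create_lv("/dev/sdb", "dnvg", "dnlv", "30G")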
Step 6: Format the Logical Volume
#mkfs.ext4 /dev/dnvg/dnlv
Step 7: Mount the Logical Volume with the DataNode directory
#mount /dev/dnvg/dnlv /dn
#df -h
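Steps 6 and 7 can be scripted the same way. A short sketch, assuming /dn is the directory configured as the DataNode storage directory (format_and_mount is our own helper):

import os
import subprocess

def format_and_mount(lv_path, mount_point):
    # Format the LV with ext4, make sure the mount point exists, then mount it.
    subprocess.run(["mkfs.ext4", lv_path], check=True)
    os.makedirs(mount_point, exist_ok=True)
    subprocess.run(["mount", lv_path, mount_point], check=True)

format_and_mount("/dev/dnvg/dnlv", "/dn")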
Step 8: Start the DataNode Service
#hadoop-daemon.sh start datanode
#hadoop dfsadmin -report
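To confirm from a script that the DataNode is now contributing the 30GiB volume, we can filter the report for its capacity lines. A hedged sketch: the exact wording of the report can vary between Hadoop versions, so matching on "Configured Capacity" is an assumption:

import subprocess

report = subprocess.run(
    ["hadoop", "dfsadmin", "-report"],
    capture_output=True, text=True, check=True,
).stdout

for line in report.splitlines():
    if "Configured Capacity" in line:
        print(line.strip())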
Now we have to increase the storage online (elastically: the storage grows without stopping the DataNode).
Step 9: Increase the Logical Volume Size
#lvextend --size +10G /dev/dnvg/dnlv
The logical volume size has been increased from 30GiB to 40GiB.
However, the file system mounted on the /dn directory still reports 30GiB.
Now, we have to resize the file system of the partition (logical volume) so that the new space becomes usable.
Step 10: Resize the File System on the Extended Logical Volume
#resize2fs /dev/dnvg/dnlv
Python App for Increasing the LV Size Dynamically
Python code:
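The original script is not reproduced here, so below is a minimal sketch of what such a script could look like, assuming the VG/LV names from the steps above and root privileges; the menu text, prompts, and helper names are our own illustration:

# lvm_elastic.py - sketch of the automation script.
# Assumes lvm2 is installed and /dev/dnvg/dnlv is already mounted on /dn
# (Steps 1 to 8 above).
import subprocess

LV_PATH = "/dev/dnvg/dnlv"   # logical volume created in Step 5 (assumed)
MOUNT_POINT = "/dn"          # DataNode directory from Step 7 (assumed)

def run(cmd):
    # Echo the command, then run it; raise on failure.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def extend(size_gib):
    # Grow the LV, then grow the ext4 file system online with resize2fs;
    # the DataNode keeps running throughout (Steps 9 and 10 above).
    run(["lvextend", "--size", f"+{size_gib}G", LV_PATH])
    run(["resize2fs", LV_PATH])

def main():
    print("1. Increase DataNode storage")
    print("2. Show current size of", MOUNT_POINT)
    choice = input("Enter choice: ").strip()
    if choice == "1":
        gib = int(input("How many GiB to add? ").strip())
        extend(gib)
    elif choice == "2":
        run(["df", "-h", MOUNT_POINT])
    else:
        print("Unknown choice")

if __name__ == "__main__":
    main()

As a design note, lvextend also accepts -r/--resizefs, which resizes the file system in the same command; the two-step form above mirrors Steps 9 and 10.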
This is how we can automate LVM with Hadoop by using a Python script.
Thank you!