Task — 7.1: Elasticity Task

In this article, I am going to discuss how we can integrate LVM with Hadoop to provide elasticity to the DataNode storage, and how to automate the LVM steps with a Python script.

Srasthy Chaudhary · Mar 14, 2021

INTRODUCTION:

Hadoop: Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.

HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on.

Logical Volume Management: LVM is a Linux storage-management tool that supports allocating disks into volumes, striping, mirroring, and resizing logical volumes.

So, what is logical volume?

With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks.

The physical volumes are combined into volume groups. Since a physical volume cannot span multiple drives, create one or more physical volumes per drive if a volume group needs to span more than one drive.

The volume groups can be divided into logical volumes, which are assigned mount points, such as /home and /, and file system types, such as ext2 or ext3.

When “partitions” reach their full capacity, free space from the volume group can be added to the logical volume to increase its size. When a new hard drive is added to the system, it can be added to the volume group, and the logical volumes carved from it can be grown.
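As a purely illustrative sketch of that hierarchy (the device and volume names here are hypothetical, not the ones used in the setup below), two disks pooled into one volume group and carved into a 20GiB logical volume would look like:

#pvcreate /dev/sdc /dev/sdd

#vgcreate myvg /dev/sdc /dev/sdd

#lvcreate --size 20G --name mylv myvg

#mkfs.ext4 /dev/myvg/mylv

#mount /dev/myvg/mylv /home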

Step 1: Start the NameNode Service

In this step, we start the NameNode service.

#hadoop-daemon.sh start namenode
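To verify the daemon actually came up, jps (shipped with the JDK) should list a NameNode process:

#jps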

Step 2: Add Hard Disk to the DataNode

Attach one new hard disk to the DataNode; storage from this disk will be shared with the cluster.

#fdisk -l

The new hard disk is 50GiB and appears as /dev/sdb.

Step 3: Create Physical Volume from /dev/sdb.

#pvcreate /dev/sdb (Create pv)

#pvdisplay /dev/sdb (display pv)

Now, we have to allocate this physical volume to some Volume Group.

Step 4: Create the Volume Group

#vgcreate dnvg /dev/sdb (create vg and allocate pv to it).

#vgdisplay dnvg

Step 5: Create Logical Volume of Size 30GiB

#lvcreate --size 30G --name dnlv dnvg

Step 6: Format the Logical Volume

#mkfs.ext4 /dev/dnvg/dnlv

Step 7: Mount the Logical Volume on the DataNode Directory

#mount /dev/dnvg/dnlv /dn

#df -h

Step 8: Start the DataNode Service

#hadoop-daemon.sh start datanode

#hadoop dfsadmin -report
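In the report, look for the Configured Capacity contributed by this DataNode: it should be roughly the 30GiB of the logical volume mounted on /dn. The exact output format varies by Hadoop version; illustratively:

Configured Capacity: 32212254720 (30 GB)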

Now we have to increase the storage online (elastically: the storage will grow without stopping the DataNode).

Step 9: Increase the Logical Volume Size

#lvextend --size +10G /dev/dnvg/dnlv

The logical volume size has been increased from 30GiB to 40GiB.

However, the file system mounted on the /dn directory still reports 30GiB.

Now, we have to resize the file system (and its inode table) on the partition (logical volume) so it can use the new space.

Step 10: Resize the File System on the Extended Logical Volume

#resize2fs /dev/dnvg/dnlv
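To confirm, run df -h again; the file system mounted on /dn should now report roughly 40GiB, and the DataNode stayed online the whole time:

#df -h /dn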

Python App for increasing the LVM dynamically

Python code:
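Here is a minimal sketch of such a script. It assumes the dnvg volume group and dnlv logical volume created in the steps above, an ext4 file system, and root privileges on the DataNode:

import subprocess
import sys

VG = "dnvg"        # volume group created in Step 4
LV = "dnlv"        # logical volume created in Step 5
LV_PATH = f"/dev/{VG}/{LV}"

def run(cmd):
    """Echo and run one command; abort the script if it fails."""
    print("+ " + " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.stdout:
        print(result.stdout, end="")
    if result.returncode != 0:
        sys.exit(result.stderr)

def extend(extra_gib):
    # Grow the logical volume by the requested number of GiB...
    run(["lvextend", "--size", f"+{extra_gib}G", LV_PATH])
    # ...then grow the ext4 file system online to fill the new space.
    run(["resize2fs", LV_PATH])

if __name__ == "__main__":
    gib = int(input("GiB to add to the DataNode volume: "))
    extend(gib)
    run(["df", "-h", "/dn"])   # confirm the mounted size grew

Running it and entering 10 reproduces Steps 9 and 10 in one shot: the volume mounted on /dn grows by 10GiB while the DataNode keeps serving data.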

This is how we can automate LVM with Hadoop by using a Python script.

Thank you!
