First of all, I’m a total noob at Hadoop and Linux. I have a cluster of five nodes, and when it starts, each node shows a capacity of only 46.6 GB, even though each machine has around 500 GB of disk space that I don’t know how to allocate to the nodes.
(1) Do I have to change the DataNode and NameNode storage size (I checked these and they show the same remaining space as the DataNode information tab)? If so, how should I do that?
(2) Also, this 500 GB disk only shows up when I run lsblk, not when I run df -H. Does that mean it is not mounted? Here is the output of both commands. Can someone explain what this means?
[hadoop@hdp1 hadoop]$ sudo lsblk
NAME                          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0                            11:0    1 1024M  0 rom
sda                             8:0    0   50G  0 disk
├─sda1                          8:1    0  500M  0 part /boot
└─sda2                          8:2    0 49.5G  0 part
  ├─VolGroup-lv_root (dm-0)   253:0    0 47.6G  0 lvm  /
  └─VolGroup-lv_swap (dm-1)   253:1    0    2G  0 lvm  [SWAP]
sdb                             8:16   0  512G  0 disk

[hadoop@hdp1 hadoop]$ sudo df -H
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   51G  6.7G   41G  15% /
tmpfs                          17G   14M   17G   1% /dev/shm
/dev/sda1                     500M  163M  311M  35% /boot
Please help. Thanks in advance. First, can someone help me understand why it is showing these different disks, what they mean, and where the space actually resides? I can’t seem to figure it out.
Answer
You are right. Your second disk (sdb) is not mounted anywhere. If you are going to dedicate the whole disk to Hadoop data, here is how you should format and mount it:
Format your disk:
mkfs.ext4 -m1 -O dir_index,extent,sparse_super /dev/sdb
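Note that mkfs.ext4 erases anything already on /dev/sdb. If you want to double-check that the filesystem was created (and note its UUID, which comes in handy for /etc/fstab), something like this should work:

# print the filesystem type and UUID of the new filesystem
sudo blkid /dev/sdb
# show the filesystem column for the disk
sudo lsblk -f /dev/sdb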
To mount it, edit the file /etc/fstab and add this line:
/dev/sdb /hadoop/disk0 ext4 noatime 1 2
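As a side note, if you prefer not to depend on the /dev/sdb name (device names can occasionally change between reboots), an equivalent fstab entry by UUID would look roughly like this, with the placeholder UUID replaced by the one blkid reports for your disk:

# same mount, but identified by UUID instead of device name (placeholder UUID)
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /hadoop/disk0 ext4 noatime 1 2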
After that, create the directory /hadoop/disk0 (it doesn’t have to be named that; you can use any directory you like).
mkdir -p /hadoop/disk0
Now you are ready to mount the disk:
mount -a
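After mounting, it’s worth confirming that the full 512 GB now shows up and that the user running the Hadoop daemons can write to the directory (I’m assuming the "hadoop" user from your shell prompt here):

# should now report roughly 512G on the new mount point
df -H /hadoop/disk0
# give the Hadoop user ownership of the new storage directory
sudo chown -R hadoop:hadoop /hadoop/disk0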
Finally, you should let Hadoop know that you want to use this disk as Hadoop storage. Your /etc/hadoop/conf/hdfs-site.xml should contain these config parameters:
<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/disk0/nn</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/hadoop/disk0/dn</value>
</property>
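Once hdfs-site.xml is updated on every node, HDFS has to be restarted before the new directories are picked up. One caveat: if dfs.name.dir now points to a new, empty location, the NameNode will not find its existing metadata there, so you either have to copy the old NameNode directory over or reformat HDFS (which wipes any existing HDFS data). Roughly, assuming a plain Apache Hadoop install with the standard scripts on the PATH:

# stop and restart HDFS so the new dfs.name.dir / dfs.data.dir take effect
stop-dfs.sh
start-dfs.sh

# check the capacity each DataNode now reports
hadoop dfsadmin -report    # on newer versions: hdfs dfsadmin -report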