Friends, continuing with advanced know-how and troubleshooting on GlusterFS: in this article we have a 3-node cluster running GlusterFS 3.4. Below are the steps used for GlusterFS troubleshooting.

Step 1: Check the Gluster volume status and information.

# gluster volume status

# gluster volume info

Step 2: Verify the replication details on the bricks.
The command below shows statistics of what data has been replicated and how much remains, by reporting the total and free disk space on each brick.

Note: there is, however, a discrepancy of a few MB in the size shown as free in the stats. This is because applications may still hold open handles to files, which causes df and du to report different values.

# gluster volume status all detail
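The df/du variance mentioned above can be seen directly on a brick path. A minimal sketch (on the nodes in this article the brick path would be something like /gluster0; substitute your own):

```shell
#!/bin/sh
# Compare filesystem-level usage (df) with file-walk usage (du) on a brick.
# Files that were deleted while a process still holds them open are counted
# by df but not by du, which explains the few-MB variance in the stats.
BRICK=${BRICK:-.}    # set to your brick directory, e.g. /gluster0

df -Ph "$BRICK"      # free space as the filesystem sees it
du -sh "$BRICK"      # space as a walk over the files sees it

# Open-but-deleted files (the usual cause of the gap) can be listed with:
# lsof +L1
```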

Step 3: Now we set certain volume options to improve the performance and healing characteristics of GlusterFS.

# gluster volume set gluster cluster.min-free-disk 5%
# gluster volume set gluster cluster.rebalance-stats on
# gluster volume set gluster cluster.readdir-optimize on
# gluster volume set gluster cluster.background-self-heal-count 20
# gluster volume set gluster cluster.metadata-self-heal on
# gluster volume set gluster cluster.data-self-heal on
# gluster volume set gluster cluster.entry-self-heal on
# gluster volume set gluster cluster.self-heal-daemon on
# gluster volume set gluster cluster.heal-timeout 500
# gluster volume set gluster cluster.self-heal-window-size 2
# gluster volume set gluster cluster.data-self-heal-algorithm diff
# gluster volume set gluster cluster.eager-lock on
# gluster volume set gluster cluster.quorum-type auto
# gluster volume set gluster cluster.self-heal-readdir-size 2KB
# gluster volume set gluster network.ping-timeout 5
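The long list of `volume set` calls above can be applied in one pass with a small helper. This is a sketch, assuming the volume name `gluster` used elsewhere in this article; the function only echoes each command as a dry run, so drop the leading `echo` to actually apply the options:

```shell
#!/bin/sh
# gluster_set: print (dry-run) a "gluster volume set" command for each
# "option value" pair given after the volume name. Remove the leading
# "echo" to apply the options for real.
gluster_set() {
  vol=$1; shift
  for pair in "$@"; do
    # $pair is intentionally unquoted so it splits into option + value
    echo gluster volume set "$vol" $pair
  done
}

# Dry run with a few of the options from Step 3:
gluster_set gluster \
  "cluster.min-free-disk 5%" \
  "cluster.data-self-heal on" \
  "network.ping-timeout 5"
```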

Then run:

# service glusterd restart

After setting these volume options, we can check the volume information as shown below:

# gluster volume info

# gluster volume status

Please note that the Self-heal Daemon should be running on every node in the cluster, since it is responsible for healing data when a node has been down for some time.
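Whether the daemon is up on each node can be read straight out of `gluster volume status`, which prints one `Self-heal Daemon` line per node with an Online column of Y or N. A small sketch for counting offline daemons; the parsing assumes the 3.x column layout, where Online is the second-to-last field:

```shell
#!/bin/sh
# count_offline_shd: read "gluster volume status" output on stdin and print
# how many Self-heal Daemon entries report Online = N. In the 3.x CLI the
# Online column is the second-to-last field of each status line.
count_offline_shd() {
  grep -i 'Self-heal Daemon' | awk '{ n += ($(NF-1) == "N") } END { print n+0 }'
}

# Typical use on a node:  gluster volume status | count_offline_shd
# Demonstration with sample 3.x-style output, gluster2's daemon being down:
count_offline_shd <<'EOF'
Self-heal Daemon on localhost               N/A     Y       2453
Self-heal Daemon on gluster1                N/A     Y       2103
Self-heal Daemon on gluster2                N/A     N       N/A
EOF
# → 1
```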

Step 4: Now to remove the machine gluster0 from the cluster.

Unmount the volume on the gluster0 machine, then remove its brick:

# umount /mnt
# gluster volume remove-brick gluster replica 2 gluster0:/gluster0 commit

Verify with the volume info command:

# gluster volume info

On gluster1 run the following command:

# gluster peer detach gluster0

The gluster0 server's brick is now removed from the cluster.
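After the detach, the pool size can be confirmed from `gluster peer status`, whose first line reports `Number of Peers:`. A small sketch, assuming that 3.x output format:

```shell
#!/bin/sh
# peer_count: read "gluster peer status" output on stdin and print the
# value of its "Number of Peers:" line.
peer_count() {
  awk -F': ' '/^Number of Peers/ { print $2 }'
}

# On gluster1, after detaching gluster0, only one peer should remain:
#   gluster peer status | peer_count
# Demonstration with sample output (hostname/UUID values are illustrative):
peer_count <<'EOF'
Number of Peers: 1

Hostname: gluster2
Uuid: 3c2b18a0-0000-0000-0000-000000000000
State: Peer in Cluster (Connected)
EOF
# → 1
```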