Replacing A Failed Hard Drive In A Software RAID1 Array
Before we start: in this example, the faulty hard disk is /dev/sdb.
--> Preliminary steps before starting <--
Determine the extent of the problem by running a long self-test:
[root@hal ~]#> smartctl -t long /dev/sdb
Note the serial number of the broken HDD and view the test results:
[root@hal ~]#> smartctl -a /dev/sdb
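If you only need the drive identity (model, serial number, firmware version) without the full test results, the info section alone can be printed; this is standard smartctl usage:
[root@hal ~]#> smartctl -i /dev/sdb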
Check whether GRUB is installed on the disk:
[root@hal ~]#> file -s /dev/sdb
Check the health and status of the RAID arrays:
[root@hal ~]#> cat /proc/mdstat
df reports file system disk usage. It also works with RAID devices and can tell us how much free space we have left on our partitions.
[root@hal ~]#> df -lh | grep md
Display the status of currently used swap areas:
[root@hal ~]#> swapon -s
———————————————————————–
How Do I Tell If A Hard Disk Has Failed?
If a disk has failed, you will probably find a lot of error messages in the log files, e.g. /var/log/messages or /var/log/syslog.
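For example, a quick search of the logs for the suspect drive could look like this (the log file path and the disk name are just examples; adjust them to your system):
[root@hal ~]#> grep -i 'sdb' /var/log/messages
[root@hal ~]#> dmesg | grep -i 'sdb'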
———————————————————————–
--> Removing the failed HDD <--
STEP 1: Check the RAID status before starting
[root@hal ~]#> cat /proc/mdstat
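For reference, a degraded RAID1 typically shows the failed member marked with (F) and the missing mirror as [U_]. The output below is only an illustration (the block counts are arbitrary), not captured from this system:
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      128448 blocks [2/1] [U_]
md1 : active raid1 sdb2[1] sda2[0]
      488255488 blocks [2/2] [UU]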
STEP 2: Mark the disk's partition as failed (if it is already marked as failed, move on to the next step)
[root@hal ~]#> mdadm --manage /dev/md0 --fail /dev/sdb1
STEP 3: Then we remove /dev/sdb1 from /dev/md0:
[root@hal ~]#> mdadm --manage /dev/md0 --remove /dev/sdb1
**Repeat the two steps above until all of the failed disk's partitions have been removed from their arrays (an example follows below)**
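For example, assuming the second RAID1 member on the failed disk is /dev/sdb2 in /dev/md1 (the same layout used in STEP 9 below):
[root@hal ~]#> mdadm --manage /dev/md1 --fail /dev/sdb2
[root@hal ~]#> mdadm --manage /dev/md1 --remove /dev/sdb2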
STEP 4: Turn off swap before removing the broken HDD
[root@hal ~]#> swapoff -a
STEP 5: Then power down the system and replace the old /dev/sdb hard drive with a new one (it must be at least the same size as the old one – if it is even a few MB smaller, rebuilding the arrays will fail).
[root@hal ~]#> shutdown -h now
--> Adding the new HDD <--
STEP 6: Create the exact same partitioning on the new /dev/sdb as on /dev/sda. We can do this with one simple command:
[root@hal ~]#> sfdisk -d /dev/sda | sfdisk /dev/sdb
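Note: the sfdisk dump/restore above assumes classic MBR partition tables. If the disks happen to use GPT and your sfdisk version cannot handle it, sgdisk (from the gdisk package) is one possible way to copy the table; this is only a sketch, adjust the backup file name to taste:
[root@hal ~]#> sgdisk --backup=sda.gpt /dev/sda
[root@hal ~]#> sgdisk --load-backup=sda.gpt /dev/sdb
[root@hal ~]#> sgdisk -G /dev/sdb
The final command randomizes the disk and partition GUIDs on /dev/sdb so they do not clash with those on /dev/sda.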
STEP 7: Check that both hard drives now have the same partitioning:
[root@hal ~]#> fdisk -l
STEP 8: Re-enable swap
[root@hal ~]#> swapon -a
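Note: if swap lives on a plain (non-RAID) partition of the replacement disk, that partition has to be re-initialised before swapon will accept it; /dev/sdb3 below is only a placeholder for your actual swap partition:
[root@hal ~]#> mkswap /dev/sdb3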
STEP 9: Next we add /dev/sdb1 to /dev/md0 and /dev/sdb2 to /dev/md1:
[root@hal ~]#> mdadm --manage /dev/md0 --add /dev/sdb1
**Repeat the step above until all partitions have been added back to their arrays (an example follows below)**
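For example, continuing with the assumed layout from STEP 9 (/dev/sdb2 belonging to /dev/md1):
[root@hal ~]#> mdadm --manage /dev/md1 --add /dev/sdb2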
STEP 10: Run this command to see the synchronization progress:
[root@hal ~]#> cat /proc/mdstat
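To follow the rebuild continuously instead of re-running the command by hand, watch can be used (the 5-second interval is just an example); mdadm --detail also reports the rebuild status of a given array:
[root@hal ~]#> watch -n 5 cat /proc/mdstat
[root@hal ~]#> mdadm --detail /dev/md0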
——————————————————————-
STEP 11: Check the boot loader
The new disk has its partition layout in place, but GRUB is not yet installed in its boot sector (note that "code offset" is missing from the output for /dev/sdb):
[root@hal ~]# file -s /dev/sda
/dev/sda: x86 boot sector; partition 1: ID=0xfd, active, starthead 1,
startsector 63, 256977 sectors; partition 2: ID=0xfd, starthead 0,
startsector 257040, 976511025 sectors, code offset 0x48
[root@hal ~]# file -s /dev/sdb
/dev/sdb: x86 boot sector; partition 1: ID=0xfd, active, starthead 1,
startsector 63, 256977 sectors; partition 2: ID=0xfd, starthead 0,
startsector 257040, 1953263025 sectors
Install GRUB on the new disk, /dev/sdb (GRUB uses different names for your disks, /dev/sda => hd0, /dev/sdb => hd1 etc.):
[root@hal ~]# grub
<...snip...>
grub> root (hd1,0)
root (hd1,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd1)
setup (hd1)
Checking if "/boot/grub/stage1" exists... no
Checking if "/grub/stage1" exists... yes
Checking if "/grub/stage2" exists... yes
Checking if "/grub/e2fs_stage1_5" exists... yes
Running "embed /grub/e2fs_stage1_5 (hd1)"... 15 sectors are
embedded.
succeeded
Running "install /grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/grub/stage2
/grub/grub.conf"... succeeded
Done.
grub> quit
quit
***Another method of installing GRUB***
[root@hal ~]# grub-install /dev/sdb
Verify the result:
[root@hal ~]# file -s /dev/sdb
/dev/sdb: x86 boot sector; partition 1: ID=0xfd, active, starthead 1,
startsector 63, 256977 sectors; partition 2: ID=0xfd, starthead 0,
startsector 257040, 1953263025 sectors, code offset 0x48