
Recovering Linux RAID5 with mdadm

If a Linux box has hardware trouble and you temporarily lose a disk or two from a RAID5, you can get into a state where mdadm --assemble does not work. This can happen after a controller failure, or if you have faulty cabling.
You're seeing stuff like:

mdadm: failed to run array /dev/md7: Input/output error
md: pers->run() failed

Don't panic yet!
First step, ensure you have good backups and use dd or another tool to clone the hard disks.
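For example, something along these lines will image a member disk to a file on another drive (the device and output path here are only placeholders - adjust them for your own disks, and make sure the destination is not part of the array):

$ dd if=/dev/hda of=/mnt/backup/hda.img bs=64k conv=noerror,sync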

What you need to do is recreate the RAID. In most cases this will get your data back, but it needs to be done carefully to ensure you don't destroy the data in the process.

This doesn't really matter for RAID1, as your data is always consistent on both disks - you can put one disk in and resync everything from that to any other disk.

For RAID5, the data is held across all disks. The thing to realise here is that the order of the disks in the array really matters. The trick to recreating a RAID5 and having it work is to get the order right.

Problems:
1. What if the order is not obvious?
2. Resyncing.

If you add the disks in the wrong order and start the array in a working (non-degraded) state, it will perform an initial sync of the array. This will destroy your data as RAID5 starts to write parity data across it.
There may be a trick with mdadm to determine the correct order, but I do not know it (yet).
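One possible candidate for such a trick (not something I have relied on here) is mdadm --examine, which prints what each member's superblock believes about the array, including the slot that device occupied - if the old superblocks are still intact, comparing the output for each disk may reveal the original order:

$ mdadm --examine /dev/hda1
$ mdadm --examine /dev/hdd1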

You must create the array in a degraded state, with one disk marked as missing. This will still allow you to mount the array, but will not cause a resync attempt.

So, here's the scenario. There are three disks in an array, hda, hdd, hdg. One failed completely (hdg) a while ago and you had to wait for new disks to be delivered. While waiting, there was an IDE failure and another disk was lost temporarily. Oh dear, we've got a broken array.

You bring the disk back online but the array won't auto-reassemble and mdadm --assemble isn't working. So we move on to recreating.

What you will do is attempt to create the array using two disks. The other will be marked as missing (even though we now have the replacement sitting on the workbench ready).
But we don't know which order the disks go in the array - maybe we'll get lucky and they're alphabetical, maybe we won't.

This is what we'll do:

$ mdadm --create /dev/md7 --level=5 --raid-devices=3 -f /dev/hda1 /dev/hdd1 missing
$ cat /proc/mdstat
md7 : active raid5 hda1[2] hdd1[1]
240121472 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]

So far so good: the RAID5 is running and no resync attempt has touched the data. So try to mount it read-only and see what happens.
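Something like this, with the mount point and options adjusted to suit your filesystem:

$ mkdir -p /mnt/recovery
$ mount -o ro /dev/md7 /mnt/recovery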

If it worked, great. Back up your data and carry on with your life. If not, stop the array and try another order. Treat 'missing' like any other disk and move it around as well. Perhaps get out a bit of paper and work out all the possible combinations to try (with two real disks plus 'missing' across three slots, there are only six orderings).

$ mdadm -S /dev/md7
$ mdadm --create /dev/md7 --level=5 --raid-devices=3 -f /dev/hda1 missing /dev/hdd1

Keep repeating this until the file system on the array mounts successfully.

When it has finally worked, and you've backed up, you can add your new disk back in. The array will resync your data across all three disks (or whatever number you have) and everything will be back to normal.
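Assuming the replacement has been partitioned the same way as the others and takes hdg's old place (adjust the device name to whatever yours actually is), add it and then watch the rebuild in /proc/mdstat:

$ mdadm --add /dev/md7 /dev/hdg1
$ cat /proc/mdstat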

Linux soft RAID hanging on boot at Mounting Root

I have a Linux (Gentoo) server which has been somewhat unreliable, and suffers from frequent lockups[1]. Today, it started to hang at boot on "Mounting Root Filesystem".

I booted a recovery CD and took a look at the RAID filesystems, all using Linux's MD software RAID1. They all assembled fine, and mounted the ext3 and reiser3 filesystems without trouble. So I started to look in more detail:

On doing a query of one of the components of the root RAID, I found:

# mdadm -Q /dev/hda2
/dev/hda2: is not an md array
/dev/hda2: device 0 in 2 device mismatch raid1 /dev/md3. Use mdadm --examine for more detail.

"Mismatch!" All the others show "active" or "inactive". I look closer and note "md3" - my root is md1, /boot is md3!
What is happening is that each RAID member notes in its superblock which md device node it is assigned to. When booting, Linux looks for /dev/md3 to mount as root. Knowing this to be an MD RAID, it examines devices and starts those that match.
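You can see what a member's superblock records with mdadm --examine - with the old 0.90 metadata this includes a 'Preferred Minor' field, which (as far as I can tell) is what the boot-time autodetection uses to decide which md device number to start the array as:

# mdadm --examine /dev/hda2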

In this case, I probably made a mistake during a previous recovery and assembled / as md3, which its superblock has remembered. So on bootup, I have two filesystems claiming to be for the root device, which is set as /dev/md3 in the LILO boot loader.

To fix this, you need to update the superblock. This is done when assembling the device, so do it from a fresh boot off your recovery disk.

This is what I did:

# mdadm --assemble /dev/md3 --update=super-minor /dev/hda2 /dev/hdd2

Once done, a query shows:
# mdadm -Q /dev/hda2
/dev/hda2: is not an md array
/dev/hda2: device 0 in 2 device active raid1 /dev/md1. Use mdadm --examine for more detail.

Rebooting, the root is mounted instantly and everything works. Huzzah!

[1] Once every couple of days, and almost certainly temperature-related, as the environment has been getting very hot and humid at the same time. It has a hardware-based watchdog which brings it back up - I do like real server hardware. I pulled the heatsinks off the CPUs and noticed a lot of thermal transfer compound (which would be my fault) - I've wiped them down, left just a very thin film, and will see how well it works now.

========

Update:
I noticed that the machine is running the disks in mdma2 mode, rather than udma5.
So I played with the kernel options (2.6.22-r9) to try to fix that, and on rebooting got the same problem again. Going back to kernel 2.6.21-r5 solved both the mounting-root and UDMA issues. So I suspect the real reason behind all this is a broken kernel revision, at least with the Broadcom CSB5 controller (Intel SDS2 board).
