Question on md126 / md127 issues

on 26.05.2011 06:10:35 by Dylan Distasio

Hi all-

I recently created a two-disk RAID1 mdadm array /dev/md1 with 1.2
metadata on an Ubuntu system that has 3 other mdadm arrays running on
it. The power went out at my house last night, and I rebooted the
system when it came back up.

When it came back up, my new array was in two pieces /dev/md126 and
/dev/md127 (with incorrect members, showing 1 active drive, 1 spare in
each). I rebooted again, and had what appeared to be my working
array, but showing up under /dev/md127. I could stop and do a --scan
to assemble it correctly as /dev/md1, but when I rebooted again I got
the same results with 126 and 127. My mdadm.conf was correct.

I did some searching on my archives of this list, and found a solution
as follows:
-----------
How to fix the '125/126/127' mdadm issue.

The array has '125' stored as the 'preferred minor' in the metadata.
You can change this by assembling with --update=super-minor.
e.g.

mdadm -S /dev/md125
mdadm -A /dev/md1 --update=super-minor

It should get details of which devices to include from /etc/mdadm.conf.

However, it is possible that the mdadm.conf in your initrd also has the
name as /dev/md125.
So once you have performed the above, run mkinitrd again, reboot, and report
what happens.
----------------

I had to run the above commands, and then make sure I ran
update-initramfs -v -u for it to stick after reboot.

My issue is solved, but I would like to understand what the root cause
is, and why the above solution worked. Can someone elaborate on what
super-minor is? This is a home system and I had backups so I was
comfortable trying the above, but I don't typically like running
commands on faith I don't understand fully, especially in Linux.

Can anyone shed some light on this? I can provide further OS and
array details if necessary, but it sounds like this issue has occurred
for others in the past.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: Question on md126 / md127 issues

on 27.05.2011 23:36:02 by NeilBrown

On Thu, 26 May 2011 00:10:35 -0400 Dylan Distasio wrote:

> Hi all-
>
> I recently created a two-disk RAID1 mdadm array /dev/md1 with 1.2
> metadata on an Ubuntu system that has 3 other mdadm arrays running on
> it. The power went out at my house last night, and I rebooted the
> system when it came back up.
>
> When it came back up, my new array was in two pieces /dev/md126 and
> /dev/md127 (with incorrect members, showing 1 active drive, 1 spare in
> each). I rebooted again, and had what appeared to be my working
> array, but showing up under /dev/md127. I could stop and do a --scan
> to assemble it correctly as /dev/md1, but when I rebooted again I got
> the same results with 126 and 127. My mdadm.conf was correct.

Very weird. It sounds like mdadm in the initrd is running before all devices
have been discovered, but even that shouldn't create two arrays....

Maybe the second device gets discovered after the switch to a real root and
something gets lost..


>
> I did some searching on my archives of this list, and found a solution
> as follows:
> -----------
> How to fix the '125/126/127' mdadm issue.
>
> The array has '125' stored as the 'preferred minor' in the metadata.

1.2 metadata doesn't have a 'preferred minor' - only 0.90 has that.

1.2 has a 'name' which has a vaguely similar purpose but I don't think it
would cause this sort of issue. I would be very surprised if '125' ever got
stored there unless you explicitly asked for it.
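
To check which of the two fields a given member actually carries, the
output of "mdadm --examine" can be inspected. The following is a minimal
sketch, assuming the usual "Key : value" layout of that output; the
function name md_identity and the device /dev/sdb1 are hypothetical and
not taken from this thread.

```shell
#!/bin/sh
# Sketch: read `mdadm --examine` output on stdin and report the field that
# identifies the array: 'Name' for 1.x metadata, 'Preferred Minor' for 0.90.
# md_identity is a hypothetical helper name.
md_identity() {
    awk -F' : ' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the key
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim padding around the value
        }
        key == "Version"         { version = val }
        key == "Name"            { name = val }
        key == "Preferred Minor" { minor = val }
        END {
            # Drop the trailing "(local to host ...)" note, if present.
            sub(/[ \t]+\(.*$/, "", name)
            if (version ~ /^1\./)
                print "metadata " version ": name=" name
            else if (version ~ /^0\.9/)
                print "metadata " version ": preferred-minor=" minor
            else
                print "metadata unknown"
        }
    '
}

# Usage (needs root; /dev/sdb1 is a placeholder for a real member device):
#   mdadm --examine /dev/sdb1 | md_identity
```

If this reports a name rather than a preferred minor, then
--update=super-minor has nothing to change on that array.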

> You can change this by assembling with --update=super-minor.
> e.g.
>
> mdadm -S /dev/md125
> mdadm -A /dev/md1 --update=super-minor

This command will not affect a 1.2 array at all. It will just assemble it.

>
> It should get details of which devices to include from /etc/mdadm.conf.
>
> However, it is possible that the mdadm.conf in your initrd also has the
> name as /dev/md125.
> So once you have performed the above, run mkinitrd again, reboot, and report
> what happens.
> ----------------

Running mkinitrd when you have boot problems is always a good idea. Maybe
that was all it took to fix your problem ??


>
> I had to run the above commands, and then make sure I ran
> update-initramfs -v -u for it to stick after reboot.
>
> My issue is solved, but I would like to understand what the root cause
> is, and why the above solution worked. Can someone elaborate on what
> super-minor is? This is a home system and I had backups so I was
> comfortable trying the above, but I don't typically like running
> commands on faith I don't understand fully, especially in Linux.
>
> Can anyone shed some light on this? I can provide further OS and
> array details if necessary, but it sounds like this issue has occurred
> for others in the past.

Now that the problem is fixed it is very hard to figure out what was happening
before. My best guess is that something was wrong with mdadm.conf in the
initrd, but I don't know what.
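
One way to test that guess on a system that still shows the problem is to
compare the ARRAY lines of the two files. This is only a sketch: it
assumes the initramfs has already been unpacked into a scratch directory
(how to do that varies by distribution), and the helper names and the
/tmp/initrd-extract path are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the arrays declared in two mdadm.conf files.
# array_lines and same_arrays are hypothetical helper names.

# Print the ARRAY lines of the given file with whitespace normalized, sorted.
# Simplification: only the full keyword ARRAY is matched (any letter case).
array_lines() {
    awk 'toupper($1) == "ARRAY" { $1 = $1; print }' "$1" | sort
}

# Succeed when both files declare the same set of ARRAY lines.
same_arrays() {
    [ "$(array_lines "$1")" = "$(array_lines "$2")" ]
}

# Usage (paths are assumptions; adjust to where the files really live):
#   same_arrays /etc/mdadm/mdadm.conf /tmp/initrd-extract/etc/mdadm/mdadm.conf \
#       || echo "initramfs copy differs from /etc"
```

A mismatch would suggest the initramfs was built from an older
configuration and needs to be regenerated.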

NeilBrown

Re: Question on md126 / md127 issues

on 29.05.2011 00:15:54 by Dylan Distasio

Thanks, Neil.

I think running mkinitrd was probably the only thing required for a
fix after reading your response. I had an older array on the same box
that was completely removed, but maybe something was leftover in
initrd.

My detailed understanding of the initrd process is fairly limited. I
didn't realize there was a separate mdadm.conf that was used when
booting that is separate from the one in /etc.


> Running mkinitrd when you have boot problems is always a good idea.  Maybe
> that was all it took to fix your problem ??
>
>
>>
>> I had to run the above commands, and then make sure I ran
>> update-initramfs -v -u for it to stick after reboot.
>>
>> My issue is solved, but I would like to understand what the root cause
>> is, and why the above solution worked.  Can someone elaborate on what
>> super-minor is?  This is a home system and I had backups so I was
>> comfortable trying the above, but I don't typically like running
>> commands on faith I don't understand fully, especially in Linux.
>>
>> Can anyone shed some light on this?  I can provide further OS and
>> array details if necessary, but it sounds like this issue has occurred
>> for others in the past.
>
> Now that the problem is fixed it is very hard to figure out what was happening
> before.  My best guess is that something was wrong with mdadm.conf in the
> initrd, but I don't know what.
>
> NeilBrown
>
>

Re: Question on md126 / md127 issues

on 29.05.2011 00:47:59 by Phil Turmel

Hi Dylan,

On 05/28/2011 06:15 PM, Dylan Distasio wrote:
> Thanks, Neil.
>
> I think running mkinitrd was probably the only thing required for a
> fix after reading your response. I had an older array on the same box
> that was completely removed, but maybe something was leftover in
> initrd.
>
> My detailed understanding of the initrd process is fairly limited. I
> didn't realize there was a separate mdadm.conf that was used when
> booting that is separate from the one in /etc.

Many people miss this. Modern Linux distributions, with few exceptions, use a three-stage boot process: 1) kernel, 2) initramfs, then 3) real root FS. If there is no mdadm.conf in an initramfs at all, but the initramfs has raid support, mdadm will assemble everything it finds. It will assign the first array to md127 and count backwards from there.
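
The numbering described above can be sketched in a few lines of shell. This only illustrates the "start at md127 and count backwards" rule as stated here; it is not mdadm's actual code, and the helper names auto_md_name and md_names are hypothetical.

```shell
#!/bin/sh
# Sketch of the auto-numbering rule described above (not mdadm's own code).

# Name given to the Nth auto-assembled array, N starting at 1:
# the first gets md127, the second md126, and so on.
auto_md_name() {
    echo "md$((127 - ($1 - 1)))"
}

# List the md device names found in /proc/mdstat-style text on stdin.
md_names() {
    sed -n 's/^\(md[0-9][0-9]*\) :.*/\1/p'
}

# Usage:
#   auto_md_name 2          # second auto-assembled array
#   md_names < /proc/mdstat # arrays currently known to the kernel
```

Arrays that show up under names such as md127 or md126, when mdadm.conf in /etc names them differently, are a hint that the initramfs assembled them without that configuration.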

You might like this description of the process from the kernel docs:

http://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt

The money quote:

"An initramfs archive is a complete self-contained root filesystem for Linux."

If you change anything on your system that might impact the boot process, you're probably going to need to run "update-initramfs", or your distribution's equivalent.
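
After regenerating, it can be worth confirming that the new image really contains an mdadm.conf. Below is a minimal sketch, assuming a Debian/Ubuntu system where lsinitramfs (from initramfs-tools) lists the image contents; the helper name has_mdadm_conf and the two candidate paths inside the image are assumptions.

```shell
#!/bin/sh
# Sketch: succeed if an initramfs file listing (one path per line on stdin)
# contains an mdadm.conf. has_mdadm_conf is a hypothetical helper name, and
# the accepted locations (etc/mdadm/mdadm.conf, etc/mdadm.conf) are assumptions.
has_mdadm_conf() {
    grep -Eq '(^|/)etc/(mdadm/)?mdadm\.conf$'
}

# Usage on Debian/Ubuntu, after running update-initramfs -u:
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | has_mdadm_conf \
#       && echo "mdadm.conf is in the initramfs"
```

This only shows that the file is present; whether its contents match the copy in /etc still has to be checked separately.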

HTH,

Phil